GLM-4.5

Z.ai's open-weight, agent-native flagship: a 355B/32B-active MoE that unifies reasoning, coding and tool use under an MIT license.

Overview

GLM-4.5 is the flagship large language model released by Z.ai (formerly Zhipu AI) on 28 July 2025. It is a Mixture-of-Experts model with 355 billion total parameters and 32 billion active per forward pass, shipped under a permissive MIT license with open weights — base, hybrid (thinking/non-thinking) and FP8 variants are all available for commercial use and fine-tuning. A smaller GLM-4.5-Air sibling (106B total / 12B active) targets cheaper, lighter deployment.

GLM-4.5 was designed as an agent-native model: a single set of weights unifies reasoning, coding and tool use rather than splitting them across separate models. Z.ai reports a tool-calling success rate of 90.6% and positions the model for software engineering, web browsing and front-end development. A hybrid reasoning mode lets callers switch between a deliberate thinking mode for hard tasks and a fast non-thinking mode for immediate answers.

On Z.ai's own aggregate of 12 representative benchmarks, GLM-4.5 scored an average of 63.2, which the company reported as third overall and the top score among open-source models at launch. It supports a 128K-token context window and up to 96K output tokens, and is served through the Z.ai API as well as third-party hosts such as OpenRouter.

Released	2025-07-28
License	MIT
Weights	Open weights
Parameters	355B total / 32B active (MoE)
Context	128K
Max output	96K
Architecture	Mixture-of-Experts (MoE) with 355B total and 32B active parameters, using a deeper (more-layers) rather than wider design and Multi-Token Prediction (MTP) layers. Pre-trained on roughly 23 trillion tokens. It is a hybrid-reasoning model offering a "thinking" mode for complex reasoning and tool use and a "non-thinking" mode for fast direct replies.
Modalities	Text
Status	Available

Benchmarks

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input	$0.60 per 1M tokens
Output	$2.20 per 1M tokens

Standard Z.ai API list price for GLM-4.5 (the GLM-4.5-Air variant is cheaper at $0.20 input / $1.10 output). Z.ai ran a lower promotional launch rate; third-party hosts may differ.

Pricing source ↗

Strengths

Open weights under a permissive MIT license — free for commercial use, fine-tuning and on-prem deployment
Agent-native design with a high reported tool-calling success rate (90.6%)
Strong agentic and coding scores (SWE-bench Verified 64.2, TAU-bench 70.1) for an open model
Hybrid thinking / non-thinking modes in one model, so you trade off depth vs. latency per request
Aggressive pricing relative to closed frontier models, plus a lighter GLM-4.5-Air variant

Best for

Autonomous and tool-using agents (function calling, web browsing, multi-step workflows)
Software engineering and front-end development assistance
Math and competition-style reasoning tasks
Self-hosted or fine-tuned deployments where open weights and an MIT license are required
Cost-sensitive production workloads that still need frontier-class reasoning

How to access

Provider	Model ID
Z.ai ↗	`glm-4.5`
OpenRouter ↗	`z-ai/glm-4.5`
Hugging Face ↗	`zai-org/GLM-4.5`

GLM (flagship) — every version

The full lineage of the GLM (flagship) line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

Version	Released	Context	License
GLM-5.2current	2026-06-13	1M	MIT
GLM-5.1	2026-04-07	—	Open weights
GLM-5	2026-02-11	—	Apache-2.0
GLM-4.7	2025-12-22	—	Open weights
GLM-4.6	2025-09-30	—	MIT
GLM-4.5	2025-07-28	—	MIT

FAQ

Is GLM-4.5 open source?

Yes. GLM-4.5 is released under the permissive MIT license with open weights — including base, hybrid thinking/non-thinking and FP8 variants — so it can be used commercially, fine-tuned and self-hosted.

How big is GLM-4.5?

GLM-4.5 is a Mixture-of-Experts model with 355 billion total parameters and 32 billion active per forward pass. A smaller GLM-4.5-Air variant has 106 billion total and 12 billion active parameters.

What is GLM-4.5 best at?

It is an agent-native model built to unify reasoning, coding and tool use. Z.ai reports a 90.6% tool-calling success rate, plus strong agentic and coding scores such as 64.2 on SWE-bench Verified and 70.1 on TAU-bench.

How much does the GLM-4.5 API cost?

The standard Z.ai list price for GLM-4.5 is about $0.60 per million input tokens and $2.20 per million output tokens. The lighter GLM-4.5-Air variant is cheaper at roughly $0.20 input / $1.10 output.

// Overview

// Benchmarks

// Pricing

// Strengths

// Best for

// How to access

// GLM (flagship) — every version

// FAQ