Overview
GLM-4.5 is the flagship large language model released by Z.ai (formerly Zhipu AI) on 28 July 2025. It is a Mixture-of-Experts model with 355 billion total parameters and 32 billion active per forward pass, shipped under a permissive MIT license with open weights — base, hybrid (thinking/non-thinking) and FP8 variants are all available for commercial use and fine-tuning. A smaller GLM-4.5-Air sibling (106B total / 12B active) targets cheaper, lighter deployment.
GLM-4.5 was designed as an agent-native model: a single set of weights unifies reasoning, coding and tool use rather than splitting them across separate models. Z.ai reports a tool-calling success rate of 90.6% and positions the model for software engineering, web browsing and front-end development. A hybrid reasoning mode lets callers switch between a deliberate thinking mode for hard tasks and a fast non-thinking mode for immediate answers.
On Z.ai's own aggregate of 12 representative benchmarks, GLM-4.5 scored an average of 63.2, which the company reported as third overall and the top score among open-source models at launch. It supports a 128K-token context window and up to 96K output tokens, and is served through the Z.ai API as well as third-party hosts such as OpenRouter.
| Released | 2025-07-28 |
|---|---|
| License | MIT |
| Weights | Open weights |
| Parameters | 355B total / 32B active (MoE) |
| Context | 128K |
| Max output | 96K |
| Architecture | Mixture-of-Experts (MoE) with 355B total and 32B active parameters, using a deeper (more-layers) rather than wider design and Multi-Token Prediction (MTP) layers. Pre-trained on roughly 23 trillion tokens. It is a hybrid-reasoning model offering a "thinking" mode for complex reasoning and tool use and a "non-thinking" mode for fast direct replies. |
| Modalities | Text |
| Status | Available |
Benchmarks
- MMLU-Pro84.6%
- AIME 202491%
- SWE-bench Verified64.2%
- TAU-bench70.1%
- BrowseComp26.4%
- Tool-calling success rate90.6%
Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.
Pricing
| Input | $0.60 per 1M tokens |
|---|---|
| Output | $2.20 per 1M tokens |
Standard Z.ai API list price for GLM-4.5 (the GLM-4.5-Air variant is cheaper at $0.20 input / $1.10 output). Z.ai ran a lower promotional launch rate; third-party hosts may differ.
Strengths
- Open weights under a permissive MIT license — free for commercial use, fine-tuning and on-prem deployment
- Agent-native design with a high reported tool-calling success rate (90.6%)
- Strong agentic and coding scores (SWE-bench Verified 64.2, TAU-bench 70.1) for an open model
- Hybrid thinking / non-thinking modes in one model, so you trade off depth vs. latency per request
- Aggressive pricing relative to closed frontier models, plus a lighter GLM-4.5-Air variant
Best for
- Autonomous and tool-using agents (function calling, web browsing, multi-step workflows)
- Software engineering and front-end development assistance
- Math and competition-style reasoning tasks
- Self-hosted or fine-tuned deployments where open weights and an MIT license are required
- Cost-sensitive production workloads that still need frontier-class reasoning
How to access
| Provider | Model ID |
|---|---|
| Z.ai ↗ | glm-4.5 |
| OpenRouter ↗ | z-ai/glm-4.5 |
| Hugging Face ↗ | zai-org/GLM-4.5 |
GLM (flagship) — every version
The full lineage of the GLM (flagship) line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.
FAQ
Is GLM-4.5 open source?
Yes. GLM-4.5 is released under the permissive MIT license with open weights — including base, hybrid thinking/non-thinking and FP8 variants — so it can be used commercially, fine-tuned and self-hosted.
How big is GLM-4.5?
GLM-4.5 is a Mixture-of-Experts model with 355 billion total parameters and 32 billion active per forward pass. A smaller GLM-4.5-Air variant has 106 billion total and 12 billion active parameters.
What is GLM-4.5 best at?
It is an agent-native model built to unify reasoning, coding and tool use. Z.ai reports a 90.6% tool-calling success rate, plus strong agentic and coding scores such as 64.2 on SWE-bench Verified and 70.1 on TAU-bench.
How much does the GLM-4.5 API cost?
The standard Z.ai list price for GLM-4.5 is about $0.60 per million input tokens and $2.20 per million output tokens. The lighter GLM-4.5-Air variant is cheaper at roughly $0.20 input / $1.10 output.