AI/TLDR

GLM-4.5

Z.ai's open-weight, agent-native flagship: a 355B/32B-active MoE that unifies reasoning, coding and tool use under an MIT license.

Overview

GLM-4.5 is the flagship large language model released by Z.ai (formerly Zhipu AI) on 28 July 2025. It is a Mixture-of-Experts model with 355 billion total parameters and 32 billion active per forward pass, shipped under a permissive MIT license with open weights — base, hybrid (thinking/non-thinking) and FP8 variants are all available for commercial use and fine-tuning. A smaller GLM-4.5-Air sibling (106B total / 12B active) targets cheaper, lighter deployment.

GLM-4.5 was designed as an agent-native model: a single set of weights unifies reasoning, coding and tool use rather than splitting them across separate models. Z.ai reports a tool-calling success rate of 90.6% and positions the model for software engineering, web browsing and front-end development. A hybrid reasoning mode lets callers switch between a deliberate thinking mode for hard tasks and a fast non-thinking mode for immediate answers.

On Z.ai's own aggregate of 12 representative benchmarks, GLM-4.5 scored an average of 63.2, which the company reported as third overall and the top score among open-source models at launch. It supports a 128K-token context window and up to 96K output tokens, and is served through the Z.ai API as well as third-party hosts such as OpenRouter.

Released2025-07-28
LicenseMIT
WeightsOpen weights
Parameters355B total / 32B active (MoE)
Context128K
Max output96K
ArchitectureMixture-of-Experts (MoE) with 355B total and 32B active parameters, using a deeper (more-layers) rather than wider design and Multi-Token Prediction (MTP) layers. Pre-trained on roughly 23 trillion tokens. It is a hybrid-reasoning model offering a "thinking" mode for complex reasoning and tool use and a "non-thinking" mode for fast direct replies.
ModalitiesText
StatusAvailable

Benchmarks

  1. MMLU-Pro84.6%
  2. AIME 202491%
  3. SWE-bench Verified64.2%
  4. TAU-bench70.1%
  5. BrowseComp26.4%
  6. Tool-calling success rate90.6%

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input$0.60 per 1M tokens
Output$2.20 per 1M tokens

Standard Z.ai API list price for GLM-4.5 (the GLM-4.5-Air variant is cheaper at $0.20 input / $1.10 output). Z.ai ran a lower promotional launch rate; third-party hosts may differ.

Pricing source ↗

Strengths

  • Open weights under a permissive MIT license — free for commercial use, fine-tuning and on-prem deployment
  • Agent-native design with a high reported tool-calling success rate (90.6%)
  • Strong agentic and coding scores (SWE-bench Verified 64.2, TAU-bench 70.1) for an open model
  • Hybrid thinking / non-thinking modes in one model, so you trade off depth vs. latency per request
  • Aggressive pricing relative to closed frontier models, plus a lighter GLM-4.5-Air variant

Best for

  • Autonomous and tool-using agents (function calling, web browsing, multi-step workflows)
  • Software engineering and front-end development assistance
  • Math and competition-style reasoning tasks
  • Self-hosted or fine-tuned deployments where open weights and an MIT license are required
  • Cost-sensitive production workloads that still need frontier-class reasoning

How to access

ProviderModel ID
Z.ai ↗glm-4.5
OpenRouter ↗z-ai/glm-4.5
Hugging Face ↗zai-org/GLM-4.5

GLM (flagship) — every version

The full lineage of the GLM (flagship) line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

VersionReleasedContextLicense
GLM-5.2current2026-06-131MMIT
GLM-5.12026-04-07Open weights
GLM-52026-02-11Apache-2.0
GLM-4.72025-12-22Open weights
GLM-4.62025-09-30MIT
GLM-4.52025-07-28MIT

FAQ

Is GLM-4.5 open source?

Yes. GLM-4.5 is released under the permissive MIT license with open weights — including base, hybrid thinking/non-thinking and FP8 variants — so it can be used commercially, fine-tuned and self-hosted.

How big is GLM-4.5?

GLM-4.5 is a Mixture-of-Experts model with 355 billion total parameters and 32 billion active per forward pass. A smaller GLM-4.5-Air variant has 106 billion total and 12 billion active parameters.

What is GLM-4.5 best at?

It is an agent-native model built to unify reasoning, coding and tool use. Z.ai reports a 90.6% tool-calling success rate, plus strong agentic and coding scores such as 64.2 on SWE-bench Verified and 70.1 on TAU-bench.

How much does the GLM-4.5 API cost?

The standard Z.ai list price for GLM-4.5 is about $0.60 per million input tokens and $2.20 per million output tokens. The lighter GLM-4.5-Air variant is cheaper at roughly $0.20 input / $1.10 output.