GLM-5-Turbo

Name: GLM-5-Turbo
Author: Z.ai (Zhipu / GLM)

Z.ai's proprietary, API-only GLM-5 variant tuned for fast, high-throughput agent workflows.

Overview

GLM-5-Turbo is a large language model from Z.ai (Zhipu / GLM), released on March 15, 2026 as part of the GLM Turbo line. Unlike the open-weight GLM-5 flagship, GLM-5-Turbo is a proprietary, closed-source companion model that is only available through Z.ai's hosted API and partner platforms — there is no weights download.

GLM-5-Turbo is a reasoning model that uses extended chain-of-thought before answering, and it is tuned specifically for high-throughput agentic workloads. Z.ai's release notes describe it as focused on stability and efficiency in long-chain agent tasks: stronger tool and skills integration, better decomposition of complex instructions, and more consistent execution across multi-step, multi-agent workflows.

It serves text in and text out (it is not multimodal), accepts a long context window of up to 262,144 tokens, and can return up to 131,072 tokens in a single response. On the Artificial Analysis Intelligence Index it scores 38, placing GLM-5-Turbo among the stronger models tracked there at the time of release.

Released	2026-03-15
License	Proprietary
Weights	API only
Parameters	~744B total / ~40B active (MoE, shared GLM-5 base)
Context	262K
Max output	128K
Architecture	Mixture-of-Experts built on the GLM-5 base (~744B total parameters, ~40B active) with DeepSeek Sparse Attention (DSA). Served as a throughput-optimised, closed API variant rather than a separately released checkpoint.
Modalities	Text
Status	Available

Benchmarks

Artificial Analysis Intelligence Index38%

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input	$1.20 / 1M tokens per 1M tokens
Output	$4.00 / 1M tokens per 1M tokens

Pricing source ↗

Strengths

Tuned for high-throughput, long-chain agent tasks with improved stability over many steps
Strong tool and skills integration plus reliable decomposition of complex, multi-step instructions
Long 262K-token context window with up to 128K-token outputs for large agent workloads
Built on the capable GLM-5 MoE base (~744B total / ~40B active params)
Competitive Artificial Analysis Intelligence Index score (38) for its price tier

Best for

Autonomous and multi-agent systems that run long tool-calling chains
Agentic coding and engineering assistants that decompose multi-step tasks
High-volume API workloads where throughput and stability matter
Long-document and long-context reasoning over large inputs
Tool-augmented assistants that browse, call functions, and orchestrate skills

How to access

Provider	Model ID
Z.ai ↗	`glm-5-turbo`
OpenRouter ↗	`z-ai/glm-5-turbo`

GLM Turbo — every version

The full lineage of the GLM Turbo line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

Version	Released	Context	License
GLM-5V-Turbocurrent	2026-04-01	—	Open weights
GLM-5-Turbo	2026-03-15	—	Open weights

FAQ

Is GLM-5-Turbo open source or open weights?

No. Unlike the GLM-5 flagship, which Z.ai released with open weights, GLM-5-Turbo is a proprietary, closed-source companion model. It is available only through Z.ai's hosted API and partner platforms such as OpenRouter — there is no weights download.

What is GLM-5-Turbo best at?

It is optimised for high-throughput agentic workloads: long tool-calling chains, multi-step task decomposition, and multi-agent coordination. Z.ai positions it for fast inference and stability across extended agent runs rather than as a maximal-quality model.

What is GLM-5-Turbo's context window and pricing?

GLM-5-Turbo supports up to a 262K-token context window and up to 128K tokens of output. On OpenRouter it is priced at about $1.20 per million input tokens and $4.00 per million output tokens.

Is GLM-5-Turbo multimodal?

No. GLM-5-Turbo is text-in, text-out only. For vision tasks, Z.ai offers a separate GLM-5V-Turbo multimodal variant.

// Overview

// Benchmarks

// Pricing

// Strengths

// Best for

// How to access

// GLM Turbo — every version

// FAQ