Overview
GLM-4-9B is the open 9-billion-parameter model in Zhipu AI's GLM-4 family (the lab now operates as Z.ai), released on 2024-06-05 by Team GLM at Tsinghua University and published under the THUDM organization on Hugging Face. It succeeds the earlier ChatGLM line and ships in several forms: the GLM-4-9B base model, the human-preference-aligned GLM-4-9B-Chat, a long-context GLM-4-9B-Chat-1M variant, and the separate GLM-4V-9B vision model.
GLM-4-9B-Chat is the flagship of the line. It supports a 128K-token context window (with a 1M-token variant available), built-in tool calling, web browsing, and code execution, and works across 26 languages including Japanese, Korean and German. According to its model card and the ChatGLM technical report, GLM-4-9B and GLM-4-9B-Chat outperform Llama-3-8B across semantics, math, reasoning, code and knowledge evaluations.
Because the weights are open, GLM-4-9B can be run locally or self-hosted, and it is also served free of charge through aggregators such as OpenRouter. The detailed methodology behind the family is documented in the paper 'ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools' (arXiv:2406.12793).
| Released | 2024-06-05 |
|---|---|
| License | GLM-4 License (custom open weights) |
| Weights | Open weights |
| Parameters | 9B |
| Context | 128K |
| Max output | Not publicly disclosed |
| Architecture | Dense transformer (GLM-4 architecture) |
| Knowledge cutoff | Not publicly disclosed |
| Modalities | Text |
| Status | Available (open weights) |
Benchmarks
- MMLU (GLM-4-9B-Chat)72.4%
- C-Eval (GLM-4-9B-Chat)75.6%
- GSM8K (GLM-4-9B-Chat)79.6%
- MATH (GLM-4-9B-Chat)50.6%
- HumanEval (GLM-4-9B-Chat)71.8%
- IFEval (GLM-4-9B-Chat)69%
- AlignBench-v2 (GLM-4-9B-Chat)6.61%
- MT-Bench (GLM-4-9B-Chat)8.35%
- MMLU (GLM-4-9B base)74.7%
- C-Eval (GLM-4-9B base)77.1%
- GSM8K (GLM-4-9B base)84%
- GPQA (GLM-4-9B base)34.3%
- HumanEval (GLM-4-9B base)70.1%
Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.
Pricing
| Input | Free / 1M tokens |
|---|---|
| Output | Free / 1M tokens |
Served free of charge via OpenRouter (thudm/glm-4-9b:free). Open weights can also be self-hosted at no licensing cost.
Strengths
- Strong capability-per-parameter for a 9B model — beats Llama-3-8B on the lab's reported semantics, math, reasoning, code and knowledge evals
- 128K context in GLM-4-9B-Chat, with a dedicated 1M-token long-context variant
- Native tool calling, web browsing and code execution support
- Multilingual across 26 languages (incl. Japanese, Korean, German)
- Open weights — runs locally / self-hosted, small enough for single-GPU or on-device use
- Available free through OpenRouter for low-cost experimentation
Best for
- Local and on-device chat assistants where a small open model is preferred
- Long-document analysis using the 128K / 1M context variants
- Tool-using and function-calling agents on a self-hosted model
- Multilingual chat and translation across 26 languages
- Research and fine-tuning on a permissively distributed open-weight base
- Cost-sensitive prototyping via the free OpenRouter endpoint
How to access
| Provider | Model ID |
|---|---|
| OpenRouter ↗ | thudm/glm-4-9b:free |
| Hugging Face ↗ | THUDM/glm-4-9b-chat |
FAQ
Is GLM-4-9B open source?
The weights are openly released by Team GLM (THUDM / Zhipu) on Hugging Face under the custom GLM-4 License, so you can download, run and fine-tune the model locally. Use of the weights must comply with the GLM-4 LICENSE terms in the repository.
What context length does GLM-4-9B support?
GLM-4-9B-Chat supports up to 128K tokens. There is also a dedicated GLM-4-9B-Chat-1M variant that extends the context to 1M tokens. The plain base model uses a shorter window.
How does GLM-4-9B compare to Llama-3-8B?
Per its model card and the ChatGLM technical report, both GLM-4-9B and the aligned GLM-4-9B-Chat outperform Llama-3-8B across semantics, math, reasoning, code and knowledge evaluations. GLM-4-9B-Chat scores 72.4 on MMLU, 79.6 on GSM8K, 50.6 on MATH and 71.8 on HumanEval.
Can GLM-4-9B handle images?
The text GLM-4-9B model is text-only. For vision, the family includes a separate multimodal model, GLM-4V-9B, built on the same 9B base, which handles image understanding.