Hugging Face · 2026-07-03 · major
Hugging Face Transformers 5.13.0 — nine new architectures and unified HfExporter
Hugging Face Transformers 5.13.0 lands nine new model architectures — including Kimi K2.5–K2.7, MiMo-V2-Flash, Zyphra ZAYA, VideoPrism, RADIO, and MiniCPM3 — and introduces HfExporter, a unified export API covering PyTorch, ONNX, and ExecuTorch.
The reference AI library gains nine architectures in one drop, plus a single export API that targets PyTorch, ONNX, and ExecuTorch.
Quick facts
| Maker | Hugging Face |
|---|---|
| Version | 5.13.0 |
| Released | 2026-07-03 |
| License | Apache-2.0 |
| New architectures | 9 (Kimi K2.5–K2.7, MiMo-V2-Flash, Nemotron 3.5 ASR, Qwen3 ASR, ZAYA, VideoPrism, RADIO, MiniCPM3) |
| New feature | HfExporter — unified PyTorch, ONNX, ExecuTorch export |
| GitHub stars | 162K |
What is it?
Hugging Face Transformers 5.13.0 is the July 3 release of the reference Python library that defines how thousands of open-weight models load, run, and export. This drop widens the supported-model roster by nine architectures and folds three previously separate exporters into a single HfExporter API.
How does it work?
On the model side, Transformers 5.13.0 adds first-class classes for Kimi K2.5–K2.7, MiMo-V2-Flash, ZAYA, VideoPrism, RADIO, MiniCPM3, and the Nemotron 3.5 ASR and Qwen3 ASR speech models. On the runtime side, HfExporter routes a single call through DynamoExporter, OnnxExporter, or ExecutorchExporter, with dynamic shapes on by default and prefill/decode splitting handled automatically.
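The routing described above can be pictured with a small dispatch sketch. This is not HfExporter's actual signature, which the release summary does not show. Only the three backend names come from the text; `ExportPlan`, `plan_export`, and the target strings are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Backend names are taken from the release summary. The target strings that
# select them are an assumption made for this sketch.
BACKENDS: Dict[str, str] = {
    "pytorch": "DynamoExporter",
    "onnx": "OnnxExporter",
    "executorch": "ExecutorchExporter",
}


@dataclass(frozen=True)
class ExportPlan:
    """Hypothetical description of what one export call would do."""
    backend: str
    dynamic_shapes: bool
    phases: Tuple[str, ...]


def plan_export(target: str, generative: bool, dynamic_shapes: bool = True) -> ExportPlan:
    """Pick a backend for `target` and decide which graphs to export.

    Generative models get separate prefill and decode phases; other models
    get a single forward phase. Dynamic shapes default to on, mirroring the
    default described in the text.
    """
    try:
        backend = BACKENDS[target.lower()]
    except KeyError:
        raise ValueError(
            f"unknown target {target!r}; expected one of {sorted(BACKENDS)}"
        ) from None
    phases = ("prefill", "decode") if generative else ("forward",)
    return ExportPlan(backend, dynamic_shapes, phases)
```

In this toy version, `plan_export("onnx", generative=True)` selects the ONNX backend and plans two graphs, while a vision encoder exported with `generative=False` gets a single forward graph.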
Why does it matter?
Because Transformers 5.13.0 is what downstream frameworks and inference stacks build on, native-class support here is what turns "there's a paper and weights" into "you can call from_pretrained today." The single-command HfExporter also cuts the multi-step porting dance teams do to ship the same model to server GPUs, ONNX runtimes, and phones.
Who is it for?
ML engineers, model producers, teams porting models to ONNX / ExecuTorch, and developers building on-device inference
Frequently asked questions
- Which new models does Hugging Face Transformers 5.13.0 support out of the box?
- Transformers 5.13.0 adds native classes for Kimi K2.5, K2.6, and K2.7 from Moonshot; Xiaomi's MiMo-V2-Flash, an MoE model trained on 27T tokens with a 256K context window; Nvidia Nemotron 3.5 ASR and its streaming variant; Qwen3 ASR; Zyphra ZAYA (760M active / 8.4B total MoE); Google VideoPrism; Nvidia RADIO vision foundation models; and MiniCPM3, a 4B dense model using Multi-head Latent Attention.
- What does HfExporter do in Transformers 5.13.0?
- HfExporter is a unified export system with three backends — DynamoExporter for PyTorch AOT, OnnxExporter for ONNX/ORT/TensorRT, and ExecutorchExporter for mobile and edge. Transformers 5.13.0 wires dynamic shapes on by default and auto-splits prefill and decode paths for generative and vision-language models so a single export command handles both phases correctly.
- Are there breaking changes in Transformers 5.13.0?
- Yes. Transformers 5.13.0 standardizes layer declarations and cache construction across modeling code; fixes Gemma 3 and Gemma 4 sliding-window attention boundaries, which can change outputs relative to earlier releases and so break bit-for-bit reproducibility; and corrects the Expert Parallelism (EP) router contract, which requires configuration updates in downstream code that used EP routing directly.
- How do I upgrade to Hugging Face Transformers 5.13.0?
- To upgrade to Hugging Face Transformers 5.13.0, run pip install --upgrade transformers==5.13.0; in a fresh environment, pip install transformers==5.13.0 is enough. The release keeps Apache-2.0 licensing and is published on PyPI. The release notes on GitHub list per-model additions and the full breaking-change checklist for teams pinning to 5.x.
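The sliding-window item in the breaking-change answer is easier to see with a toy mask. The release summary does not say which boundary convention the Gemma fix adopted, so the sketch below only illustrates why moving the window edge by one position changes which keys each token can attend to, and therefore the model's outputs. The function names and the `inclusive` flag are hypothetical.

```python
from typing import List


def sliding_window_mask(seq_len: int, window: int, inclusive: bool) -> List[List[bool]]:
    """Build a causal sliding-window mask; mask[i][j] is True if query i sees key j.

    inclusive=False: query i sees keys i-window+1 .. i  (window keys, self included)
    inclusive=True:  query i sees keys i-window   .. i  (window+1 keys)
    Neither is claimed to be the convention Transformers uses.
    """
    mask = []
    for i in range(seq_len):
        # Oldest key position still visible under the chosen convention.
        start = i - window if inclusive else i - window + 1
        row = [max(start, 0) <= j <= i for j in range(seq_len)]
        mask.append(row)
    return mask


def visible_counts(mask: List[List[bool]]) -> List[int]:
    """Number of keys each query position can attend to."""
    return [sum(row) for row in mask]


narrow = visible_counts(sliding_window_mask(6, 3, inclusive=False))
wide = visible_counts(sliding_window_mask(6, 3, inclusive=True))
print(narrow)  # [1, 2, 3, 3, 3, 3]
print(wide)    # [1, 2, 3, 4, 4, 4]
```

The two conventions agree until the sequence grows past the window. From that point on, every later position sees one extra key under the inclusive convention, which is enough to shift logits and break comparisons against outputs recorded with an earlier release.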
Try it
pip install --upgrade transformers==5.13.0
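After running the command, a quick check that the environment actually picked up the pinned release might look like the sketch below. It uses only the standard library's `importlib.metadata`; the helper names (`parse_release`, `installed_version`, `meets_requirement`) are hypothetical, and the parser is deliberately simple: it compares only the numeric release segment and ignores pre-release suffixes.

```python
from importlib import metadata
from typing import Optional, Tuple

REQUIRED = "5.13.0"


def parse_release(version: str) -> Tuple[int, ...]:
    """Turn '5.13.0' into (5, 13, 0), stopping at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def installed_version(package: str = "transformers") -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def meets_requirement(installed: Optional[str], required: str = REQUIRED) -> bool:
    """True if `installed` is at least `required`, compared numerically."""
    if installed is None:
        return False
    return parse_release(installed) >= parse_release(required)


if __name__ == "__main__":
    found = installed_version()
    print(f"transformers installed: {found}; meets {REQUIRED}: {meets_requirement(found)}")
```

Comparing tuples of integers rather than raw strings matters here: as strings, "5.9.0" sorts after "5.13.0", which would wrongly pass the check.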