Hugging Face · 2026-07-03 · major

Hugging Face Transformers 5.13.0 — nine new architectures and unified HfExporter

Hugging Face Transformers 5.13.0 lands nine new model architectures — including Kimi K2.5–K2.7, MiMo-V2-Flash, Zyphra ZAYA, VideoPrism, RADIO, and MiniCPM3 — and introduces HfExporter, a unified export API covering PyTorch, ONNX, and ExecuTorch.

The reference AI library gains nine architectures in one drop plus one export command for every runtime.

Quick facts

Maker	Hugging Face
Version	5.13.0
Released	2026-07-03
License	Apache-2.0
New architectures	9 (Kimi K2.5–K2.7, MiMo-V2-Flash, Nemotron 3.5 ASR, Qwen3 ASR, ZAYA, VideoPrism, RADIO, MiniCPM3)
New feature	HfExporter — unified PyTorch, ONNX, ExecuTorch export
GitHub stars	162K

What is it?

Hugging Face Transformers 5.13.0 is the July 3 release of the reference Python library that defines how thousands of open-weight models load, run, and export. This drop widens the supported-model roster by nine architectures and folds three previously separate exporters into a single HfExporter API.

How does it work?

On the model side, Transformers 5.13.0 adds first-class classes for Kimi K2.5–K2.7, MiMo-V2-Flash, ZAYA, VideoPrism, RADIO, and MiniCPM3, plus the Nemotron 3.5 ASR and Qwen3 ASR speech models. On the export side, a single HfExporter call is routed to DynamoExporter, OnnxExporter, or ExecutorchExporter; dynamic shapes are on by default, and prefill and decode paths are split automatically for generative and vision-language models.
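
The routing described above can be pictured with a small toy model. This is only a sketch of the dispatch idea, not the Transformers API: `plan_export`, `ExportPlan`, and the backend keys are hypothetical names invented for illustration, and the article does not document HfExporter's real call signature.

```python
from dataclasses import dataclass

# Toy stand-ins only: none of these names come from the transformers library.
# Each key maps to the runtime family the article associates with a backend.
BACKENDS = {
    "dynamo": "PyTorch AOT",
    "onnx": "ONNX / ORT / TensorRT",
    "executorch": "mobile and edge",
}


@dataclass(frozen=True)
class ExportPlan:
    backend: str
    target: str
    dynamic_shapes: bool
    graphs: tuple


def plan_export(backend, generative, dynamic_shapes=True):
    """Describe what one unified export call would need to produce."""
    if backend not in BACKENDS:
        raise ValueError(f"unknown backend {backend!r}; expected one of {sorted(BACKENDS)}")
    # Generative models get separate prefill and decode graphs;
    # everything else is exported as a single forward graph.
    graphs = ("prefill", "decode") if generative else ("forward",)
    return ExportPlan(backend, BACKENDS[backend], dynamic_shapes, graphs)


if __name__ == "__main__":
    print(plan_export("onnx", generative=True))
```

The point of the sketch is the shape of the design: the caller picks a target once, and the phase split and dynamic-shape default are decided centrally instead of in per-runtime scripts.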

Why does it matter?

Because downstream frameworks and inference stacks build on Transformers, native-class support in 5.13.0 is what turns "there's a paper and weights" into "you can call from_pretrained today." HfExporter's single-command export also shortens the multi-step porting work teams do to ship the same model to server GPUs, ONNX runtimes, and phones.

Who is it for?

ML engineers, model producers, teams porting models to ONNX or ExecuTorch, and teams running on-device inference.

Frequently asked questions

Which new models does Hugging Face Transformers 5.13.0 support out of the box?
Transformers 5.13.0 adds native classes for Kimi K2.5, K2.6, and K2.7 from Moonshot; Xiaomi's MiMo-V2-Flash 27T MoE with a 256K window; Nvidia Nemotron 3.5 ASR and its streaming variant; Qwen3 ASR; Zyphra ZAYA (760M active / 8.4B total MoE); Google VideoPrism; Nvidia RADIO vision foundation models; and MiniCPM3, a 4B dense model using Multi-head Latent Attention.

What does HfExporter do in Transformers 5.13.0?
HfExporter is a unified export system with three backends: DynamoExporter for PyTorch AOT, OnnxExporter for ONNX/ORT/TensorRT, and ExecutorchExporter for mobile and edge. Transformers 5.13.0 turns dynamic shapes on by default and automatically splits prefill and decode paths for generative and vision-language models, so a single export command handles both phases correctly.

Are there breaking changes in Transformers 5.13.0?
Yes. Transformers 5.13.0 standardizes layer declarations and cache construction across modeling code; fixes Gemma 3 and Gemma 4 sliding-window attention boundaries, which can change model outputs and so break exact reproducibility of earlier runs; and corrects the Expert Parallelism (EP) router contract, which requires configuration updates in downstream code that used EP routing directly.

How do I upgrade to Hugging Face Transformers 5.13.0?
Upgrade to Hugging Face Transformers 5.13.0 with pip install --upgrade transformers==5.13.0; in a fresh environment, pip install transformers==5.13.0 is enough. The release keeps its Apache-2.0 license and is published on PyPI. The release notes on GitHub list the per-model additions and the full breaking-change checklist for teams pinning to 5.x.
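
To see why a sliding-window boundary fix like the one mentioned among the breaking changes can alter outputs, consider how an off-by-one in the window edge changes which tokens a query can see. The article does not say what the Gemma fix actually changed, so the two conventions below are assumptions chosen purely for illustration, and `visible_keys` and `changed_pairs` are hypothetical helpers.

```python
def visible_keys(q, window, include_boundary):
    """Key positions a causal query at position q may attend to."""
    # Hypothetical conventions: the window either stops at q - window + 1
    # or also admits the boundary position q - window.
    oldest = q - window if include_boundary else q - window + 1
    return list(range(max(oldest, 0), q + 1))


def changed_pairs(seq_len, window):
    """(query, key) pairs whose visibility differs between the two conventions."""
    pairs = []
    for q in range(seq_len):
        narrow = set(visible_keys(q, window, include_boundary=False))
        wide = set(visible_keys(q, window, include_boundary=True))
        pairs.extend((q, k) for k in sorted(wide - narrow))
    return pairs


if __name__ == "__main__":
    print(changed_pairs(seq_len=6, window=3))
```

With a sequence of 6 tokens and a window of 3, this prints `[(3, 0), (4, 1), (5, 2)]`: every query past the window length gains or loses exactly one key. Sequences shorter than the window are unaffected, which is why such a change may only show up in regression tests that use long inputs.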

Try it

pip install --upgrade transformers==5.13.0
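
After upgrading, it can be worth confirming which version the environment actually resolved. The sketch below uses only the standard library's `importlib.metadata`; the helper names (`parse_version`, `installed_version`, `is_at_least`) are illustrative, and the parser is a simplification that ignores pre-release suffixes rather than a full implementation of Python's version rules.

```python
from importlib import metadata


def parse_version(text):
    """Turn '5.13.0' into (5, 13, 0); stops at the first non-numeric component."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def installed_version(package="transformers"):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_at_least(installed, required="5.13.0"):
    """Compare numerically, so '5.9.0' is correctly treated as older than '5.13.0'."""
    return installed is not None and parse_version(installed) >= parse_version(required)


if __name__ == "__main__":
    found = installed_version()
    print(f"transformers installed: {found}; meets 5.13.0: {is_at_least(found)}")
```

Comparing integer tuples instead of raw strings avoids the classic mistake where "5.9.0" sorts after "5.13.0" alphabetically.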