AI/TLDR

PaddlePaddle · 2026-06-22 · major

PP-OCRv6 — PaddlePaddle ships 50-language OCR family from 1.5M to 34.5M params

PP-OCRv6 is the next PaddleOCR family with Tiny (1.5M), Small (7.7M), and Medium (34.5M) tiers covering 50 languages. The Medium tier lifts detection Hmean to 86.2% and recognition accuracy to 83.2%, gains of 4.6 and 5.1 points over PP-OCRv5_server.


PaddlePaddle's PP-OCRv6 is a three-tier OCR family, from Tiny (1.5M parameters) to Medium (34.5M), that recognises 50 languages; its Medium tier beats PP-OCRv5_server on both detection Hmean and recognition accuracy.

Key specs

Smallest tier: 1.5M params
Largest tier: 34.5M params
Languages: 50

Quick facts

Maker: PaddlePaddle (Baidu open source)
Tiers: Tiny 1.5M · Small 7.7M · Medium 34.5M
Languages: 50 (Simplified/Traditional Chinese, English, Japanese + 46 Latin)
Detection Hmean (Medium): 86.2% (+4.6 vs PP-OCRv5_server)
Recognition accuracy (Medium): 83.2% (+5.1 vs PP-OCRv5_server)
Availability: Hugging Face, PaddleOCR docs, online demo

Benchmarks

Detection Hmean
PP-OCRv6 Medium (34.5M): 86.2%
PP-OCRv6 Small (7.7M): 84.1%
PP-OCRv6 Tiny (1.5M): 80.6%

Recognition accuracy
PP-OCRv6 Medium (34.5M): 83.2%
PP-OCRv6 Small (7.7M): 81.3%
PP-OCRv6 Tiny (1.5M): 73.5%

What is it?

PP-OCRv6 is the next generation of the PaddleOCR family, shipped as three open-weight tiers: Tiny (1.5M parameters), Small (7.7M), and Medium (34.5M). The release covers 50 languages and targets documents, screenshots, industrial labels, and scene text.

How does it work?

Each PP-OCRv6 tier is a paired text-detection plus text-recognition model, tuned for a specific size/accuracy point. The Medium tier reports 86.2% detection Hmean and 83.2% recognition accuracy — 4.6 and 5.1 points above PP-OCRv5_server, the prior production-grade build.
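
To make the two headline numbers concrete: in text-detection benchmarks, Hmean is conventionally the harmonic mean of precision and recall, and the PP-OCRv5_server baselines follow from the stated gains by subtraction. The Python sketch below illustrates both. It is not PaddleOCR code; the function and constant names are hypothetical, and the baseline values are derived from the figures above rather than quoted from PaddlePaddle.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the usual detection Hmean)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when nothing is detected
    return 2 * precision * recall / (precision + recall)


def implied_baseline(score: float, gain: float) -> float:
    """Baseline implied by a reported score and its gain, in percentage points."""
    return round(score - gain, 1)


# Medium-tier figures and gains as reported for PP-OCRv6 vs PP-OCRv5_server.
V5_SERVER_DETECTION = implied_baseline(86.2, 4.6)    # derived, not quoted
V5_SERVER_RECOGNITION = implied_baseline(83.2, 5.1)  # derived, not quoted

if __name__ == "__main__":
    print(f"Implied PP-OCRv5_server detection Hmean: {V5_SERVER_DETECTION}%")
    print(f"Implied PP-OCRv5_server recognition accuracy: {V5_SERVER_RECOGNITION}%")
```

On these numbers, the implied PP-OCRv5_server baselines are 81.6% detection Hmean and 78.1% recognition accuracy.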

Why does it matter?

Production OCR pipelines have to fit on devices that range from phones to GPU servers, and most teams pick one model and live with the trade-off. PP-OCRv6 ships three drop-in tiers so the same toolkit covers edge, mid-tier, and server inference without retraining or stitching different vendors together.

Who is it for?

Document-AI engineers, edge-device builders, and teams running multilingual OCR workflows

Frequently asked questions

How does PP-OCRv6 compare to PP-OCRv5?
PaddlePaddle reports the PP-OCRv6 Medium tier raises detection Hmean to 86.2% and recognition accuracy to 83.2%, gains of 4.6 and 5.1 percentage points over PP-OCRv5_server. The new family also widens the size range — a Tiny tier at 1.5M parameters now sits beside Small (7.7M) and Medium (34.5M) builds.
Which languages does PP-OCRv6 support?
PP-OCRv6 covers 50 languages, including Simplified Chinese, Traditional Chinese, English, Japanese, and 46 Latin-script languages in the Small and Medium tiers. The family was tuned for mixed real-world inputs (documents, screenshots, industrial labels, and scene text) and has the largest language footprint in PaddleOCR's history.
Is PP-OCRv6 small enough to run on the edge?
Yes. The PP-OCRv6 Tiny tier is 1.5M parameters, which fits comfortably on mobile and embedded hardware. The trade-off is accuracy: Tiny reports 80.6% detection Hmean and 73.5% recognition accuracy, versus 86.2%/83.2% on the 34.5M Medium tier. PaddlePaddle's launch blog post recommends Tiny for latency-sensitive devices and Medium for server inference.
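
The size/accuracy trade-off in this answer can be sketched as a small lookup: given a parameter budget, pick the largest tier that fits. This is an illustrative Python sketch, not part of PaddleOCR. `Tier`, `TIERS`, `pick_tier`, and `fp32_weight_mb` are hypothetical names, and the memory estimate assumes 4 bytes per parameter for the weights only, ignoring activations and runtime overhead.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    name: str
    params_m: float      # parameter count, in millions
    det_hmean: float     # reported detection Hmean, percent
    rec_accuracy: float  # reported recognition accuracy, percent


# Figures as reported for the three PP-OCRv6 tiers.
TIERS = [
    Tier("Tiny", 1.5, 80.6, 73.5),
    Tier("Small", 7.7, 84.1, 81.3),
    Tier("Medium", 34.5, 86.2, 83.2),
]


def fp32_weight_mb(tier: Tier) -> float:
    """Rough weight size in MB (10^6 bytes), assuming 4-byte FP32 parameters."""
    return tier.params_m * 4


def pick_tier(max_params_m: float) -> Optional[Tier]:
    """Return the largest tier within the parameter budget, or None if none fits."""
    fitting = [t for t in TIERS if t.params_m <= max_params_m]
    return max(fitting, key=lambda t: t.params_m) if fitting else None
```

For example, `pick_tier(10)` returns the Small tier, while a budget below 1.5M parameters returns `None`. Under the FP32 assumption, the Tiny tier's weights come to roughly 6 MB.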
Where do I download PP-OCRv6?
PP-OCRv6 weights live in the PaddlePaddle/PP-OCRv6 Hugging Face collection, with an online demo Space and full documentation on paddleocr.ai. The official launch is a PaddlePaddle blog post on Hugging Face dated June 22, 2026, co-authored with Hugging Face engineers and PaddleOCR maintainers.

Try it

huggingface.co/spaces/PaddlePaddle/PP-OCRv6_Online_Demo

Tags

  • ocr
  • paddlepaddle
  • paddleocr
  • multilingual
  • open-source
  • lightweight-models
  • edge-ai
  • baidu
