AI/TLDR

Liquid AI · 2026-04-08 · notable

LFM2.5-VL-450M — Liquid AI's Tiny Vision-Language Model for Edge

450M-parameter open-weight vision-language model with bounding box prediction, 8-language support, and sub-250ms inference on edge hardware from phones to Jetson Orin.

Figure: LFM2.5-VL-450M benchmark comparison chart showing improvements over the previous version.

A 450M-parameter vision-language model that runs on a phone, understands 8 languages, and can locate objects in images.

Key specs

Parameters: 450M
RefCOCO-M: 81.28 (from 0)
MMMB (multilingual): 68.09
Jetson Orin latency: 242ms

What is it?

LFM2.5-VL-450M is an open-weight vision-language model from Liquid AI. At 450 million parameters, it is small enough to run on edge devices like the NVIDIA Jetson Orin, AMD Ryzen AI laptops, and flagship phones, while handling image understanding, object grounding with bounding boxes, and function calling.

How does it work?

The model uses LFM2.5-350M as its language backbone and SigLIP2 NaFlex as its vision encoder. Training grew from 10T tokens in the previous version to 28T tokens, followed by preference optimization and reinforcement learning. For visual grounding tasks, the model outputs structured JSON containing normalized bounding box coordinates.
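To make the grounding output concrete, here is a minimal sketch of turning normalized boxes into pixel coordinates. The exact JSON schema is not given here, so the shape used below (a list of objects with a "label" and a "bbox" of [x1, y1, x2, y2] values in the 0 to 1 range) is an assumption for illustration, as are the function name and the sample response.

```python
import json


def to_pixel_boxes(model_output: str, width: int, height: int) -> list[dict]:
    """Convert normalized [x1, y1, x2, y2] boxes to integer pixel coordinates.

    Assumes (hypothetically) that the model returns a JSON list of
    {"label": str, "bbox": [x1, y1, x2, y2]} with values in [0, 1].
    """
    detections = json.loads(model_output)
    results = []
    for det in detections:
        # Clamp to [0, 1] in case a value falls slightly out of range.
        x1, y1, x2, y2 = (min(max(v, 0.0), 1.0) for v in det["bbox"])
        results.append({
            "label": det["label"],
            "bbox_px": (
                round(x1 * width), round(y1 * height),
                round(x2 * width), round(y2 * height),
            ),
        })
    return results


# Hypothetical model response for a 640x480 image.
sample = '[{"label": "dog", "bbox": [0.25, 0.5, 0.75, 1.0]}]'
boxes = to_pixel_boxes(sample, width=640, height=480)
print(boxes)  # [{'label': 'dog', 'bbox_px': (160, 240, 480, 480)}]
```

Because the coordinates are normalized, the same response can be mapped onto any resolution of the source image; check the model card for the actual field names and coordinate order before relying on this layout.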

Why does it matter?

Vision-language models that can ground objects in images usually require billions of parameters and cloud GPUs. LFM2.5-VL-450M brings this capability to edge devices, with a reported latency of 242ms on the Jetson Orin, enabling near-real-time visual understanding in drones, robots, and mobile apps without a network connection.

Who is it for?

Edge AI developers, robotics engineers, and mobile app developers.

Try it

playground.liquid.ai/chat?model=lfm2.5-vl-450m

Tags

  • vision-language
  • edge-ai
  • on-device
  • open-weights
  • grounding
  • multilingual
