Liquid AI · 2026-04-08 · notable

LFM2.5-VL-450M — Liquid AI's Tiny Vision-Language Model for Edge

Rating: 3
Author: AI/TLDR

450M-parameter open-weight vision-language model with bounding box prediction, 8-language support, and sub-250ms inference on edge hardware from phones to Jetson Orin.

[Figure: LFM2.5-VL-450M benchmark comparison chart showing improvements over the previous version]

A 450M-parameter vision-language model that runs on a phone, understands 8 languages, and can locate objects in images.

Key specs

Parameters	450M
RefCOCO-M	81.28 (up from 0)
MMMB (multilingual)	68.09
Jetson Orin latency	242 ms

What is it?

LFM2.5-VL-450M is an open-weight vision-language model from Liquid AI. At 450 million parameters, it is small enough to run on edge devices like the NVIDIA Jetson Orin, AMD Ryzen AI laptops, and flagship phones, while handling image understanding, object grounding with bounding boxes, and function calling.

How does it work?

The model uses LFM2.5-350M as its language backbone and SigLIP2 NaFlex as its vision encoder. Training was scaled from 10T tokens in the previous version to 28T tokens, followed by preference optimization and reinforcement learning. For visual grounding tasks, the model outputs structured JSON containing normalized bounding box coordinates.
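Because normalized coordinates are independent of image size, they have to be scaled back to pixels before a box can be drawn or cropped. The following is a minimal sketch of that step, not the model's documented output format: the field names `label` and `bbox`, the corner order `[x_min, y_min, x_max, y_max]`, and the 0-to-1 range are assumptions made for illustration, so check the model card for the actual schema.

```python
import json
from typing import List, Tuple

# A pixel-space box: (x_min, y_min, x_max, y_max)
PixelBox = Tuple[int, int, int, int]


def clamp01(value: float) -> float:
    """Clamp a normalized coordinate into the [0, 1] range."""
    return max(0.0, min(1.0, value))


def to_pixel_boxes(raw_json: str, width: int, height: int) -> List[Tuple[str, PixelBox]]:
    """Convert normalized boxes in a JSON string to pixel coordinates.

    ASSUMED schema (hypothetical, for illustration only):
        [{"label": "dog", "bbox": [x_min, y_min, x_max, y_max]}, ...]
    with every coordinate expressed as a fraction of image width or height.
    """
    detections = json.loads(raw_json)
    boxes = []
    for det in detections:
        x_min, y_min, x_max, y_max = (clamp01(v) for v in det["bbox"])
        # x values scale with image width, y values with image height.
        box = (
            round(x_min * width),
            round(y_min * height),
            round(x_max * width),
            round(y_max * height),
        )
        boxes.append((det["label"], box))
    return boxes


# Example with a hand-written response string (not real model output).
example = '[{"label": "dog", "bbox": [0.25, 0.5, 0.75, 1.0]}]'
print(to_pixel_boxes(example, width=640, height=480))
```

Clamping guards against coordinates that fall slightly outside the image, which can happen with any generated output; a production parser would also want to handle malformed JSON.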

Why does it matter?

Vision-language models that can ground objects in images usually require billions of parameters and cloud GPUs. LFM2.5-VL-450M brings this capability to edge devices, with 242 ms latency on the Jetson Orin (roughly four images per second if processed one at a time). That enables real-time visual understanding in drones, robots, and mobile apps without a network connection.

Who is it for?

Edge AI developers, robotics engineers, and mobile app developers.

Try it

playground.liquid.ai/chat?model=lfm2.5-vl-450m