AI/TLDR

Data Formulator

Build and refine data visualizations with AI on an interactive canvas

Overview

Data Formulator is a Microsoft Research project for exploring data with visualizations powered by AI agents. It combines a visual UI with natural-language input, so you connect a data source, ask questions, and get charts you can edit, branch, and share on one interactive canvas.

It is aimed at two audiences: data and platform teams who wire up databases, warehouses, and BI sources once to give an org an AI-powered exploration layer, and analysts who want to ask, edit, and share insights without writing query and plotting code by hand.

As a data-app builder, it runs locally as a Python package and connects to your own model provider. It supports OpenAI, Azure, Ollama, and Anthropic through LiteLLM, and can read from sources like PostgreSQL, MySQL, MSSQL, BigQuery, S3, and Azure Blob, plus files, images, and text.

What it does

  • Natural-language plus direct-manipulation UI for creating and editing charts on a visual canvas
  • A unified Data Agent with thread memory that inspects data, runs sandboxed code, and recommends next steps
  • Data Thread keeps questions, intermediate results, and charts navigable so you can revisit steps, branch alternatives, and compare side by side
  • Connectors for databases, warehouses, BI systems, object stores, and files (PostgreSQL, MySQL, MSSQL, BigQuery, S3, Azure Blob, and more)
  • 30+ chart types via a semantic chart engine, plus a style-refinement agent for presentation-ready visuals
  • Bring-your-own model: OpenAI, Azure, Ollama, and Anthropic supported through LiteLLM

Getting started

Data Formulator runs locally as a Python package; install it with pip or run it instantly with uvx, then open the local web UI.

Install with pip

Install the package from PyPI into your Python environment.

bashbash
pip install data_formulator

Or run instantly with uvx

Use uvx to download and run it without a manual install step.

bashbash
uvx data_formulator

Start the app

Launch Data Formulator, then open the local URL it prints in your browser and configure your model provider (OpenAI, Azure, Ollama, or Anthropic).

bashbash
data_formulator

Commands and code are distilled from the project's own documentation — always check the official repo for the latest.

When to use it

  • Letting analysts ask questions of a connected database and get editable charts without writing SQL or plotting code
  • Giving a data or platform team a reusable AI exploration layer over warehouses and BI sources
  • Extracting structured data from Excel files, images, websites, or text and turning it into visualizations
  • Branching into alternative views of the same dataset and comparing them side by side, then exporting a report as image or PDF

How Data Formulator compares

Data Formulator alongside other open-source data app builders tools AI/TLDR tracks, ranked by GitHub stars.

ToolStarsWhat it does
Streamlit★ 45kA Python framework that turns scripts into interactive data and ML web apps with simple widget calls and no frontend code.
Gradio★ 43kA Python library for quickly building shareable web demos and UIs for machine learning models, APIs, and arbitrary functions.
Reflex★ 28.6kA framework for building full-stack web apps entirely in Python, compiling component code to a React frontend and Python backend.
Dash★ 24.3kA Python framework from Plotly for building analytical web dashboards and data apps with interactive charts and no JavaScript required.
marimo★ 21.5kA reactive Python notebook stored as plain Python that can be run as a script or deployed as an interactive data app.
NiceGUI★ 15.9kA backend-first Python UI framework built on FastAPI and Vue for creating web interfaces, dashboards, and internal tools.
Data Formulator★ 15.8kBuild and refine data visualizations with AI on an interactive canvas
Mesop★ 6.6kA Python UI framework, started at Google, for rapidly building AI demos and internal web apps using composable components.