AI/TLDR

What Is Mastra? TypeScript Agents Beyond the Vercel AI SDK

Learn what Mastra adds on top of the Vercel AI SDK — durable workflows, agent memory, and built-in evals — for full agent apps in TypeScript.

INTERMEDIATE · 12 MIN READ · UPDATED 2026-06-12

In plain English

The Vercel AI SDK is a great engine. It gives you a unified interface over OpenAI, Anthropic, Google, and dozens of other model providers, plus streaming helpers and structured output. But an engine alone does not make a car. You still need seats, a steering wheel, a fuel tank, and a map.

Mastra is the assembled car. It is an open-source TypeScript framework built on top of the Vercel AI SDK that adds the primitives real production agents need: durable multi-step workflows, persistent memory across sessions, a built-in eval harness for catching regressions, RAG pipeline utilities, and a local dev studio for inspecting agent state. The team behind it previously built Gatsby, the React static-site framework.

Think of Mastra the way you think of Next.js relative to React. React renders UI; Next.js adds routing, server-side rendering, API routes, and a deployment model. Mastra follows the same pattern: the Vercel AI SDK handles the LLM call; Mastra adds the application layer that turns a single chat completion into a dependable, observable agent system.

Why it matters

The Vercel AI SDK solves the model-routing problem elegantly. But once you move past a chat UI demo and into a real product, you run into a wall of infrastructure problems it was never designed to solve.

The four gaps the AI SDK leaves open

  • Memory: your agent forgets everything between sessions. Building a memory layer from scratch means picking a vector database, writing embedding logic, managing thread IDs, handling context-window overflow, and wiring all of it together.
  • Durability: a multi-step agent that runs for several minutes will fail mid-execution the moment you redeploy your server. Nothing in the AI SDK preserves workflow state across restarts.
  • Evals: you have no automated way to detect when a prompt change silently degrades output quality. Without evals, regressions reach users.
  • Observability: tracing a multi-tool agent call through logs alone is painful. You need structured traces that link each LLM call, tool invocation, and step output into a single readable timeline.

Mastra ships answers to all four gaps as first-class primitives rather than leaving you to assemble them from third-party libraries. The result is that a TypeScript developer can go from npm create mastra@latest to a production-grade agent with memory, workflows, and evals in hours instead of weeks.

How it works

Mastra organises every agent application around five primitives: Agents, Tools, Workflows, Memory, and Evals. You compose these in a central Mastra instance that wires them together and exposes a local dev studio.

Agents

A Mastra agent wraps an LLM with a system prompt, a set of tools, and optionally a memory instance. Agents use LLM reasoning to decide which tools to call and iterate until they reach a stopping condition. Because Mastra delegates to the Vercel AI SDK under the hood, every model supported by the AI SDK — OpenAI, Anthropic, Google, Mistral, xAI, and more — is automatically available.

Minimal Mastra agent (TypeScript)
import { Mastra } from '@mastra/core';
import { Agent } from '@mastra/core/agent';
import { openai } from '@ai-sdk/openai';

const researchAgent = new Agent({
  name: 'Research Agent',
  instructions: 'You are a helpful research assistant.',
  model: openai('gpt-4o'),
});

export const mastra = new Mastra({
  agents: { researchAgent },
});

Workflows

Where agents let the LLM decide execution order, Mastra workflows give you deterministic control. A workflow is a graph-based state machine built from typed steps. Each step declares its inputSchema and outputSchema using Zod (or Valibot, ArkType, or plain JSON Schema), so data flowing between steps is validated at every boundary.

Steps chain together with .then() for sequential execution, .branch() for conditionals, and .parallel() for fan-out. The workflow is finalised with .commit(); to execute it, you create a run from the workflow and call run.start(). Workflows can also be suspended mid-execution (for a human approval step, an API webhook, or rate-limit throttling) and resumed later from the exact snapshot that was saved.

A two-step Mastra workflow (TypeScript)
import { createStep, createWorkflow } from '@mastra/core/workflows';
import { z } from 'zod';

const fetchStep = createStep({
  id: 'fetch',
  inputSchema: z.object({ url: z.string() }),
  outputSchema: z.object({ body: z.string() }),
  execute: async ({ inputData }) => {
    const res = await fetch(inputData.url);
    return { body: await res.text() };
  },
});

const summariseStep = createStep({
  id: 'summarise',
  inputSchema: z.object({ body: z.string() }),
  outputSchema: z.object({ summary: z.string() }),
  execute: async ({ inputData, mastra }) => {
    const agent = mastra.getAgent('researchAgent');
    const result = await agent.generate(`Summarise: ${inputData.body}`);
    return { summary: result.text };
  },
});

export const fetchAndSummarise = createWorkflow({
  id: 'fetch-and-summarise',
  inputSchema: z.object({ url: z.string() }),
  outputSchema: z.object({ summary: z.string() }),
})
  .then(fetchStep)
  .then(summariseStep)
  .commit();
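
To make the suspend/resume idea concrete, here is a minimal sketch in plain TypeScript. It is not Mastra's API or implementation: the names Snapshot, runFrom, and resume are hypothetical, and the steps are synchronous to keep the example short. It only illustrates the general pattern of saving a serialisable snapshot at the point of suspension and continuing from it later.

```typescript
// Conceptual sketch only: plain TypeScript, not Mastra's API.
// Snapshot, runFrom, and resume are hypothetical names.

type StepResult =
  | { kind: 'done'; output: unknown }
  | { kind: 'suspend'; reason: string };

type Step = {
  id: string;
  run: (input: unknown, resumeData?: unknown) => StepResult;
};

type Snapshot = {
  stepIndex: number; // index of the step that suspended
  input: unknown; // input that step received
  reason: string;
};

type RunOutcome =
  | { status: 'success'; output: unknown }
  | { status: 'suspended'; snapshot: Snapshot };

function runFrom(steps: Step[], startIndex: number, input: unknown, resumeData?: unknown): RunOutcome {
  let current = input;
  let index = startIndex;
  for (const step of steps.slice(startIndex)) {
    // resumeData is only handed to the step that is being resumed
    const result = step.run(current, index === startIndex ? resumeData : undefined);
    if (result.kind === 'suspend') {
      return { status: 'suspended', snapshot: { stepIndex: index, input: current, reason: result.reason } };
    }
    current = result.output;
    index += 1;
  }
  return { status: 'success', output: current };
}

function resume(steps: Step[], snapshot: Snapshot, resumeData: unknown): RunOutcome {
  return runFrom(steps, snapshot.stepIndex, snapshot.input, resumeData);
}

const draftStep: Step = {
  id: 'draft',
  run: (input) => ({ kind: 'done', output: `Draft for ${String(input)}` }),
};

const approvalStep: Step = {
  id: 'approval',
  run: (input, resumeData) => {
    if (resumeData === undefined) return { kind: 'suspend', reason: 'needs human approval' };
    return { kind: 'done', output: `${String(input)} (approved: ${String(resumeData)})` };
  },
};

const steps = [draftStep, approvalStep];
const first = runFrom(steps, 0, 'Q3 report');
// The snapshot is plain data, so it can round-trip through JSON (a stand-in for a storage backend).
const stored = first.status === 'suspended' ? JSON.stringify(first.snapshot) : '';
const second = stored ? resume(steps, JSON.parse(stored) as Snapshot, true) : first;
console.log(second);
```

Because the snapshot holds only the step position and that step's input, whatever process loads it later can continue the run without repeating the earlier steps.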

Memory

Mastra ships @mastra/memory with three memory modes. Working memory stores structured facts about a user — name, preferences, goals — that persist across all conversations. Semantic recall embeds past messages and retrieves the most relevant ones by meaning when the context window would overflow. Observational memory uses a background compression agent to replace raw message history with a dense observation log, keeping the active context small while preserving long-term recall.

Memory requires a storage backend. The quickstart uses @mastra/libsql (SQLite-compatible). Production deployments can swap in Postgres, Upstash Redis, or any community-maintained adapter. The storage layer also persists workflow snapshots, so suspend/resume works across application restarts and redeployments.
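
As a rough illustration of what semantic recall does, the sketch below ranks stored messages by cosine similarity to a query vector and keeps the top K. This is not Mastra's implementation: StoredMessage and recallTopK are hypothetical names, and the three-dimensional vectors are hand-written stand-ins for what an embedding model would return.

```typescript
// Conceptual sketch: not Mastra's implementation. The vectors are hand-written
// stand-ins for real embeddings, and all names are hypothetical.

type StoredMessage = { id: string; text: string; embedding: number[] };

function dot(a: number[], b: number[]): number {
  return a.reduce((sum, value, i) => sum + value * (b[i] ?? 0), 0);
}

function cosineSimilarity(a: number[], b: number[]): number {
  const norm = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
  return norm === 0 ? 0 : dot(a, b) / norm;
}

// Return the topK stored messages most similar to the query vector.
function recallTopK(query: number[], messages: StoredMessage[], topK: number): StoredMessage[] {
  return messages
    .map((message) => ({ message, score: cosineSimilarity(query, message.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((entry) => entry.message);
}

const pastMessages: StoredMessage[] = [
  { id: 'm1', text: 'I prefer vegetarian recipes.', embedding: [0.9, 0.1, 0.0] },
  { id: 'm2', text: 'My flight leaves on Friday.', embedding: [0.0, 0.2, 0.9] },
  { id: 'm3', text: 'No mushrooms, please.', embedding: [0.8, 0.3, 0.1] },
];

// Hypothetical embedding of "What should I cook tonight?"
const queryEmbedding = [1, 0.2, 0];
const recalled = recallTopK(queryEmbedding, pastMessages, 2);
console.log(recalled.map((m) => m.id));
```

The two food-related messages rank above the unrelated one, which is how a recall step can surface older context even when those messages have fallen out of the recent-message window.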

Evals

Mastra treats evals as first-class primitives rather than a testing afterthought. You attach eval functions directly to an agent definition. Built-in scorers cover accuracy, answer relevance, context recall, and toxicity. Custom evals can use model-graded scoring, rule-based checks, or statistical measures. Evals run in CI and surface regressions before a prompt change ships to users.
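
The sketch below shows, in plain TypeScript, one way a rule-based check and a CI regression gate could fit together. It does not use Mastra's scorer API: EvalCase, keywordCoverage, and hasRegressed are hypothetical names, and the 0.05 tolerance is an arbitrary illustrative value.

```typescript
// Conceptual sketch: a rule-based scorer plus a CI-style regression gate.
// None of these names come from Mastra; they are hypothetical.

type EvalCase = { input: string; output: string; mustMention: string[] };

// Score = fraction of required keywords that appear in the output (0 to 1).
function keywordCoverage(testCase: EvalCase): number {
  if (testCase.mustMention.length === 0) return 1;
  const text = testCase.output.toLowerCase();
  const hits = testCase.mustMention.filter((k) => text.indexOf(k.toLowerCase()) !== -1);
  return hits.length / testCase.mustMention.length;
}

function meanScore(cases: EvalCase[]): number {
  if (cases.length === 0) return 0;
  return cases.reduce((sum, c) => sum + keywordCoverage(c), 0) / cases.length;
}

// True when the new score drops more than `tolerance` below the baseline.
function hasRegressed(baseline: number, current: number, tolerance = 0.05): boolean {
  return current < baseline - tolerance;
}

const baselineRun: EvalCase[] = [
  { input: 'Refund policy?', output: 'Refunds are issued within 30 days.', mustMention: ['refund', '30 days'] },
  { input: 'Shipping time?', output: 'Shipping takes 3 to 5 business days.', mustMention: ['shipping', 'business days'] },
];

// Same inputs after a hypothetical prompt change that made one answer vaguer.
const candidateRun: EvalCase[] = [
  { input: 'Refund policy?', output: 'We offer refunds.', mustMention: ['refund', '30 days'] },
  { input: 'Shipping time?', output: 'Shipping takes 3 to 5 business days.', mustMention: ['shipping', 'business days'] },
];

const baselineScore = meanScore(baselineRun);
const candidateScore = meanScore(candidateRun);
const regressed = hasRegressed(baselineScore, candidateScore);
console.log({ baselineScore, candidateScore, regressed });
```

In a CI job, a true result from the gate would fail the build, so the vaguer prompt never reaches users. Model-graded scorers follow the same shape, with the scoring function replaced by an LLM call.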

Mastra vs the Vercel AI SDK

The most common question about Mastra is whether it replaces the Vercel AI SDK or competes with it. The answer is neither: Mastra depends on the AI SDK and re-exports its streaming interface for use in React UIs. The @mastra/ai-sdk package provides drop-in route handlers so the same Mastra agent streams to useChat in a Next.js app without any glue code.

Capability | Vercel AI SDK alone | Mastra
Model routing (OpenAI, Anthropic, etc.) | Yes | Yes (delegates to AI SDK)
Streaming to React useChat / useCompletion | Yes | Yes (via @mastra/ai-sdk)
Structured output + tool calling | Yes | Yes
Durable multi-step workflows | No | Yes (createWorkflow)
Persist memory across sessions | No | Yes (@mastra/memory)
Built-in evals | No | Yes
RAG pipeline helpers | No | Yes
MCP server authoring | No | Yes
Local dev studio (Mastra Studio) | No | Yes
Suspend / resume mid-workflow | No | Yes

Getting started in five minutes

The fastest path is the Mastra CLI scaffold. It creates a project, installs dependencies, and drops in example agents and tools you can edit immediately.

Scaffold a new Mastra project (bash)
# Scaffold with OpenAI, agents + tools components, and an example
npm create mastra@latest my-agent -- --components agents,tools --llm openai --example

cd my-agent
npm install

# Start the local dev studio at http://localhost:4111
npm run dev

Mastra Studio opens in the browser and shows every registered agent, lets you send test messages, inspect memory threads, trace tool calls step by step, and run evals interactively. It replaces the console.log-driven debug loop that most developers default to when building agents.

Adding memory to an existing agent

Agent with persistent memory (TypeScript)
import { Memory } from '@mastra/memory';
import { LibSQLStore } from '@mastra/libsql';
import { Agent } from '@mastra/core/agent';
import { openai } from '@ai-sdk/openai';

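const memory = new Memory({
  // Threads and messages are persisted in a local SQLite-compatible file
  storage: new LibSQLStore({ url: 'file:local.db' }),
  options: {
    lastMessages: 20, // include the 20 most recent messages in context
    // Also retrieve the 5 most similar older messages. Depending on your Mastra
    // version, this may additionally require vector store and embedder options.
    semanticRecall: { topK: 5 },
  },
});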

const assistantAgent = new Agent({
  name: 'Assistant',
  instructions: 'You are a helpful assistant that remembers the user.',
  model: openai('gpt-4o-mini'),
  memory,
});

// Pass resource + thread IDs to scope memory per user/conversation
const response = await assistantAgent.generate(
  'What did we talk about last week?',
  { resourceId: 'user-123', threadId: 'thread-456' },
);

Common pitfalls and tradeoffs

Mastra is a fast-moving framework that reached v1.0 in January 2026, so some rough edges remain. Knowing the common traps ahead of time saves debugging time.

Pitfalls to watch out for

  • Schema drift between workflow steps. If you use Mastra to build a multi-step pipeline (a content pipeline, for example), keep your Zod schemas in sync across steps. Mastra validates at step boundaries, so a mismatch surfaces as a runtime validation error rather than as silent data corruption.
  • Workflow suspend/resume requires a persistent store. Suspended runs survive a restart only if you configure a persistent storage backend; without one, they are lost when the process exits. The default in-memory store is for development only.
  • Agent memory and workflow state share the same storage. For high-volume workloads, size your Postgres or LibSQL instance to handle both. Memory compression (observational memory) is recommended once threads grow past a few hundred messages.
  • Evals run sequentially by default. For large eval suites, configure parallelism or use the Mastra cloud eval runner to avoid CI timeout issues.
  • MCP server exports are stateless. When you expose a Mastra agent as an MCP server, memory is not automatically passed from the calling host. You must configure a shared storage backend and pass consistent resource/thread IDs from the client.
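
The first pitfall in the list above, schema drift, can be reproduced in a few lines. The sketch below uses hand-written validators instead of Zod to stay dependency-free; Validator, TypedStep, and runStep are hypothetical names, not Mastra's API. It shows a downstream step that was changed to expect a text field while the upstream step still emits body.

```typescript
// Conceptual sketch with hand-written validators (not Zod, not Mastra's API).
// All type and function names here are hypothetical.

type Validator<T> = (value: unknown) => T; // throws on mismatch

function field(value: unknown, key: string): string {
  if (typeof value === 'object' && value !== null) {
    const candidate = (value as Record<string, unknown>)[key];
    if (typeof candidate === 'string') return candidate;
  }
  throw new Error(`expected an object with a string "${key}" field`);
}

const hasUrl: Validator<{ url: string }> = (v) => ({ url: field(v, 'url') });
const hasBody: Validator<{ body: string }> = (v) => ({ body: field(v, 'body') });
const hasText: Validator<{ text: string }> = (v) => ({ text: field(v, 'text') });
const hasSummary: Validator<{ summary: string }> = (v) => ({ summary: field(v, 'summary') });

type TypedStep<I, O> = {
  id: string;
  input: Validator<I>;
  output: Validator<O>;
  execute: (input: I) => O;
};

function runStep<I, O>(step: TypedStep<I, O>, raw: unknown): O {
  const input = step.input(raw); // validate on the way in
  return step.output(step.execute(input)); // validate on the way out
}

const fetchLike: TypedStep<{ url: string }, { body: string }> = {
  id: 'fetch',
  input: hasUrl,
  output: hasBody,
  execute: ({ url }) => ({ body: `contents of ${url}` }),
};

// Drift: this step now expects `text`, but the upstream step still emits `body`.
const summariseLike: TypedStep<{ text: string }, { summary: string }> = {
  id: 'summarise',
  input: hasText,
  output: hasSummary,
  execute: ({ text }) => ({ summary: text.slice(0, 20) }),
};

let driftError = '';
try {
  runStep(summariseLike, runStep(fetchLike, { url: 'https://example.com' }));
} catch (error) {
  driftError = error instanceof Error ? error.message : String(error);
}
console.log(driftError);
```

The mismatch fails loudly at the boundary between the two steps instead of passing an undefined value downstream, which is the behaviour the pitfall describes.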

When Mastra is overkill

Mastra adds framework overhead — package installs, a Mastra instance, a storage backend, and a local studio process. For a simple one-off script that calls an LLM once and prints the result, the Vercel AI SDK alone is lighter and faster to set up. Mastra's value compounds as the agent grows: more steps, more sessions, more models, and more need for quality assurance.

Going deeper

Once you are comfortable with the core primitives, Mastra has several advanced capabilities worth exploring.

Deploying to durable execution engines

By default, Mastra runs workflows in its built-in in-process engine, which relies on stored snapshots to resume suspended runs. For workflows that must keep making progress through server restarts, deployment events, or long waits (hours, days), Mastra supports swapping in Inngest as a durable workflow runner. You configure an Inngest adapter on the workflow and Mastra handles the serialisation, scheduling, and state recovery. Your step code does not change.
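
The general idea behind durable execution is checkpoint-and-replay: each step's output is saved, and a restarted run reuses saved outputs instead of redoing the work. The sketch below illustrates that idea only. It is not how Mastra or Inngest implement it; durableStep and Checkpoints are hypothetical names, and a plain object stands in for a real database.

```typescript
// Conceptual sketch of checkpoint-and-replay; not Mastra's or Inngest's implementation.
// durableStep and Checkpoints are hypothetical names.

type Checkpoints = Record<string, string>; // step id -> JSON-serialised output

function durableStep<T>(store: Checkpoints, id: string, work: () => T): T {
  const saved: string | undefined = store[id];
  if (saved !== undefined) return JSON.parse(saved) as T; // replay: skip the work
  const output = work();
  store[id] = JSON.stringify(output); // checkpoint before moving on
  return output;
}

let fetchCalls = 0;
let crashOnce = true;

function pipeline(store: Checkpoints): string {
  const page = durableStep(store, 'fetch', () => {
    fetchCalls += 1;
    return { body: 'long article text' };
  });
  if (crashOnce) {
    crashOnce = false;
    throw new Error('simulated redeploy');
  }
  return durableStep(store, 'summarise', () => `summary of: ${page.body}`);
}

const checkpointStore: Checkpoints = {}; // stands in for Postgres, LibSQL, and so on
try {
  pipeline(checkpointStore);
} catch {
  // The process "restarts" here; the checkpoint store survives.
}
const finalSummary = pipeline(checkpointStore);
console.log(finalSummary, fetchCalls);
```

The second call reuses the stored fetch result, so the expensive step runs once even though the pipeline was started twice.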

Multi-agent networks

Mastra supports agent networks where a router agent delegates subtasks to specialised sub-agents. Each sub-agent can have its own model, tools, and memory scope. The router receives all sub-agent outputs and synthesises a final answer. This pattern scales to complex tasks — competitive research, code review pipelines, multi-language customer support — without requiring a monolithic agent prompt.
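
The delegation pattern can be sketched without any framework. In the example below, keyword matching stands in for the LLM-driven routing decision, and string concatenation stands in for synthesis. SubAgent, pickAgents, and routeTask are hypothetical names, not Mastra's agent network API.

```typescript
// Conceptual sketch: keyword routing stands in for the LLM-driven decision
// a real router agent would make. All names are hypothetical.

type SubAgent = {
  name: string;
  handles: string[]; // keywords this specialist covers
  respond: (task: string) => string;
};

function pickAgents(task: string, agents: SubAgent[]): SubAgent[] {
  const lower = task.toLowerCase();
  return agents.filter((agent) => agent.handles.some((keyword) => lower.indexOf(keyword) !== -1));
}

function routeTask(task: string, agents: SubAgent[]): string {
  const chosen = pickAgents(task, agents);
  if (chosen.length === 0) return 'No specialist available for this task.';
  // "Synthesis" here is plain concatenation; a real router would ask a model to merge the outputs.
  return chosen.map((agent) => `[${agent.name}] ${agent.respond(task)}`).join('\n');
}

const specialists: SubAgent[] = [
  { name: 'pricing', handles: ['price', 'cost'], respond: (task) => `pricing notes for "${task}"` },
  { name: 'security', handles: ['security', 'compliance'], respond: (task) => `security notes for "${task}"` },
  { name: 'support', handles: ['refund'], respond: (task) => `support notes for "${task}"` },
];

const routedAnswer = routeTask('Compare price and security of the competitor', specialists);
console.log(routedAnswer);
```

Only the pricing and security specialists are invoked for this task, which is the point of the pattern: each sub-agent keeps a narrow prompt and the router decides who participates.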

RAG with Mastra

Mastra ships RAG primitives for chunking documents, embedding them into a vector store, and querying by semantic similarity. Supported vector backends include Postgres with pgvector, Pinecone, Qdrant, and others via community adapters. The embedding step accepts any AI SDK-compatible embedding model, so you can use OpenAI text-embedding-3-small, Cohere, or a local model.
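
Chunking is the first stage of that pipeline. The sketch below shows one common strategy, fixed-size character chunks with overlap, in plain TypeScript. It is not Mastra's chunker: chunkText and TextChunk are hypothetical names, and the sizes are illustrative rather than recommended values.

```typescript
// Conceptual sketch: fixed-size character chunks with overlap.
// Not Mastra's chunker; names and sizes are illustrative.

type TextChunk = { index: number; start: number; text: string };

function chunkText(source: string, size: number, overlap: number): TextChunk[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error('size must be positive and overlap must be smaller than size');
  }
  const chunks: TextChunk[] = [];
  const stride = size - overlap; // how far each chunk start advances
  for (let start = 0; start < source.length; start += stride) {
    chunks.push({ index: chunks.length, start, text: source.slice(start, start + size) });
    if (start + size >= source.length) break; // this chunk already reached the end
  }
  return chunks;
}

const sampleText = 'abcdefghijklmnopqrstuvwxyz';
const textChunks = chunkText(sampleText, 10, 3);
console.log(textChunks.map((c) => c.text));
// → [ 'abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz' ]
```

Each chunk repeats the last three characters of the previous one, so a sentence that straddles a boundary still appears intact in at least one chunk before embedding.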

Observability and tracing

Every Mastra operation emits OpenTelemetry traces. Connect a compatible backend — Langfuse, Braintrust, Jaeger, or your existing OTEL collector — and you get end-to-end visibility: which model was called, which tools fired, how long each step took, and what the input/output was at every stage. For teams already on an OTEL stack, zero additional instrumentation code is required.

The model router

Mastra maintains a model index at mastra.ai listing over 3,300 models from 94 providers as of March 2026. Every model is accessible through the same Vercel AI SDK LanguageModel interface, so switching from gpt-4o to claude-opus-4 or gemini-2.0-flash is a one-line change with no other code modification.

FAQ

Does Mastra replace the Vercel AI SDK?

No — Mastra is built on top of the Vercel AI SDK and depends on it. It uses the AI SDK for all LLM calls, model routing, and streaming. The @mastra/ai-sdk package re-exports streaming helpers so the same agent that runs in a Mastra workflow can also stream to a Next.js useChat hook.

Is Mastra free and open source?

The core framework (@mastra/core, @mastra/memory, workflow engine, evals) is open source under the Apache-2.0 licence and free to self-host. Mastra also offers a managed cloud platform with hosted workflow execution and an eval dashboard, which is a separate commercial product.

What storage backends does Mastra memory support?

Mastra ships official adapters for LibSQL (SQLite-compatible, great for local dev), Postgres, and Upstash Redis. The storage interface is pluggable, so community adapters exist for other databases. All three built-in options support both agent memory and workflow snapshot persistence.

Can I use Mastra with models other than OpenAI?

Yes. Because Mastra delegates to the Vercel AI SDK, any model provider the AI SDK supports — Anthropic, Google, Mistral, Cohere, xAI, Groq, Amazon Bedrock, and local models via Ollama — works out of the box. You simply pass a different model object to the model field in your agent or step.

How do Mastra workflows compare to plain async TypeScript functions?

Plain async functions have no persistence: if the process restarts mid-execution, you lose all state. Mastra workflows persist step outputs as snapshots in the configured storage backend, so a run can pick up from its last saved step after a deploy or crash instead of starting over. They also give you type-validated data at step boundaries, structured traces, and suspend/resume for human-in-the-loop flows, none of which you get for free with raw async/await.

What is Mastra Studio?

Mastra Studio is a local web UI (runs on localhost:4111 by default) that starts alongside your dev server. It lets you chat with registered agents, inspect memory threads, step through workflow runs visually, view OTEL traces, and run evals interactively. It requires no external service — everything runs in-process during development.

Further reading