AI/TLDR

TextGrad

Optimize prompts and text variables with backpropagation through LLM feedback

Overview

TextGrad is a Python framework that brings automatic "differentiation" to text. Instead of numeric gradients, it backpropagates feedback written by an LLM, so you can improve prompts, answers, code, and other text variables through a loop that mirrors how neural networks are trained.

The API deliberately follows PyTorch. You wrap text in a Variable, define a loss with TextLoss, call loss.backward() to collect feedback, and then call step() on a TGD (Textual Gradient Descent) optimizer to apply it. If you know PyTorch, most of the concepts carry over directly.
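To make the analogy concrete, here is a minimal plain-Python sketch of the idea, with no LLM and no TextGrad import. Every name in it (ToyVariable, toy_backward, toy_step, stub_critic, stub_reviser) is hypothetical and exists only for illustration; it is not how TextGrad is implemented. The two stub functions stand in for the LLM calls that would write the critique and the revision.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ToyVariable:
    """Hypothetical stand-in for a text variable; not TextGrad's Variable."""
    value: str
    role_description: str
    requires_grad: bool = True
    feedback: List[str] = field(default_factory=list)  # plays the role of a gradient


def toy_backward(variable: ToyVariable, critic: Callable[[str], str]) -> None:
    """'Backward pass': ask a critic for feedback and attach it to the variable."""
    if variable.requires_grad:
        variable.feedback.append(critic(variable.value))


def toy_step(variable: ToyVariable, reviser: Callable[[str, List[str]], str]) -> None:
    """'Optimizer step': rewrite the text using the feedback, then clear it."""
    if variable.feedback:
        variable.value = reviser(variable.value, variable.feedback)
        variable.feedback.clear()


# Deterministic stand-ins for the two LLM calls (critique and rewrite).
def stub_critic(text: str) -> str:
    return "End the sentence with a period." if not text.endswith(".") else "Looks fine."


def stub_reviser(text: str, feedback: List[str]) -> str:
    return text + "." if any("period" in note for note in feedback) else text


draft = ToyVariable("Shirts dry in parallel", role_description="draft answer")
toy_backward(draft, stub_critic)   # feedback is stored, text is unchanged
toy_step(draft, stub_reviser)      # text is revised, feedback is cleared
print(draft.value)  # Shirts dry in parallel.
```

The point of the sketch is the split of responsibilities: the backward pass only gathers criticism, and the optimizer step is the only place where the text changes.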

It fits the prompt-programming space as an optimization layer: rather than hand-tuning prompts, you let the framework critique and revise text against an evaluation instruction. It works with many model providers through a litellm-based engine, including OpenAI, Anthropic, Gemini, Bedrock, and Together.

What it does

  • PyTorch-style API: Variable, TextLoss, backward(), and a TGD (Textual Gradient Descent) optimizer
  • Backpropagation through natural-language feedback produced by an LLM
  • Optimizes many kinds of text variables — prompts, answers, code, and solutions
  • BlackboxLLM wrapper for running a model forward pass on a Variable
  • Experimental litellm engine supporting OpenAI, Anthropic, Gemini, Bedrock, Together, and more
  • Optional response caching and multimodal (image) input via the litellm engine

Getting started

Install the package, set your model API key, then run a short optimize-the-answer loop that mirrors PyTorch.

Install TextGrad

Install from PyPI. You'll also need an API key (for example OpenAI or Anthropic) set in your environment.

bash
pip install textgrad
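Before making any model calls, it can help to confirm the key is actually visible to Python. The sketch below assumes the OpenAI convention of an OPENAI_API_KEY environment variable; other providers use their own variable names, and the helper missing_api_keys is a hypothetical name, not part of TextGrad.

```python
import os


def missing_api_keys(required, env=None):
    """Return the names in `required` that are unset or empty in the environment."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Assumption: the OpenAI key is the one needed for the examples on this page.
missing = missing_api_keys(["OPENAI_API_KEY"])
if missing:
    print("Set these before running TextGrad:", ", ".join(missing))
```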

Set the backward engine and run a forward pass

Pick a model as the backward engine, then wrap your question in a Variable and get an initial answer from a BlackboxLLM.

python
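import textgrad as tg

# The backward engine is the LLM that writes the feedback ("textual gradients").
tg.set_backward_engine("gpt-4o", override=True)

# The forward model that produces the initial answer.
model = tg.BlackboxLLM("gpt-4o")
question_string = ("If it takes 1 hour to dry 25 shirts under the sun, "
                   "how long will it take to dry 30 shirts under the sun? "
                   "Reason step by step")

# The question is a fixed input, so it does not need feedback.
question = tg.Variable(question_string,
                       role_description="question to the LLM",
                       requires_grad=False)

# Forward pass: the returned answer is itself a Variable that can be optimized.
answer = model(question)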

Define a loss and optimize the answer

Write an evaluation instruction as a TextLoss, then run the backward pass and optimizer step — the same syntax as PyTorch — to revise the answer.

python
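# Describe what the answer is for, so the feedback is aimed at it.
answer.set_role_description("concise and accurate answer to the question")

# TGD (Textual Gradient Descent) updates the listed variables from LLM feedback.
optimizer = tg.TGD(parameters=[answer])
evaluation_instruction = (f"Here's a question: {question_string}. "
                          "Evaluate any given answer to this question, "
                          "be smart, logical, and very critical. "
                          "Just provide concise feedback.")

# TextLoss turns a natural-language instruction into a loss function.
loss_fn = tg.TextLoss(evaluation_instruction)

loss = loss_fn(answer)  # evaluate the current answer
loss.backward()         # collect textual feedback for the answer
optimizer.step()        # rewrite the answer using that feedback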

Commands and code are distilled from the project's own documentation — always check the official repo for the latest.

When to use it

  • Automatically refine an LLM's reasoning or answer when a first response is wrong
  • Tune system prompts and instructions instead of editing them by hand
  • Improve generated code or problem solutions through iterative LLM critique
  • Optimize text variables in research workflows across multiple model providers via litellm

How TextGrad compares

TextGrad alongside other open-source prompt programming tools AI/TLDR tracks, ranked by GitHub stars.

| Tool | Stars | What it does |
|---|---|---|
| DSPy | ★ 35.8k | A Stanford framework for programming language models with composable modules and automatic prompt optimization instead of hand-written prompts. |
| ell | ★ 5.9k | A Python library that treats prompts as versioned functions, with tooling to track, visualize, and iterate on them as code. |
| GEPA | ★ 5.5k | A reflective, evolutionary optimizer that improves prompts and other text components of a system using language-model feedback. |
| LMQL | ★ 4.2k | A query language for LLMs that mixes Python control flow with prompts and constraints to script multi-step generation. |
| AdalFlow | ★ 4.2k | A PyTorch-like library for building and auto-optimizing LLM pipelines, tuning prompts across the components of a task. |
| TextGrad | ★ 3.6k | Optimize prompts and text variables with backpropagation through LLM feedback |
| Mirascope | ★ 1.5k | A lightweight Python toolkit for writing LLM calls as typed functions with prompt templates, chaining, and a single interface across providers. |