Alignment Handbook

Hugging Face recipes and scripts for the full SFT-then-preference-alignment pipeline

github.com/huggingface/alignment-handbook ★ 5.6k

Overview

The Alignment Handbook is a collection of training recipes and scripts from Hugging Face for building aligned chat models. It walks through the whole pipeline used to produce models like Zephyr-7B: continued pretraining, supervised fine-tuning (SFT) for chat, and preference alignment with DPO or ORPO.

It is aimed at ML engineers and researchers who want to reproduce known aligned models or train their own on custom datasets. Each training run is described by a single YAML recipe file that holds all the parameters, so you change configs rather than rewriting code.
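To make the recipe idea concrete, here is a minimal sketch of what such a file might contain, written from the shell so it can be tried without the repository. The keys are common Hugging Face training arguments, but the values, the output path, and the `recipe_get` helper are illustrative assumptions rather than the contents of any shipped recipe.

```shell
# Write a small illustrative recipe to a temporary file.
# All values below are hypothetical examples, not a shipped configuration.
recipe="$(mktemp)"
cat > "$recipe" <<'EOF'
model_name_or_path: mistralai/Mistral-7B-v0.1
learning_rate: 2.0e-05
num_train_epochs: 1
per_device_train_batch_size: 16
gradient_accumulation_steps: 1
output_dir: data/my-sft-model
EOF

# Read one top-level scalar from a flat "key: value" recipe.
# Hypothetical helper: it does not handle nested YAML or quoted values.
recipe_get() {
  sed -n "s/^$2:[[:space:]]*//p" "$1"
}

recipe_get "$recipe" learning_rate   # prints 2.0e-05
```

Changing a run then means editing a line in this file rather than touching the training script.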

Within the fine-tuning and RLHF/alignment space, it sits one level above raw training libraries: the scripts are thin wrappers over the Hugging Face stack, and they support full-weight distributed training with DeepSpeed ZeRO-3 as well as parameter-efficient LoRA/QLoRA.

What it does

Covers the full pipeline: continued pretraining, SFT, reward modeling, rejection sampling, DPO, and ORPO
Reproducible recipes as YAML files (e.g. Zephyr-7B, SmolLM, StarChat2) that hold every parameter for a run
Ready-made scripts for SFT (sft.py) and preference alignment (dpo.py) launched with accelerate
Supports full-weight distributed training via DeepSpeed ZeRO-3, plus LoRA/QLoRA for parameter-efficient tuning
Instructions and formatting guidance for fine-tuning chat models on your own datasets
Built on the Hugging Face ecosystem with documented dataset and model collections

Getting started

Set up a Python environment with the pinned dependencies, then launch a training run from one of the provided recipe YAML files. The example below reproduces the Zephyr-7B-beta pipeline.

Create a virtual environment and install dependencies

Use uv to create an environment and install the pinned PyTorch build. Then clone the repository, install the handbook package from its root, and add Flash Attention 2.

bash

uv venv handbook --python 3.11 && source handbook/bin/activate && uv pip install --upgrade pip
uv pip install torch==2.6.0 --index-url https://download.pytorch.org/whl/cu126
git clone https://github.com/huggingface/alignment-handbook.git
cd alignment-handbook
uv pip install .
uv pip install "flash-attn==2.7.4.post1" --no-build-isolation
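Before moving on, it can help to confirm that the command-line tools used in the remaining steps are on your PATH. The sketch below is a generic shell check, and `missing_tools` is a hypothetical helper name, not part of the handbook.

```shell
# Print the name of every tool that cannot be found on PATH.
# Prints nothing when all of the named tools are available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tools invoked by the commands in this guide.
missing_tools uv accelerate huggingface-cli
```

An empty result suggests the environment is active and the install steps completed. Any name that is printed points at a step to revisit.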

Log in to Hugging Face

Authenticate so you can pull base models and datasets and push your trained model.

bash

huggingface-cli login

Run supervised fine-tuning (SFT)

Launch the SFT script with a recipe config, using the ZeRO-3 accelerate config for distributed training.

bash

ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/zero3.yaml scripts/sft.py --config recipes/zephyr-7b-beta/sft/config_full.yaml
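Because every parameter lives in the recipe, a quick experiment often needs only one or two values changed. The sketch below assumes the script accepts recipe overrides as extra `--key=value` arguments after `--config`, so check the repository's script documentation before relying on that. The `sft_cmd` helper is hypothetical, and it prints the command rather than running it.

```shell
# Print an SFT launch command with optional --key=value overrides appended.
# Assumption: the script accepts recipe overrides as extra command-line arguments.
sft_cmd() (
  recipe="$1"
  shift
  cmd="ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/zero3.yaml scripts/sft.py --config $recipe"
  for override in "$@"; do
    cmd="$cmd $override"
  done
  printf '%s\n' "$cmd"
)

# Dry run: show the command for review instead of starting training.
# The override values here are illustrative, not recommended settings.
sft_cmd recipes/zephyr-7b-beta/sft/config_full.yaml \
  --per_device_train_batch_size=4 --gradient_accumulation_steps=8
```

Printing first makes it easy to check the exact invocation before committing GPU time to it.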

Align with DPO

Take the SFT model and align it to preferences using direct preference optimization.

bash

ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/zero3.yaml scripts/dpo.py --config recipes/zephyr-7b-beta/dpo/config_full.yaml
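The two stages depend on each other: DPO should start from the model that SFT produced. The sketch below prints both commands in order and wires the SFT output directory into the DPO stage. It assumes the scripts accept `--output_dir` and `--model_name_or_path` as command-line overrides, and `SFT_OUT` is a hypothetical local path. The shipped DPO recipe may instead point at a published SFT checkpoint.

```shell
# Hypothetical local directory where the SFT stage saves its model.
SFT_OUT="${SFT_OUT:-data/zephyr-7b-sft-full}"
ACC_CFG="recipes/accelerate_configs/zero3.yaml"

# Print the SFT and DPO launch commands, one per line, in execution order.
# Assumption: both scripts accept --key=value overrides after --config.
pipeline_cmds() {
  printf 'accelerate launch --config_file %s scripts/sft.py --config recipes/zephyr-7b-beta/sft/config_full.yaml --output_dir=%s\n' "$ACC_CFG" "$SFT_OUT"
  printf 'accelerate launch --config_file %s scripts/dpo.py --config recipes/zephyr-7b-beta/dpo/config_full.yaml --model_name_or_path=%s\n' "$ACC_CFG" "$SFT_OUT"
}

# Dry run: review both commands before launching anything.
pipeline_cmds
```

To run the stages rather than print them, one option is `pipeline_cmds > run.sh && sh -e run.sh`, which stops before DPO if the SFT stage fails.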

Commands and code are distilled from the project's own documentation — always check the official repo for the latest.

When to use it

Reproduce a published aligned chat model such as Zephyr-7B from its recipe
Fine-tune and align an open base model on your own instruction and preference datasets
Compare alignment methods (DPO vs. KTO vs. IPO, or ORPO) using the included recipes
Adapt a model to a new language or domain through continued pretraining, then SFT and DPO

How Alignment Handbook compares

Alignment Handbook alongside the other open-source RLHF and alignment tools that AI/TLDR tracks, ranked by GitHub stars.

Tool	Stars	What it does
Open-R1	★ 26.3k	An open reproduction of the DeepSeek-R1 reasoning pipeline, with scripts for GRPO training and reasoning-data generation.
verl	★ 22.1k	Volcano Engine's RL post-training framework (HybridFlow) for building GRPO, PPO, and other RL pipelines on top of FSDP, Megatron, and vLLM.
TRL	★ 18.7k	Hugging Face's post-training library with trainers for SFT, reward modeling, DPO, PPO, and GRPO to align language models with preferences.
Agent Lightning	★ 17.3k	An open-source trainer from Microsoft that improves AI agents built with any framework using reinforcement learning, prompt optimization, and supervised fine-tuning.
ART	★ 10.1k	OpenPipe's Agent Reinforcement Trainer for post-training LLM agents on multi-step tasks using GRPO and rule- or judge-based rewards.
OpenRLHF	★ 9.7k	A Ray- and vLLM-based RLHF framework that scales PPO, GRPO, and REINFORCE++ training to models with 70B+ parameters.
Alignment Handbook	★ 5.6k	Hugging Face recipes and scripts for the full SFT-then-preference-alignment pipeline.
Verifiers	★ 4.2k	A library for defining verifiable-reward environments and running reinforcement-learning fine-tuning of LLMs against those rewards.
