OpenRLHF

Scalable RLHF training for large language models on Ray and vLLM

github.com/OpenRLHF/OpenRLHF | ★ 9.7k | openrlhf.readthedocs.io

Overview

OpenRLHF is an open-source framework for reinforcement learning from human feedback (RLHF), the process of aligning a language model with human preferences after pretraining. It is built on Ray for distributed scheduling and vLLM for fast generation, with DeepSpeed handling training. This split lets it place the actor, reward, reference, and critic models on separate GPUs and scale to models with 70B+ parameters.

It is aimed at ML engineers and researchers who want to run the full RLHF pipeline (supervised fine-tuning, reward modeling, and reinforcement learning) without building the distributed plumbing themselves. It ships command-line entry points and example scripts for common setups, so you can start from a working recipe and adjust it.

Within the RLHF and alignment category, OpenRLHF focuses on the training infrastructure rather than on datasets or evaluation. It supports several RL algorithms (PPO, GRPO, REINFORCE++, RLOO) and includes a hybrid engine mode that lets models and vLLM engines share GPU resources to reduce idle time on limited hardware.

What it does

Built on a Ray + vLLM + DeepSpeed distributed stack that separates actor, reward, reference, and critic models across GPUs
Supports multiple RL algorithms: PPO, GRPO, REINFORCE++, and RLOO
Scales RLHF training to models with 70B+ parameters
Hybrid engine scheduling lets models and vLLM engines share GPUs to cut idle time
Covers the full pipeline: supervised fine-tuning, reward modeling, and RL training
Async RLHF and agent-based RLHF via --train.async_enable and --train.agent_func_path, plus optional LoRA

Getting started

Install OpenRLHF with pip (vLLM is an optional extra), then launch one of its CLI training entry points such as supervised fine-tuning.

Install with pip

Install the base package, or add the vllm extra to get the vLLM-accelerated generation used during RL training.


pip install openrlhf            # Basic
pip install "openrlhf[vllm]"    # + vLLM (quoted so shells such as zsh do not glob the brackets)

Or install from source

Clone the repository and install in editable mode if you want to modify the code or run the example scripts.


git clone https://github.com/OpenRLHF/OpenRLHF.git
cd OpenRLHF
pip install -e .
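
After either install path, a quick check can confirm that the packages are visible to your Python interpreter. The sketch below is only an illustration: `have_module` is a hypothetical helper, not part of OpenRLHF, and it assumes `python3` on your PATH is the same interpreter that pip installed into.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of OpenRLHF): succeeds if Python can locate
# the named module. find_spec looks a top-level module up without importing it.
have_module() {
  python3 -c 'import importlib.util, sys
sys.exit(0 if importlib.util.find_spec(sys.argv[1]) else 1)' "$1" 2>/dev/null
}

# vllm is only expected if you installed the optional extra.
for mod in openrlhf vllm; do
  if have_module "$mod"; then
    echo "$mod: found"
  else
    echo "$mod: missing"
  fi
done
```

If `openrlhf` is reported as missing, the usual cause is that pip installed into a different environment than the one `python3` resolves to.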

Run supervised fine-tuning

Launch the SFT entry point with DeepSpeed. This example fine-tunes Llama 3 8B on the OpenOrca dataset.


deepspeed --module openrlhf.cli.train_sft \
   --data.max_len 4096 \
   --data.dataset Open-Orca/OpenOrca \
   --data.input_key question \
   --data.output_key response \
   --train.batch_size 256 \
   --train.micro_batch_size 2 \
   --actor.model_name_or_path meta-llama/Meta-Llama-3-8B \
   --ckpt.output_dir ./checkpoint/llama3-8b-sft \
   --ds.zero_stage 2 \
   --train.max_epochs 1 \
   --adam.lr 5e-6
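
The two batch-size flags interact with the number of GPUs. As a rough sketch, assuming `--train.batch_size` is the global batch across all GPUs and `--train.micro_batch_size` is per GPU (the DeepSpeed convention, where global batch = micro batch × accumulation steps × GPUs), the implied number of gradient-accumulation steps can be worked out as follows. `accum_steps` is a hypothetical helper, the GPU count of 8 is an assumption for illustration, and other parallelism settings may change the relation.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of OpenRLHF or DeepSpeed).
# Usage: accum_steps GLOBAL_BATCH MICRO_BATCH_PER_GPU NUM_GPUS
accum_steps() {
  local global_batch=$1 micro_batch=$2 num_gpus=$3
  # Samples processed by all GPUs in one forward/backward pass.
  local per_step=$(( micro_batch * num_gpus ))
  if (( per_step <= 0 || global_batch % per_step != 0 )); then
    echo "global batch $global_batch is not a multiple of $per_step" >&2
    return 1
  fi
  echo $(( global_batch / per_step ))
}

# 256 and 2 come from the SFT command above; 8 GPUs is an assumption.
accum_steps 256 2 8   # prints 16
```

With fewer GPUs the same global batch needs more accumulation steps (for example, 32 steps on 4 GPUs), which keeps the optimizer behavior comparable at the cost of longer wall-clock time per update.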

Commands and code are distilled from the project's own documentation; always check the official repo for the latest version.

When to use it

Aligning an open-weight LLM with human preferences using PPO, GRPO, or REINFORCE++
Running RLHF training on large models (70B+ parameters) across multiple GPUs without writing the distributed scheduling yourself
Reproducing reasoning-model RL recipes (e.g. DeepSeek-R1-style training) from the provided example scripts
Training reward models and running supervised fine-tuning as the first stages of an alignment pipeline

How OpenRLHF compares

OpenRLHF alongside the other open-source RLHF and alignment tools that AI/TLDR tracks, ranked by GitHub stars.

Tool	Stars	What it does
Open-R1	★ 26.3k	An open reproduction of the DeepSeek-R1 reasoning pipeline, with scripts for GRPO training and reasoning-data generation.
verl	★ 22.1k	Volcano Engine's RL post-training framework (HybridFlow) for building GRPO, PPO, and other RL pipelines on top of FSDP, Megatron, and vLLM.
TRL	★ 18.7k	Hugging Face's post-training library with trainers for SFT, reward modeling, DPO, PPO, and GRPO to align language models with preferences.
Agent Lightning	★ 17.3k	An open-source trainer from Microsoft that improves AI agents built with any framework using reinforcement learning, prompt optimization, and supervised fine-tuning.
ART	★ 10.1k	OpenPipe's Agent Reinforcement Trainer for post-training LLM agents on multi-step tasks using GRPO and rule- or judge-based rewards.
OpenRLHF	★ 9.7k	Scalable RLHF training for large language models on Ray and vLLM
Alignment Handbook	★ 5.6k	A set of recipes and scripts from Hugging Face showing how to run the full SFT-then-preference-alignment pipeline used to build aligned chat models.
Verifiers	★ 4.2k	A library for defining verifiable-reward environments and running reinforcement-learning fine-tuning of LLMs against those rewards.
