AI/TLDR

LLaMA-Factory

Fine-tune 100+ open LLMs and multimodal models with one CLI or a web UI

Overview

LLaMA-Factory is an open-source training suite that lets you pre-train, supervised fine-tune, and run RLHF on large language models from one place. It supports 100+ text and multimodal model families, including LLaMA, Qwen3, Mistral, Mixtral-MoE, DeepSeek, Gemma, GLM, Phi, and LLaVA, and you drive everything through a command-line tool or a Gradio web UI called LLaMA Board.

It is aimed at ML engineers and researchers who want to adapt an existing open model to their own data without writing a custom training loop. You describe a training run in a YAML config and pass it to the CLI, or fill in the fields in the web UI, so most common fine-tuning recipes work with little or no code.

As a fine-tuning framework, it bundles the methods you would otherwise wire together yourself: full and freeze tuning, LoRA and 2-8 bit QLoRA, reward modeling, PPO, DPO, KTO, and ORPO, plus speed-ups like FlashAttention-2, Unsloth, and Liger Kernel. After training it can serve the model through an OpenAI-style API backed by a vLLM or SGLang worker.

What it does

  • Supports 100+ models, including LLaMA, Qwen3, Qwen3-VL, Mistral, Mixtral-MoE, DeepSeek, Gemma, GLM, Phi, and LLaVA
  • Covers (continuous) pre-training, (multimodal) supervised fine-tuning, reward modeling, PPO, DPO, KTO, and ORPO
  • Scales from 16-bit full tuning and freeze-tuning to LoRA and 2/3/4/5/6/8-bit QLoRA via AQLM, AWQ, GPTQ, and others
  • Includes training speed-ups such as FlashAttention-2, Unsloth, Liger Kernel, RoPE scaling, and NEFTune
  • Zero-code workflows via the llamafactory-cli command line and the LLaMA Board web UI
  • Serves trained models through an OpenAI-style API with a vLLM or SGLang worker, plus experiment tracking via TensorBoard, W&B, MLflow, and SwanLab

Getting started

Install LLaMA-Factory from source, then run a LoRA fine-tuning, chat, and export cycle using the bundled example configs.

Install from source

Clone the repository and install it in editable mode along with the metrics requirements.

bashbash
git clone --depth 1 https://github.com/hiyouga/LlamaFactory.git
cd LlamaFactory
pip install -e .
pip install -r requirements/metrics.txt

Run LoRA fine-tuning

Train a LoRA adapter using a provided example YAML config. Each config sets the model, dataset, and training arguments.

bashbash
llamafactory-cli train examples/train_lora/qwen3_lora_sft.yaml

Chat with the fine-tuned model and export it

Load the adapter to chat in the terminal, then merge it into a standalone model.

bashbash
llamafactory-cli chat examples/inference/qwen3_lora_sft.yaml
llamafactory-cli export examples/merge_lora/qwen3_lora_sft.yaml

Use the web UI instead

Prefer a no-code workflow? Launch LLaMA Board to configure and run training in the browser.

bashbash
llamafactory-cli webui

Commands and code are distilled from the project's own documentation — always check the official repo for the latest.

When to use it

  • Adapt an open base model like Qwen3 or LLaMA to a domain dataset with LoRA or QLoRA on limited GPU memory
  • Align a model to human preferences using DPO, PPO, KTO, or ORPO without building a custom RLHF pipeline
  • Fine-tune multimodal models for image understanding, visual grounding, or video and audio tasks
  • Let non-experts run and compare training jobs through the LLaMA Board web UI, then serve results via an OpenAI-style API

How LLaMA-Factory compares

LLaMA-Factory alongside other open-source fine-tuning frameworks tools AI/TLDR tracks, ranked by GitHub stars.

ToolStarsWhat it does
LLaMA-Factory★ 72.3kFine-tune 100+ open LLMs and multimodal models with one CLI or a web UI
Unsloth★ 66.9kA library that speeds up LoRA and QLoRA fine-tuning while cutting memory use, aimed at training models on a single GPU.
PEFT★ 21.3kHugging Face's library of parameter-efficient fine-tuning methods such as LoRA, DoRA, and prompt tuning that train small adapters instead of full models.
FinGPT★ 20.5kFinGPT is an open-source project of financial LLMs, fine-tuned with LoRA on news and tweet data for tasks like sentiment analysis, relation extraction, and stock-move forecasting.
ms-swift★ 14.6kModelScope's framework for fine-tuning and deploying 600+ LLMs and 300+ multimodal models, supporting PEFT and full-parameter SFT, DPO, and GRPO.
LitGPT★ 13.4kAn open-source toolkit from Lightning AI to pretrain, finetune, and serve 20+ large language models, each written from scratch for speed and full control.
Axolotl★ 12.1kA config-driven tool for fine-tuning and post-training open LLMs that supports SFT, LoRA/QLoRA, DPO, GRPO, and multi-GPU training across many model families.
Ludwig★ 11.7kLudwig is a low-code framework that lets you train, fine-tune, and deploy LLMs, multimodal, and tabular models using a YAML config instead of boilerplate Python.