AI/TLDR

Fine-Tuning & Model Customization

Changing the weights — SFT, LoRA, QLoRA, RLHF, DPO, distillation, and when not to bother.

Fine-Tuning Fundamentals

What fine-tuning can (and can't) change, and how to prepare for it.

LoRA & Efficient Methods

Parameter-efficient tuning that fits on a single GPU.

RLHF & Preference Training

How base models learn what humans want: RLHF, DPO, reward models, GRPO.

Distillation & Training Tools

Smaller models from bigger ones, synthetic data, and the toolkits that run the job.