Fine-tuning is the process of adapting a pre-trained model to a specific task or use case by continuing to train it on a smaller, targeted dataset. Rather than building a model from scratch, fine-tuning uses a foundation model's accumulated knowledge as a starting point, making the process faster, cheaper, and far less data-intensive.
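The core idea can be shown in a minimal sketch. Here a small random projection stands in for a pre-trained feature extractor (a hypothetical stand-in, not a real model); its weights are frozen, and only a small task-specific head is trained on a tiny synthetic dataset:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pre-trained model: a fixed projection we never update.
W_frozen = rng.normal(size=(16, 4))

def extract_features(x):
    # Frozen forward pass of the "pre-trained" backbone.
    return np.tanh(x @ W_frozen)

# Tiny synthetic regression task (the targeted fine-tuning dataset).
X = rng.normal(size=(64, 16))
y = rng.normal(size=(64, 1))

# Trainable head: the only parameters updated during "fine-tuning" here.
w_head = np.zeros((4, 1))

lr = 0.1
losses = []
for _ in range(100):
    feats = extract_features(X)
    err = feats @ w_head - y
    losses.append(float(np.mean(err ** 2)))
    grad = feats.T @ err / len(X)   # gradient w.r.t. the head only
    w_head -= lr * grad

print(losses[0] > losses[-1])   # the head adapts; the backbone never changes
```

The same pattern scales up to real models: load pre-trained weights, mark most of them as non-trainable, and optimize only the task-specific parameters.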
There are several approaches to fine-tuning.
- Full fine-tuning updates all model weights but is computationally expensive.
- Additive methods like adapters freeze the original weights entirely and instead train small, task-specific layers inserted into the network. These methods often achieve comparable performance while training only a fraction of the parameters.
- LoRA (Low-Rank Adaptation) takes a different approach, freezing the original weights and instead optimizing a compact low-rank representation of the updates to those weights, dramatically reducing memory requirements and training time.
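A minimal sketch of the LoRA idea, with illustrative dimensions chosen here (the `d = 512`, `r = 8` values are assumptions, not from any particular model): the frozen weight matrix `W` is never updated, and the update is parameterized by two small low-rank factors.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 512, 8   # model dimension and low-rank bottleneck (illustrative values)

# Frozen pre-trained weight matrix: never updated during LoRA training.
W = rng.normal(size=(d, d))

# Trainable low-rank factors. B starts at zero so the adapted model
# initially behaves exactly like the pre-trained one: W + B @ A == W.
A = rng.normal(scale=0.01, size=(r, d))
B = np.zeros((d, r))

def adapted_forward(x):
    # Equivalent to x @ (W + B @ A).T without materializing the full update.
    return x @ W.T + (x @ A.T) @ B.T

full_params = W.size            # what full fine-tuning would update
lora_params = A.size + B.size   # what LoRA actually trains
print(full_params, lora_params)        # 262144 vs 8192
print(lora_params / full_params)       # 0.03125, ~3% of the parameters
```

The memory savings come directly from the parameter count: the optimizer only needs gradients and state for `A` and `B`, and the frozen `W` can be merged with `B @ A` after training for inference at no extra cost.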
Fine-tuning differs from pre-training, which starts from randomly initialized parameters and learns entirely from a large dataset over many training iterations.
Fine-tuning is especially useful for deep learning models with millions or even billions of parameters, such as large language models (LLMs) used in natural language processing (NLP) and highly complex architectures for tasks like image recognition and medical diagnosis.