AI Glossary: Training, Fine-Tuning & Data
Understanding how models are built and adapted helps designers recognize capability sources, appreciate customization options, and understand training-related behaviors and limitations.
Pre-Training Concepts
Pre-training
The initial, computationally intensive phase of training on massive unlabeled text. During pre-training, models learn language patterns, grammar, factual knowledge, and reasoning capabilities by processing billions to trillions of tokens. This creates "foundation models" later adapted through fine-tuning.
Self-Supervised Learning
Learning where models generate their own training labels from data structure rather than human annotation. For LLMs, this involves predicting masked or next tokens—the model learns by checking predictions against actual text. This enables training on vast unlabeled internet text.
Next-Token Prediction
The core training objective for decoder models: predict the most probable next token given all preceding tokens. Despite its simplicity, this objective underlies pre-training, fine-tuning, and inference for all causal language models.
Reference: Radford, A. et al., "Language Models are Unsupervised Multitask Learners" (GPT-2), OpenAI 2019
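As a rough illustration of the objective, the sketch below computes the average cross-entropy of predicting each following token. The function names and the toy three-token vocabulary are assumptions for illustration; real training code operates on batched tensors.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def next_token_loss(logits_per_position, token_ids):
    """Average cross-entropy where position t predicts token t+1."""
    losses = []
    for logits, target in zip(logits_per_position, token_ids[1:]):
        probs = softmax(logits)
        losses.append(-math.log(probs[target]))
    return sum(losses) / len(losses)

# Toy sequence [0, 2, 1] over a 3-token vocabulary, with uniform
# (uninformed) logits at each of the two predicting positions.
logits = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
loss = next_token_loss(logits, [0, 2, 1])
```

With uniform logits the loss equals ln(3), the cost of guessing among three tokens; training pushes it lower by raising the probability of the token that actually follows.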
Masked Language Modeling (MLM)
Self-supervised technique used by encoder models like BERT: randomly mask ~15% of tokens, then predict them from surrounding bidirectional context. Effective for understanding tasks rather than generation.
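The masking step can be sketched as follows. This is a simplified version under stated assumptions: every selected token is replaced with a `[MASK]` placeholder, whereas BERT also leaves some selected tokens unchanged or swaps in random ones. The function name and the `seed` parameter are hypothetical.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, mask_rate=0.15, seed=0):
    """Return (masked_tokens, labels) where labels maps position -> original token."""
    rng = random.Random(seed)
    n_mask = max(1, round(len(tokens) * mask_rate))
    positions = sorted(rng.sample(range(len(tokens)), n_mask))
    masked = list(tokens)
    labels = {}
    for pos in positions:
        labels[pos] = tokens[pos]  # what the model must predict
        masked[pos] = MASK         # what the model actually sees
    return masked, labels
```

The model is then trained to recover each entry in `labels` from the unmasked tokens on both sides of it.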
Training Corpus
The complete dataset used for training, typically containing billions to trillions of tokens from diverse sources. Corpus quality, diversity, and size directly impact model capabilities. Modern corpora undergo extensive filtering for quality, deduplication, and safety.
Reference: Brown, T. et al., "Language Models are Few-Shot Learners" (GPT-3), NeurIPS 2020
Data Curation
Filtering, cleaning, and organizing training data for quality and diversity—removing duplicates, toxic content, OCR errors, and PII while balancing domain representation. Well-curated smaller datasets can outperform larger unfiltered ones.
Reference: Gao, L. et al., "The Pile: An 800GB Dataset of Diverse Text for Language Modeling", 2020
Scaling Laws
Mathematical relationships describing how performance improves predictably as parameters, data, and compute increase. The "Chinchilla" scaling laws found that compute-optimal training uses roughly 20 tokens per parameter. Recent "overtrained" models train on 10-75x more tokens than this, spending extra training compute to get a smaller model that is cheaper to run at inference.
Compute Budget
Total computational resources for training, measured in FLOPs or GPU-hours. Training frontier models costs millions of dollars, so teams carefully balance model size, dataset size, and training duration.
Reference: Kaplan, J. et al., "Scaling Laws for Neural Language Models", OpenAI 2020
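A back-of-the-envelope sketch of how these quantities relate. It assumes the ~20 tokens-per-parameter ratio quoted above and the common approximation of about 6 FLOPs per parameter per training token; the function names and the 70B-parameter model are illustrative only.

```python
def chinchilla_tokens(n_params, tokens_per_param=20):
    """Compute-optimal token count under the ~20 tokens/parameter rule of thumb."""
    return n_params * tokens_per_param

def training_flops(n_params, n_tokens):
    """Approximate training compute, assuming ~6 FLOPs per parameter per token."""
    return 6 * n_params * n_tokens

n_params = 70_000_000_000               # hypothetical 70B-parameter model
n_tokens = chinchilla_tokens(n_params)  # 1.4 trillion tokens
flops = training_flops(n_params, n_tokens)
```

For this hypothetical model the estimate comes to roughly 5.9e23 FLOPs, which is the kind of figure teams trade off against model size and dataset size.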
Fine-Tuning Methods
Fine-tuning
Adapting pre-trained models to specific tasks by continuing training on targeted datasets. Far more efficient than training from scratch—requires orders of magnitude less data and compute while leveraging general pre-trained knowledge.
Supervised Fine-Tuning (SFT)
Training on labeled input-output pairs with human-created examples. For chat models, this involves conversations demonstrating desired behavior. SFT teaches format and style; it's typically the first step after pre-training and before alignment techniques.
Instruction Tuning
Fine-tuning to follow natural language instructions across diverse tasks. Training data consists of (instruction, response) pairs covering tasks such as Q&A, summarization, and coding. Models like FLAN-T5 demonstrate that instruction tuning dramatically improves generalization to new tasks.
Reference: Wei, J. et al., "Finetuned Language Models Are Zero-Shot Learners" (FLAN), ICLR 2022
RLHF (Reinforcement Learning from Human Feedback)
Three-step alignment technique: (1) collect human preference data comparing outputs, (2) train a reward model on those preferences, (3) use reinforcement learning (typically PPO) to optimize the model against the reward model. RLHF enables training on subjective qualities like helpfulness and safety.
RLAIF (RL from AI Feedback)
RLHF variant where AI models generate preference labels instead of humans, enabling scalable alignment without extensive human annotation. Research shows comparable performance while dramatically reducing cost.
Reference: Lee, H. et al., "RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback", 2023
DPO (Direct Preference Optimization)
Simpler RLHF alternative that skips reward model training, directly optimizing on preference data via classification loss. Introduced in 2023, DPO requires only two model copies (vs. four for PPO), is more stable, and has become widely adopted in open-source training.
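A minimal sketch of the per-example DPO loss, assuming the summed log-probabilities of each response under the trainable policy and the frozen reference model are already available. The argument names and the default `beta` are assumptions for illustration.

```python
import math

def dpo_loss(pol_chosen, pol_rejected, ref_chosen, ref_rejected, beta=0.1):
    """-log sigmoid(beta * margin) for one (chosen, rejected) preference pair.

    Each argument is the log-probability of a full response under the
    policy (pol_*) or the frozen reference model (ref_*).
    """
    margin = (pol_chosen - ref_chosen) - (pol_rejected - ref_rejected)
    # log(1 + exp(-x)) is the same as -log(sigmoid(x)).
    return math.log1p(math.exp(-beta * margin))
```

The loss falls as the policy raises the chosen response's probability relative to the reference more than it raises the rejected one, which is why only two model copies are involved.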
PPO (Proximal Policy Optimization)
Reinforcement learning algorithm commonly used in RLHF. It updates the model while clipping each update so the policy stays close to its previous behavior. Resource-intensive: a typical setup holds four models in memory (policy, reference, reward, and value models). It remains widely used for preference-based training.
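The constraint on updates can be illustrated with PPO's clipped surrogate objective for a single action. This sketch covers only the clipping term; the KL penalty and value loss usually present in RLHF setups are omitted, and the default `eps` is an assumption.

```python
def ppo_clip_objective(ratio, advantage, eps=0.2):
    """Clipped surrogate objective for one action.

    ratio     = new_policy_prob / old_policy_prob for the action taken
    advantage = how much better the action was than expected
    """
    clipped_ratio = max(1 - eps, min(1 + eps, ratio))
    # Taking the minimum removes any incentive to move the ratio
    # outside [1 - eps, 1 + eps] in the direction that raises the objective.
    return min(ratio * advantage, clipped_ratio * advantage)
```

For example, with a positive advantage the objective stops growing once the ratio exceeds 1.2, so a single update cannot be rewarded for drifting far from the previous policy.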
Reward Model
Model trained to predict human preferences, assigning scores based on helpfulness, accuracy, and safety. In RLHF, reward models provide feedback signals during RL training. They can suffer from "reward hacking" where LLMs achieve high scores without actual quality improvement.
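Reward models are commonly trained with a pairwise (Bradley-Terry style) loss on preference comparisons. The sketch below assumes the model has already produced a scalar score for each response; the function names are hypothetical.

```python
import math

def preference_probability(r_chosen, r_rejected):
    """Modeled probability that the chosen response is preferred."""
    return 1.0 / (1.0 + math.exp(-(r_chosen - r_rejected)))

def reward_model_loss(r_chosen, r_rejected):
    """Negative log-likelihood of the human preference label."""
    return -math.log(preference_probability(r_chosen, r_rejected))
```

Only the difference between the two scores matters here, so the absolute scale of a reward model's outputs carries no meaning by itself.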
Constitutional AI (CAI)
Anthropic's alignment technique using written principles (a "constitution") to train systems to be helpful, harmless, and honest. CAI combines self-critique, revision, and RLAIF—enabling scalable oversight with transparent, adjustable values.
Reference: Bai, Y. et al., "Constitutional AI: Harmlessness from AI Feedback", Anthropic 2022
Efficient Training
LoRA (Low-Rank Adaptation)
Parameter-efficient fine-tuning injecting small trainable matrices into transformer layers while freezing pre-trained weights. Trains only 0.1-1% of parameters while achieving comparable performance to full fine-tuning. Makes fine-tuning accessible on consumer hardware.
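A small pure-Python sketch of the idea, assuming a row-vector convention where the frozen weight `W` is d_in x d_out, and the trainable matrices `A` (d_in x r) and `B` (r x d_out) form the low-rank update. The helper names and the `alpha / r` scaling argument are written out for illustration; libraries implement this on tensors.

```python
def matmul(a, b):
    # Plain nested-list matrix multiply: (n x m) times (m x p).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_forward(x, W, A, B, alpha, r):
    """y = x W + (alpha / r) * x A B, with W frozen and only A, B trainable."""
    base = matmul(x, W)
    update = matmul(matmul(x, A), B)
    scale = alpha / r
    return [[b + scale * u for b, u in zip(b_row, u_row)]
            for b_row, u_row in zip(base, update)]

def lora_trainable_fraction(d_in, d_out, r):
    """Share of a single weight matrix's parameter count that LoRA trains."""
    return r * (d_in + d_out) / (d_in * d_out)
```

For a 4096 x 4096 matrix with rank 8, the trainable fraction is about 0.39%, in line with the range quoted above. Initializing `B` to zeros makes the adapted layer start out identical to the pre-trained one.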
QLoRA (Quantized LoRA)
Combines 4-bit quantization with LoRA, enabling fine-tuning massive models on a single GPU. Keeps base model in 4-bit precision while training adapters in higher precision. Achieves performance comparable to full 16-bit fine-tuning with dramatically reduced memory.
PEFT (Parameter-Efficient Fine-Tuning)
Umbrella term for techniques fine-tuning only small parameter subsets—LoRA, adapters, prefix tuning, prompt tuning. Benefits include reduced costs (10-100x), smaller checkpoints (MBs vs GBs), faster training, and lower overfitting risk.
Reference: Houlsby, N. et al., "Parameter-Efficient Transfer Learning for NLP", ICML 2019
Quantization
Reducing weight precision from 32/16-bit floating point to 8/4-bit integers. Dramatically reduces memory and speeds inference with minimal accuracy loss. Essential for deploying LLMs at scale and on edge devices.
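One simple scheme, shown here as an assumption rather than the method any particular library uses, is symmetric per-tensor int8 quantization: scale the weights so the largest magnitude maps to 127, round, and keep the scale for dequantization.

```python
def quantize_int8(weights):
    """Symmetric quantization of a list of floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the integers and the scale."""
    return [v * scale for v in q]
```

Each recovered weight differs from the original by at most half a quantization step (`scale / 2`), which is the source of the small accuracy loss.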
Knowledge Distillation
Compression technique where a smaller "student" model learns to mimic a larger "teacher" model by training on the teacher's probability distributions (soft labels) rather than hard labels. NVIDIA's Minitron work, which combines pruning with distillation, reports needing up to 40x fewer training tokens than training from scratch.
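Learning from soft labels is often expressed as a KL divergence between temperature-softened teacher and student distributions. The sketch below assumes that formulation; the default temperature and function names are illustrative.

```python
import math

def softmax(logits, temperature=1.0):
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)  # soft labels from the teacher
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return kl * temperature ** 2
```

A higher temperature spreads probability over more tokens, exposing how the teacher ranks the wrong answers, which hard labels cannot convey.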
Pruning
Removing unnecessary parameters to reduce size and inference cost. Structured pruning removes entire layers or attention heads; unstructured pruning zeroes individual weights. LLMs can be compressed 2-4x while maintaining performance.
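Unstructured pruning is often done by magnitude: zero the weights closest to zero. A minimal sketch, assuming a flat list of weights and a target `sparsity` fraction (both names are illustrative):

```python
def magnitude_prune(weights, sparsity):
    """Zero out the given fraction of weights with the smallest magnitudes."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]
```

Structured pruning would instead remove whole rows, heads, or layers, which shrinks the actual tensors rather than only adding zeros.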
Data Concepts
Training Data
Examples teaching models patterns and behaviors. Quality, diversity, and representativeness fundamentally determine capabilities and biases. Frontier models train on hundreds of billions to trillions of tokens.
Reference: Brown, T. et al., "Language Models are Few-Shot Learners" (GPT-3), NeurIPS 2020
Synthetic Data
Training data generated by AI rather than collected from real sources. Addresses data scarcity, privacy, and annotation costs. Used for fine-tuning, alignment, and pre-training augmentation. Concerns include "model collapse" and bias amplification.
Data Contamination
When evaluation benchmark data appears in training sets, inflating performance scores. Studies show models achieve 4.9x higher scores on leaked samples. Detection methods include membership inference and perplexity analysis.
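Alongside the detection methods named above, a simpler first check (swapped in here for illustration, not one of those methods) is to measure word n-gram overlap between a benchmark item and training text. The function names and the default `n=8` are assumptions.

```python
def ngrams(text, n):
    """Set of lowercase word n-grams in the text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def overlap_ratio(benchmark_item, training_doc, n=8):
    """Fraction of the benchmark item's n-grams that also occur in the training doc."""
    bench = ngrams(benchmark_item, n)
    if not bench:
        return 0.0  # item shorter than n words: nothing to compare
    return len(bench & ngrams(training_doc, n)) / len(bench)
```

A ratio near 1.0 suggests the benchmark item appears nearly verbatim in the training text and its score should be treated with suspicion.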
Memorization
When LLMs reproduce verbatim training sequences rather than generating novel text. Raises privacy concerns (personal information, contact details) and copyright issues. Research shows over 1% of outputs can be copied verbatim.
Reference: Carlini, N. et al., "Extracting Training Data from Large Language Models", USENIX Security 2021
Deduplication
Removing duplicate content from training datasets—critical because duplicated data increases memorization, reduces efficiency, and inflates test scores. Modern pipelines remove 20-30% of raw web data as redundant.
Reference: Lee, K. et al., "Deduplicating Training Data Makes Language Models Better", ACL 2022
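The simplest form is exact deduplication: hash a normalized version of each document and keep only the first occurrence. This is a sketch under that assumption; production pipelines typically add near-duplicate detection, which is not shown here.

```python
import hashlib

def normalize(text):
    # Lowercase and collapse whitespace so trivial variants hash identically.
    return " ".join(text.lower().split())

def deduplicate(docs):
    """Keep the first occurrence of each document, comparing normalized text."""
    seen = set()
    kept = []
    for doc in docs:
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(doc)
    return kept
```

Storing hashes rather than full documents keeps memory use manageable when the corpus is large.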
Data Poisoning
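Security attack that inserts malicious data into training sets to compromise model behavior. Anthropic research (2025) found that as few as 250 poisoned documents could backdoor LLMs across the model sizes tested, suggesting that the absolute number of poisoned documents matters more than their share of the training data.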
Reference: Wallace, E. et al., "Concealed Data Poisoning Attacks on NLP Models", NAACL 2021
Dataset Bias
Systematic training data patterns leading to unfair model behaviors—demographic underrepresentation, historical prejudices, selection bias. Manifests as stereotyping, discriminatory recommendations, or disparate performance across groups.
This glossary is part of a series covering AI and LLM concepts for product designers. Terms without authoritative references are noted for tracking.