| Scheduler Type | When to Use | Behavior | Code Example (PyTorch / TensorFlow) |
| --- | --- | --- | --- |
| Fixed LR | For simple tasks or as a baseline. | Constant LR throughout training. | Adam(model.parameters(), lr=0.001) |
| Step Decay | When loss plateaus after a fixed number of epochs (usage sketch below). | LR drops by a factor every step_size epochs. | StepLR(optimizer, step_size=10, gamma=0.1) |
| Exponential Decay | When you want a smooth, steady LR decrease. | LR is multiplied by gamma each epoch, so it decays exponentially. | ExponentialLR(optimizer, gamma=0.95) |
| Polynomial Decay | When you want a controlled slowdown toward a target final LR. | LR follows a polynomial curve from the initial value to the end value. | PolynomialDecay(initial_learning_rate, decay_steps, end_learning_rate, power) (TF) |
| Cosine Annealing | To avoid sharp local minima; good for large-scale training. | LR decreases from the initial value to a minimum along a cosine curve over T_max steps (no restarts). | CosineAnnealingLR(optimizer, T_max=50) |
| ReduceLROnPlateau | When validation loss stagnates; pairs well with early stopping (usage sketch below). | LR is reduced when a monitored metric stops improving for patience epochs. | ReduceLROnPlateau(optimizer, mode='min', factor=0.2, patience=5) |
| Cyclic LR | To explore a range of LRs and potentially escape local minima. | LR oscillates between two boundaries. | CyclicLR(optimizer, base_lr=1e-4, max_lr=1e-2, step_size_up=2000) |
| One Cycle Policy | For fast convergence and good generalization on a fixed training budget (usage sketch below). | LR ramps up, then drops sharply; momentum is adjusted inversely. | learner.fit_one_cycle(10, max_lr=1e-2) (fastai) |
| Warm Restarts (Cosine) | For long training runs, to periodically refresh training dynamics. | Cosine decay with periodic restarts; T_mult stretches each successive cycle. | CosineAnnealingWarmRestarts(optimizer, T_0=10, T_mult=2) |
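Most of the PyTorch schedulers above (StepLR, ExponentialLR, CosineAnnealingLR, CosineAnnealingWarmRestarts) are wrapped around an existing optimizer and advanced once per epoch. A minimal sketch of that pattern with StepLR; the model, loss, and train_loader here are placeholders, not part of any specific codebase:

```python
import torch
from torch import nn
from torch.optim import Adam
from torch.optim.lr_scheduler import StepLR

model = nn.Linear(10, 1)                                 # placeholder model for the sketch
optimizer = Adam(model.parameters(), lr=1e-3)            # base LR the scheduler will decay
scheduler = StepLR(optimizer, step_size=10, gamma=0.1)   # LR *= 0.1 every 10 epochs
loss_fn = nn.MSELoss()

for epoch in range(30):
    for x, y in train_loader:                            # assumed DataLoader of (inputs, targets)
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
    scheduler.step()                                     # epoch-level schedulers step once per epoch
    print(epoch, scheduler.get_last_lr())
```

Swapping StepLR for ExponentialLR or CosineAnnealingLR changes only the constructor line; the loop stays the same.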
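ReduceLROnPlateau differs from the schedulers above: it has no fixed timetable, so the monitored metric must be passed to step() after each validation pass. A sketch reusing model, optimizer, loss_fn, and train_loader from the previous block, with val_loader as an assumed validation DataLoader:

```python
from torch.optim.lr_scheduler import ReduceLROnPlateau

scheduler = ReduceLROnPlateau(optimizer, mode='min', factor=0.2, patience=5)

for epoch in range(50):
    model.train()
    for x, y in train_loader:
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()
        optimizer.step()

    model.eval()
    with torch.no_grad():                                # compute average validation loss
        val_loss = sum(loss_fn(model(x), y).item() for x, y in val_loader) / len(val_loader)

    scheduler.step(val_loss)   # LR *= 0.2 after 5 epochs without improvement in val_loss
```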
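The cyclical policies (CyclicLR, One Cycle) are batch-level schedulers: they step after every optimizer update rather than once per epoch. Besides the fastai one-liner in the table, PyTorch ships torch.optim.lr_scheduler.OneCycleLR, which also cycles momentum (or Adam's beta1) inversely to the LR by default. A sketch, again reusing the placeholder names from the first block:

```python
from torch.optim.lr_scheduler import OneCycleLR

epochs = 10
scheduler = OneCycleLR(optimizer, max_lr=1e-2,
                       epochs=epochs, steps_per_epoch=len(train_loader))

for epoch in range(epochs):
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
        scheduler.step()      # batch-level: step after every optimizer update
```

CyclicLR follows the same loop structure; only the constructor changes.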