How do you logically choose the best LR scheduler for training a deep neural network on a dataset?

Choosing the Best Learning Rate Scheduler – At a Glance

 

| Scheduler Type | When to Use | Behavior | Code Example (PyTorch / TensorFlow) |
| --- | --- | --- | --- |
| Fixed LR | For simple tasks or as a baseline. | Constant LR throughout training. | `Adam(model.parameters(), lr=0.001)` |
| Step Decay | When loss plateaus after a certain number of epochs. | LR drops by a factor every *n* epochs. | `StepLR(optimizer, step_size=10, gamma=0.1)` |
| Exponential Decay | When you want a smooth, steady LR decrease. | LR decays exponentially every epoch. | `ExponentialLR(optimizer, gamma=0.95)` |
| Polynomial Decay | When you want a controlled slowdown with a target final LR. | LR drops along a polynomial curve. | `PolynomialDecay(initial_lr, decay_steps, end_lr, power)` (TF) |
| Cosine Annealing | To avoid sharp local minima; good for large-scale training. | LR decreases along a cosine curve toward a minimum. | `CosineAnnealingLR(optimizer, T_max=50)` |
| ReduceLROnPlateau | When val loss stagnates (great for overfitting/early-stopping cases). | LR is reduced when a monitored metric stops improving. | `ReduceLROnPlateau(optimizer, mode='min', factor=0.2, patience=5)` |
| Cyclic LR | To explore a range of LRs and potentially escape local minima. | LR oscillates between two boundaries. | `CyclicLR(optimizer, base_lr=1e-4, max_lr=1e-2, step_size_up=2000)` |
| One Cycle Policy | Highly recommended for fast convergence and generalization. | LR increases, then drops sharply; momentum is adjusted inversely. | `learner.fit_one_cycle(10, max_lr=1e-2)` (fastai) |
| Warm Restarts (Cosine) | For very large datasets, to periodically refresh training dynamics. | Cosine decay with periodic restarts. | `CosineAnnealingWarmRestarts(optimizer, T_0=10, T_mult=2)` |
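The one-liners in the table all plug into a training loop the same way: construct the scheduler around the optimizer, then call `scheduler.step()` once per epoch. A minimal sketch with `StepLR` (the tiny model and random data are placeholders for a real setup; note that `ReduceLROnPlateau` is the one exception, since its `step()` takes the monitored metric):

```python
import torch
from torch import nn, optim

# Toy model and random data stand in for a real training setup.
model = nn.Linear(10, 1)
optimizer = optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Multiply the LR by gamma=0.1 every 10 epochs.
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)

for epoch in range(30):
    x = torch.randn(32, 10)
    y = torch.randn(32, 1)
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step()  # per-epoch schedulers step once per epoch
    # For ReduceLROnPlateau you would instead call:
    #   scheduler.step(val_loss)

# After 30 epochs the LR has been cut by 10x three times: 1e-3 -> 1e-6.
print(optimizer.param_groups[0]["lr"])
```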
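The One Cycle policy is also available in plain PyTorch as `torch.optim.lr_scheduler.OneCycleLR`. Unlike the per-epoch schedulers above, it steps once per batch, so it needs to know the total number of steps up front. A sketch (the epoch/batch counts are arbitrary placeholders, and the forward/backward pass is omitted to keep the focus on the schedule itself):

```python
import torch
from torch import nn, optim

model = nn.Linear(10, 1)
optimizer = optim.Adam(model.parameters(), lr=1e-3)

epochs, steps_per_epoch = 10, 20  # placeholder sizes
scheduler = optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=1e-2, epochs=epochs, steps_per_epoch=steps_per_epoch
)

lrs = []
for _ in range(epochs * steps_per_epoch):
    optimizer.step()   # normally preceded by a forward/backward pass
    scheduler.step()   # One Cycle steps every batch, not every epoch
    lrs.append(optimizer.param_groups[0]["lr"])

# The LR warms up to max_lr, then anneals far below its starting value.
print(max(lrs), lrs[-1])
```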

Recommended Strategy

  1. Start Simple
    Use Fixed LR or Step Decay to establish a baseline.

  2. Monitor Training Behaviour

    • If validation loss plateaus → Try Step Decay or ReduceLROnPlateau.

    • If loss is noisy or erratic → Use Cosine Annealing or Cyclic LR.

    • If training is slow → Try One Cycle, or use an LR Finder to discover a good LR range.

  3. Use LR Finder (optional)
    Identify optimal LR range visually before selecting a scheduler (fastai and PyTorch Lightning support this).
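The idea behind the LR Finder (fastai's `learner.lr_find()` is one implementation) can also be sketched by hand: sweep the LR exponentially upward over a few hundred mini-batches, record the loss at each LR, and pick a value somewhat below where the loss starts to blow up. A toy version, where the model, the synthetic regression data, and the sweep range are all placeholder assumptions:

```python
import torch
from torch import nn, optim

torch.manual_seed(0)  # for reproducibility of the toy run

model = nn.Linear(10, 1)
optimizer = optim.SGD(model.parameters(), lr=1e-6)
loss_fn = nn.MSELoss()

# Exponentially sweep the LR from 1e-6 up toward 1.0 over num_steps batches.
num_steps = 100
gamma = (1.0 / 1e-6) ** (1 / num_steps)
sweep = optim.lr_scheduler.ExponentialLR(optimizer, gamma=gamma)

history = []
for _ in range(num_steps):
    x = torch.randn(32, 10)
    y = x.sum(dim=1, keepdim=True)  # synthetic regression target
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    history.append((optimizer.param_groups[0]["lr"], loss.item()))
    sweep.step()

# In practice you would plot loss vs. LR and pick a value roughly an
# order of magnitude below the point where the loss starts diverging.
best_lr, _ = min(history, key=lambda t: t[1])
print(best_lr)
```

fastai and PyTorch Lightning wrap exactly this sweep-and-plot loop behind a one-line call.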