Adding Dense Layer to Bayesian CNN causes model to stop learning

Use a Bayesian Dense Layer

Use DenseVariational from tensorflow_probability.layers rather than a standard Keras Dense layer.

from tensorflow_probability import layers as tfpl

model.add(tfpl.DenseVariational(
    units=128,
    make_prior_fn=prior_fn,          # defined in the prior/posterior section below
    make_posterior_fn=posterior_fn,
    kl_weight=1 / num_train_samples,
    activation='relu'
))

 

Adjust kl_weight

  • This controls how much the KL divergence contributes to the total loss.

  • Too high → the KL dominates and the model underfits.

  • Rule of thumb: kl_weight = 1 / num_train_samples (see the sketch below).
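
For concreteness, a minimal sketch of deriving that weight from the training-set size (x_train and num_train_samples are illustrative names, not from the original code):

num_train_samples = x_train.shape[0]
kl_weight = 1.0 / num_train_samples  # each example carries roughly 1/N of the KL penalty

The idea is that the KL penalty applies to the whole dataset once, so it is divided across the training examples instead of being paid in full on every batch.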

Use Proper Prior and Posterior Functions

Avoid an overly narrow prior (e.g. zero mean with a tiny standard deviation) unless you have a good reason for it: a tight prior heavily penalizes weights that move away from zero, which can stop the layer from learning.

Example:

import tensorflow as tf
import tensorflow_probability as tfp

def prior_fn(kernel_size, bias_size=0, dtype=None):
    # DenseVariational calls make_prior_fn(kernel_size, bias_size, dtype) and expects
    # a callable mapping inputs to a distribution over the n = kernel + bias weights.
    n = kernel_size + bias_size
    return lambda _: tfp.distributions.Independent(
        tfp.distributions.Normal(loc=tf.zeros(n, dtype=dtype), scale=1.0),
        reinterpreted_batch_ndims=1)

 
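The layer also needs a posterior; a common choice (the mean-field pattern from the TFP documentation) is a trainable normal whose loc and scale are learned. A minimal sketch, with posterior_fn as an illustrative name and reusing the tf/tfp/tfpl imports above:

import numpy as np

def posterior_fn(kernel_size, bias_size=0, dtype=None):
    n = kernel_size + bias_size
    c = np.log(np.expm1(1.0))  # softplus(c) == 1, so the initial scale starts near 1
    return tf.keras.Sequential([
        tfpl.VariableLayer(2 * n, dtype=dtype),  # trainable loc and raw-scale parameters
        tfpl.DistributionLambda(lambda t: tfp.distributions.Independent(
            tfp.distributions.Normal(loc=t[..., :n],
                                     scale=1e-5 + tf.nn.softplus(c + t[..., n:])),
            reinterpreted_batch_ndims=1)),
    ])

Pass these as make_prior_fn=prior_fn and make_posterior_fn=posterior_fn to the DenseVariational layer shown at the top.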

Monitor KL Loss

Print both the negative log-likelihood (NLL) and KL divergence:

loss = nll + kl_divergence  # schematic: Keras adds the layer's KL (registered via add_loss) on top of the compiled NLL

 
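One way to inspect the KL term on its own (a sketch; model and x_batch are placeholder names): the KL penalties from DenseVariational are registered via add_loss, so they show up in model.losses after a forward pass.

_ = model(x_batch)                # a forward pass populates model.losses
kl_term = tf.add_n(model.losses)  # sum of the kl_weight-scaled KL penalties
print("KL part of the loss:", float(kl_term))
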

If KL is much larger than NLL early in training, your kl_weight is likely too high.