ConvTranspose2d bias

Hi,
When using nn.ConvTranspose2d, I noticed that bias is set to False.
Do we always set it to False?

nn.ConvTranspose2d(latent_size, 512, kernel_size=4, stride=1, padding=0, bias=False),

Thank you.

No.

This is just for this particular model (a GAN, I assume).

The biases in a GAN’s generator are often disabled, which is meant to make mode collapse less likely.
If the bias is enabled, the generator can learn to ignore its input (the latent vector) and produce a single, very realistic-looking image from the bias alone.
The discriminator then has a problem: the generated image becomes indistinguishable from a real one, so it can’t give any feedback on how to improve the image, because it sees no difference between real and generated images. The generator wins.
But the generator has learned to generate only one image, which is not what you want from this pair.
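
To make the argument above concrete, here is a minimal sketch (not the notebook's code). It contrives a "collapsed" first layer by zeroing the weights and setting a constant bias, so two different latent vectors give exactly the same output. The value latent_size = 128 and the constant 0.5 are assumptions picked for illustration. Separately, in DCGAN-style generators each transposed convolution is typically followed by nn.BatchNorm2d, which has its own learnable shift, so a conv bias would be redundant there; whether that applies to this model is an assumption, since only one layer is shown in the thread.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
latent_size = 128  # assumed value; the thread does not state it

# Same layer as in the question, but with the bias enabled.
layer = nn.ConvTranspose2d(latent_size, 512, kernel_size=4, stride=1,
                           padding=0, bias=True)

# Contrived "collapsed" state: the weights ignore the input entirely
# and the bias carries the whole output.
with torch.no_grad():
    layer.weight.zero_()
    layer.bias.fill_(0.5)

z1 = torch.randn(1, latent_size, 1, 1)
z2 = torch.randn(1, latent_size, 1, 1)
with torch.no_grad():
    out1, out2 = layer(z1), layer(z2)

same_output = torch.equal(out1, out2)  # different inputs, identical outputs
out_shape = tuple(out1.shape)          # a 1x1 latent becomes a 4x4 feature map

# With bias=False the layer has no bias parameter at all.
no_bias = nn.ConvTranspose2d(latent_size, 512, kernel_size=4, stride=1,
                             padding=0, bias=False)
has_bias_param = no_bias.bias is not None

print(same_output, out_shape, has_bias_param)
```

With bias=False, zeroed weights would give an all-zero output, so the layer has to use the latent vector to produce anything non-trivial.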

Hi Sebastian,
That makes sense.
I actually got a case where, after around 30 epochs or so, the real score is almost 1 and the fake score is almost 0. Is this the case where “the generator wins”?

Epoch [4/10], loss_g: 10.7822, loss_d: 0.0241, real_score: 0.9853, fake_score: 0.0081

What should we do in this case? Should we train again from scratch with different parameters?
I used the same code/parameters as the jovian notebook, not sure how this happened.

Thanks,
William

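A high loss_g means the generator isn’t actually doing well (the lower, the better). In your log loss_g is high and the fake score is close to 0, so it is the discriminator that is winning, not the generator.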
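
A rough way to read that log line, assuming the generator is trained with binary cross-entropy against "real" labels (an assumption about the notebook, not something confirmed in this thread). In that setup the per-sample generator loss is -log(D(G(z))), so it grows as the discriminator's score for fake images drops toward 0. The helper name generator_bce_loss and the 0.95 / 0.05 thresholds are hypothetical, chosen only for illustration.

```python
import math
import re

log_line = ("Epoch [4/10], loss_g: 10.7822, loss_d: 0.0241, "
            "real_score: 0.9853, fake_score: 0.0081")

# Pull the "name: value" pairs out of the log line.
metrics = {name: float(value)
           for name, value in re.findall(r"(\w+): ([\d.]+)", log_line)}

def generator_bce_loss(d_of_fake):
    """Per-sample generator loss -log(D(G(z))) under BCE with 'real' targets."""
    return -math.log(d_of_fake)

balanced_loss = generator_bce_loss(0.5)    # discriminator is undecided, about 0.69
fooled_loss = generator_bce_loss(0.99)     # discriminator is fooled, loss near 0
rejected_loss = generator_bce_loss(metrics["fake_score"])  # fakes are rejected

# Hypothetical rule of thumb: the discriminator dominates when it is
# confident on both real and fake batches.
discriminator_dominates = (metrics["real_score"] > 0.95
                           and metrics["fake_score"] < 0.05)

print(metrics, discriminator_dominates)
```

The logged loss_g is an average of per-sample losses, so it will not equal -log of the averaged fake_score, but the direction is the same: low fake scores mean a high generator loss.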

If the generator is underperforming, you can try several things:

  • increase the complexity of the generator
  • increase the complexity of the discriminator → as weird as it seems, a better discriminator might provide “stronger” feedback on how to improve the image (but it might also outperform the generator, so training might get stuck again)
  • decrease the learning rate (perhaps the generator got stuck somewhere because of momentum)
  • change the loss function; here you might look at something called WGAN-GP
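
For the last option, a minimal sketch of the WGAN-GP gradient penalty, under a few assumptions: the discriminator is used as a critic that returns one unbounded score per image (no sigmoid), and the fake batch is detached from the generator when the critic is updated. The function name gradient_penalty, the tiny linear critic and the tensor shapes are hypothetical and only there to make the snippet runnable; lambda_gp = 10 is the commonly used penalty weight.

```python
import torch
import torch.nn as nn

def gradient_penalty(critic, real, fake):
    """Penalise the critic when its gradient norm at points between
    real and fake samples deviates from 1."""
    batch_size = real.size(0)
    # One random mixing coefficient per sample, broadcast over C, H, W.
    eps = torch.rand(batch_size, 1, 1, 1, device=real.device)
    mixed = (eps * real + (1 - eps) * fake).requires_grad_(True)
    scores = critic(mixed)
    grads = torch.autograd.grad(
        outputs=scores,
        inputs=mixed,
        grad_outputs=torch.ones_like(scores),
        create_graph=True,  # keep the graph so the penalty can be backpropagated
    )[0]
    grad_norm = grads.reshape(batch_size, -1).norm(2, dim=1)
    return ((grad_norm - 1) ** 2).mean()

torch.manual_seed(0)
# Stand-in critic and data, only for demonstration.
critic = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 1))
real = torch.randn(4, 3, 8, 8)
fake = torch.randn(4, 3, 8, 8)  # in training: generator(z).detach()

lambda_gp = 10.0
gp = gradient_penalty(critic, real, fake)
# Critic loss: push real scores up and fake scores down, plus the penalty.
critic_loss = critic(fake).mean() - critic(real).mean() + lambda_gp * gp
critic_loss.backward()
print(gp.item(), critic_loss.item())
```

The generator loss in this setup would be -critic(generator(z)).mean(). Note that critic scores are no longer probabilities, so real_score and fake_score values are not directly comparable with those from a sigmoid-based discriminator.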

Also, read this.

And sometimes the images are simply too hard for the generator (or at least for the generators we can train without an expensive, powerful set of GPUs).

Oh great, let me try them out. Thank you, Sebastian!

Hi Sebastian,
I made the latent size 256, added dropout layers and implemented the WGAN-GP loss function as you suggested.
Training takes a lot longer now (7.5 min vs 5 min), but at least the loss is looking better.
I only trained for 10 epochs, but the real score now only fluctuates between 0.75 and 0.9.

Thanks,
William