Authors official implementation of "Big GANs Are Watching You" pre-print
|size (curr.)||4197 kB|
Big GANs Are Watching You
This repository is the official implementation of Big GANs Are Watching You: Towards Unsupervised Object Segmentation with Off-the-Shelf Generative Models by Andrey Voynov, Stanislav Morozov, and Artem Babenko.
Schematic representation of our approach.
- Find the BigBiGAN latent direction responsible for background darkening and foreground lightening;
- Sample real images embeddings and generate segmentation masks with their shifts;
- Train U-net with that synthetic data.
example of variation along the latent direction
python 3.6 or later
at least 12Gb of GPU-memory for batch size 95 (though smaller batch gives competitive result)
For results visualization:
Optional for BigBiGAN tf to torch conversion and BigBiGAN embeddings evaluation:
if troubles – check the authors packages versions in
this scrips downloads:
- BigBiGAN weights of the original tf-hub model (https://tfhub.dev/deepmind/bigbigan-resnet50/1) ported to pytorch;
- precalculated BigBiGAN-embeddings for some popular datasets;
- background darkening and foreground lightening latent directions coefficients.
note that the ImageNet-BigBiGAN-embeddings takes 551Mb.
To train the U-net segmentation model, run this command:
python train_segmentation.py \ --out "results" \ --gan_weights "BigGAN/weights/BigBiGAN_x1.pth" \ --z "embeddings/BigBiGAN_ImageNet_z.npy" --bg_direction "BigGAN/weights/bg_direction.pth" \ --val_images_dir __path_to_images_dir__ \ --val_masks_dir __path_to_masks_dir__ \
Once you want to use your own images BigBiGAN-embeddings, please run
bigbigan_embeddings.ipynb notebook and provide your own data path.
To check the synthetic data / segmentation samples / model metrics, please run
You can download pretrained models here:
- U-net trained on ImageNet-BigBiGAN embeddings using default parameters.
- U-net trained on CUB-200-2011-BigBiGAN embeddings using default parameters.
- U-net trained on Flowers-BigBiGAN embeddings using default parameters.
Our model achieves state-of-the-art results among fully unsupervised methods on many datasets. Check Tables 1, 2, 3 of the original paper for more details. In the tables below we refer our approach as BigBiGAN and E-BigBiGAN.
Here are some samples of the model performing on DUTS dataset:
BigBiGAN weights conversion
Once the original BigBiGAN weigths are distributed within tfhub, we also provide the conversion code. See
bigbigan2biggan.ipynb. This code is based on this script: https://github.com/ajbrock/BigGAN-PyTorch/blob/master/TFHub/converter.py . Note that BigBiGAN and BigGAN almost share the architecture and we use the author’s officially unofficial PyTorch BigGAN implementation.
- Unsupervised Discovery of Latent Directions: https://github.com/anvoynov/GANLatentDiscovery
- BigBiGAN weights: https://tfhub.dev/deepmind/bigbigan-resnet50/1
- BigGAN pytorch: https://github.com/ajbrock/BigGAN-PyTorch
- U-Net model code is based on: https://github.com/milesial/Pytorch-UNet
- prefetch_generator: https://github.com/justheuristic/prefetch_generator