February 26, 2021

2152 words 11 mins read

POSTECH-CVLab/PyTorch-StudioGAN

POSTECH-CVLab/PyTorch-StudioGAN

StudioGAN is a Pytorch library providing implementations of representative Generative Adversarial Networks (GANs) for conditional/unconditional image generation.

repo name POSTECH-CVLab/PyTorch-StudioGAN
repo link https://github.com/POSTECH-CVLab/PyTorch-StudioGAN
homepage
language Python
size (curr.) 13217 kB
stars (curr.) 1435
created 2020-06-19
license Other

StudioGAN is a Pytorch library providing implementations of representative Generative Adversarial Networks (GANs) for conditional/unconditional image generation. StudioGAN aims to offer an identical playground for modern GANs so that machine learning researchers can readily compare and analyze a new idea.

Features

  • Extensive GAN implementations for PyTorch
  • Comprehensive benchmark of GANs using CIFAR10, Tiny ImageNet, and ImageNet datasets
  • Better performance and lower memory consumption than original implementations
  • Providing pre-trained models that are fully compatible with up-to-date PyTorch environment
  • Support Multi-GPU (DP, DDP, and Multinode DistributedDataParallel), Mixed Precision, Synchronized Batch Normalization, LARS, Tensorboard Visualization, and other analysis methods

Implemented GANs

Name Venue Architecture GC DC Loss EMA
DCGAN arXiv'15 CNN/ResNet[1] N/A N/A Vanilla False
LSGAN ICCV'17 CNN/ResNet[1] N/A N/A Least Sqaure False
GGAN arXiv'17 CNN/ResNet[1] N/A N/A Hinge False
WGAN-WC ICLR'17 ResNet N/A N/A Wasserstein False
WGAN-GP NIPS'17 ResNet N/A N/A Wasserstein False
WGAN-DRA arXiv'17 ResNet N/A N/A Wasserstein False
ACGAN-Mod[2] - ResNet cBN AC Hinge False
ProjGAN ICLR'18 ResNet cBN PD Hinge False
SNGAN ICLR'18 ResNet cBN PD Hinge False
SAGAN ICML'19 ResNet cBN PD Hinge False
BigGAN-Mod[3] - Big ResNet cBN PD Hinge True
BigGAN-Deep-Mod[3] - Big ResNet Deep cBN PD Hinge True
CRGAN ICLR'20 Big ResNet cBN PD/CL Hinge True
ICRGAN arXiv'20 Big ResNet cBN PD/CL Hinge True
LOGAN arXiv'19 Big ResNet cBN PD Hinge True
DiffAugGAN Neurips'20 Big ResNet cBN PD/CL Hinge True
ADAGAN Neurips'20 Big ResNet cBN PD/CL Hinge True
ContraGAN Neurips'20 Big ResNet cBN CL Hinge True
FreezeD CVPRW'20 - - - - -

GC/DC indicates the way how we inject label information to the Generator or Discriminator.

EMA: Exponential Moving Average update to the generator. cBN : conditional Batch Normalization. AC : Auxiliary Classifier. PD : Projection Discriminator. CL : Contrastive Learning.

To be Implemented

Name Venue Architecture GC DC Loss EMA
StyleGAN2 CVPR' 20 StyleNet AdaIN - Vanilla True

AdaIN : Adaptive Instance Normalization.

Requirements

Please refer to requirements.md for more information.

You can install the recommended environment as follows:

conda env create -f environment.yml -n studiogan

With docker, you can use:

docker pull mgkang/studiogan:latest

This is my command to make a container named “studioGAN”.

Also, you can use port number 6006 to connect the tensoreboard.

docker run -it --gpus all --shm-size 128g -p 6006:6006 --name studioGAN -v /home/USER:/root/code --workdir /root/code mgkang/studiogan:latest /bin/bash

Quick Start

  • Train (-t) and evaluate (-e) the model defined in CONFIG_PATH using GPU 0
CUDA_VISIBLE_DEVICES=0 python3 src/main.py -t -e -c CONFIG_PATH
  • Train (-t) and evaluate (-e) the model defined in CONFIG_PATH using GPUs (0, 1, 2, 3) and DataParallel
CUDA_VISIBLE_DEVICES=0,1,2,3 python3 src/main.py -t -e -c CONFIG_PATH

Try python3 src/main.py to see available options.

Via Tensorboard, you can monitor trends of IS, FID, F_beta, Authenticity Accuracies, and the largest singular values:

~ PyTorch-StudioGAN/logs/RUN_NAME>>> tensorboard --logdir=./ --port PORT

Dataset

  • CIFAR10: StudioGAN will automatically download the dataset once you execute main.py.

  • Tiny Imagenet, Imagenet, or a custom dataset:

    1. download Tiny Imagenet and Imagenet. Prepare your own dataset.
    2. make the folder structure of the dataset as follows:
┌── docs
├── src
└── data
    └── ILSVRC2012 or TINY_ILSVRC2012 or CUSTOM
        ├── train
        │   ├── cls0
        │   │   ├── train0.png
        │   │   ├── train1.png
        │   │   └── ...
        │   ├── cls1
        │   └── ...
        └── valid
            ├── cls0
            │   ├── valid0.png
            │   ├── valid1.png
            │   └── ...
            ├── cls1
            └── ...

Supported Training Techniques

  • DistributedDataParallel (Please refer to Here)
    ### NODE_0, 4_GPUs, All ports are open to NODE_1
    docker run -it --gpus all --shm-size 128g --name studioGAN --network=host -v /home/USER:/root/code --workdir /root/code mgkang/studiogan:latest /bin/bash
    
    ~/code>>> export NCCL_SOCKET_IFNAME=^docker0,lo
    ~/code>>> export MASTER_ADDR=PUBLIC_IP_OF_NODE_0
    ~/code>>> export MASTER_PORT=AVAILABLE_PORT_OF_NODE_0
    
    ~/code/PyTorch-StudioGAN>>> CUDA_VISIBLE_DEVICES=0,1,2,3 python3 src/main.py -t -e -DDP -n 2 -nr 0 -c CONFIG_PATH
    
    ### NODE_1, 4_GPUs, All ports are open to NODE_0
    docker run -it --gpus all --shm-size 128g --name studioGAN --network=host -v /home/USER:/root/code --workdir /root/code mgkang/studiogan:latest /bin/bash
    
    ~/code>>> export NCCL_SOCKET_IFNAME=^docker0,lo
    ~/code>>> export MASTER_ADDR=PUBLIC_IP_OF_NODE_0
    ~/code>>> export MASTER_PORT=AVAILABLE_PORT_OF_NODE_0
    
    ~/code/PyTorch-StudioGAN>>> CUDA_VISIBLE_DEVICES=0,1,2,3 python3 src/main.py -t -e -DDP -n 2 -nr 1 -c CONFIG_PATH
    

※ StudioGAN does not support DDP training for ContraGAN. This is because conducting contrastive learning requires a ‘gather’ operation to calculate the exact conditional contrastive loss.

  • Mixed Precision Training
    CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -t -mpc -c CONFIG_PATH
    
  • Standing Statistics
    CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -e -std_stat --standing_step STANDING_STEP -c CONFIG_PATH
    
  • Synchronized BatchNorm
    CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -t -sync_bn -c CONFIG_PATH
    
  • Load All Data in Main Memory
    CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -t -l -c CONFIG_PATH
    
  • LARS
    CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -t -l -c CONFIG_PATH -LARS
    

Analyzing Generated Images

The StudioGAN supports Image visualization, K-nearest neighbor analysis, Linear interpolation, and Frequency analysis. All results will be saved in ./figures/RUN_NAME/*.png.

  • Image Visualization
CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -iv -std_stat --standing_step STANDING_STEP -c CONFIG_PATH --checkpoint_folder CHECKPOINT_FOLDER --log_output_path LOG_OUTPUT_PATH
  • K-Nearest Neighbor Analysis (we have fixed K=7, the images in the first column are generated images.)
CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -knn -std_stat --standing_step STANDING_STEP -c CONFIG_PATH --checkpoint_folder CHECKPOINT_FOLDER --log_output_path LOG_OUTPUT_PATH
  • Linear Interpolation (applicable only to conditional Big ResNet models)
CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -itp -std_stat --standing_step STANDING_STEP -c CONFIG_PATH --checkpoint_folder CHECKPOINT_FOLDER --log_output_path LOG_OUTPUT_PATH
  • Frequency Analysis
CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -fa -std_stat --standing_step STANDING_STEP -c CONFIG_PATH --checkpoint_folder CHECKPOINT_FOLDER --log_output_path LOG_OUTPUT_PATH
  • TSNE Analysis
CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -tsne -std_stat --standing_step STANDING_STEP -c CONFIG_PATH --checkpoint_folder CHECKPOINT_FOLDER --log_output_path LOG_OUTPUT_PATH

Metrics

Inception Score (IS)

Inception Score (IS) is a metric to measure how much GAN generates high-fidelity and diverse images. Calculating IS requires the pre-trained Inception-V3 network, and recent approaches utilize OpenAI’s TensorFlow implementation.

To compute official IS, you have to make a “samples.npz” file using the command below:

CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -s -c CONFIG_PATH --checkpoint_folder CHECKPOINT_FOLDER --log_output_path LOG_OUTPUT_PATH

It will automatically create the samples.npz file in the path ./samples/RUN_NAME/fake/npz/samples.npz. After that, execute TensorFlow official IS implementation. Note that we do not split a dataset into ten folds to calculate IS ten times. We use the entire dataset to compute IS only once, which is the evaluation strategy used in the CompareGAN repository.

CUDA_VISIBLE_DEVICES=0,...,N python3 src/inception_tf13.py --run_name RUN_NAME --type "fake"

Keep in mind that you need to have TensorFlow 1.3 or earlier version installed!

Note that StudioGAN logs Pytorch-based IS during the training.

Frechet Inception Distance (FID)

FID is a widely used metric to evaluate the performance of a GAN model. Calculating FID requires the pre-trained Inception-V3 network, and modern approaches use Tensorflow-based FID. StudioGAN utilizes the PyTorch-based FID to test GAN models in the same PyTorch environment. We show that the PyTorch based FID implementation provides almost the same results with the TensorFlow implementation (See Appendix F of our paper).

Precision and Recall (PR: F_1/8=Weights Precision, F_8=Weights Recall)

Precision measures how accurately the generator can learn the target distribution. Recall measures how completely the generator covers the target distribution. Like IS and FID, calculating Precision and Recall requires the pre-trained Inception-V3 model. StudioGAN uses the same hyperparameter settings with the original Precision and Recall implementation, and StudioGAN calculates the F-beta score suggested by Sajjadi et al.

Benchmark

※ We always welcome your contribution if you find any wrong implementation, bug, and misreported score.

We report the best IS, FID, and F_beta values of various GANs. B. S. means batch size for training.

CR, ICR, DiffAug, ADA, and LO refer to regularization or optimization techiniques: CR (Consistency Regularization), ICR (Improved Consistency Regularization), DiffAug (Differentiable Augmentation), ADA (Adaptive Discriminator Augmentation), and LO (Latent Optimization), respectively.

CIFAR10 (3x32x32)

When training, we used the command below.

With a single TITAN RTX GPU, training BigGAN takes about 13-15 hours.

CUDA_VISIBLE_DEVICES=0 python3 src/main.py -t -e -l -stat_otf -c CONFIG_PATH --eval_type "test"
Name Batch IS(⭡) FID(⭣) F_1/8(⭡) F_8(⭡) Config Log Weights
DCGAN 64 6.638 49.030 0.833 0.795 Config Log Link
LSGAN 64 5.577 66.686 0.757 0.720 Config Log Link
GGAN 64 6.227 42.714 0.916 0.822 Config Log Link
WGAN-WC 64 2.579 159.090 0.190 0.199 Config Log Link
WGAN-GP 64 7.458 25.852 0.962 0.929 Config Log Link
WGAN-DRA 64 6.432 41.586 0.922 0.863 Config Log Link
ACGAN-Mod 64 6.629 45.571 0.857 0.847 Config Log Link
ProjGAN 64 7.539 33.830 0.952 0.855 Config Log Link
SNGAN 64 8.677 13.248 0.983 0.978 Config Log Link
SAGAN 64 8.680 14.009 0.982 0.970 Config Log Link
BigGAN[4] 2048 9.22[8] 14.73 - - - - -
BigGAN + CR[5] 64 - 11.5 - - - - -
BigGAN + ICR[6] 64 - 9.2 - - - - -
BigGAN + DiffAug[7] 64 9.2[8] 8.7 - - - - -
BigGAN-Mod 64 9.746 8.034 0.995 0.994 Config Log Link
BigGAN-Mod + CR 64 10.380 7.178 0.994 0.993 Config Log Link
BigGAN-Mod + ICR 64 10.153 7.430 0.994 0.993 Config Log Link
BigGAN-Mod + DiffAug 64 9.775 7.157 0.996 0.993 Config Log Link
BigGAN-Mod + ADA 64 10.136 7.881 0.993 0.994 Config Log Link
BigGAN-Mod + LO 64 9.701 8.369 0.992 0.989 Config Log Link
ContraGAN 64 9.729 8.065 0.993 0.992 Config Log Link
ContraGAN + CR 64 9.812 7.685 0.995 0.993 Config Log Link
ContraGAN + ICR 64 10.117 7.547 0.996 0.993 Config Log Link
ContraGAN + DiffAug 64 9.996 7.193 0.995 0.990 Config Log Link
ContraGAN + ADA 64 9.411 10.830 0.990 0.964 Config Log Link

When evaluating, the statistics of batch normalization layers are calculated on the fly (statistics of a batch).

IS, FID, and F_beta values are computed using 10K test and 10K generated Images.

CUDA_VISIBLE_DEVICES=0 python3 src/main.py -e -l -stat_otf -c CONFIG_PATH --checkpoint_folder CHECKPOINT_FOLDER --eval_type "test"

Tiny ImageNet (3x64x64)

When training, we used the command below.

With 4 TITAN RTX GPUs, training BigGAN takes about 2 days.

CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -t -e -l -stat_otf -c CONFIG_PATH --eval_type "valid"
Name Batch IS(⭡) FID(⭣) F_1/8(⭡) F_8(⭡) Config Log Weights
DCGAN 256 5.640 91.625 0.606 0.391 Config Log Link
LSGAN 256 5.381 90.008 0.638 0.390 Config Log Link
GGAN 256 5.146 102.094 0.503 0.307 Config Log Link
WGAN-WC 256 9.696 41.454 0.940 0.735 Config Log Link
WGAN-GP 256 1.322 311.805 0.016 0.000 Config Log Link
WGAN-DRA 256 9.564 40.655 0.938 0.724 Config Log Link
ACGAN-Mod 256 6.342 78.513 0.668 0.518 Config Log Link
ProjGAN 256 6.224 89.175 0.626 0.428 Config Log Link
SNGAN 256 8.412 53.590 0.900 0.703 Config Log Link
SAGAN 256 8.342 51.414 0.898 0.698 Config Log Link
BigGAN-Mod 1024 11.998 31.920 0.956 0.879 Config Log Link
BigGAN-Mod + CR 1024 14.887 21.488 0.969 0.936 Config Log Link
BigGAN-Mod + ICR 1024 5.605 91.326 0.525 0.399 Config Log Link
BigGAN-Mod + DiffAug 1024 17.075 16.338 0.979 0.971 Config Log Link
BigGAN-Mod + ADA 1024 15.158 24.121 0.953 0.942 Config Log Link
BigGAN-Mod + LO 256 6.964 70.660 0.857 0.621 Config Log Link
ContraGAN 1024 13.494 27.027 0.975 0.902 Config Log Link
ContraGAN + CR 1024 15.623 19.716 0.983 0.941 Config Log Link
ContraGAN + ICR 1024 15.830 21.940 0.980 0.944 Config Log Link
ContraGAN + DiffAug 1024 17.303 15.755 0.984 0.962 Config Log Link
ContraGAN + ADA 1024 8.398 55.025 0.878 0.677 Config Log Link

When evaluating, the statistics of batch normalization layers are calculated on the fly (statistics of a batch).

IS, FID, and F_beta values are computed using 10K validation and 10K generated Images.

CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -e -l -stat_otf -c CONFIG_PATH --checkpoint_folder CHECKPOINT_FOLDER --eval_type "valid"

ImageNet (3x128x128)

When training, we used the command below.

With 8 TESLA V100 GPUs, training BigGAN2048 takes about a month.

CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -t -e -l -sync_bn -stat_otf -c CONFIG_PATH --eval_type "valid"
Name Batch IS(⭡) FID(⭣) F_1/8(⭡) F_8(⭡) Config Log Weights
SNGAN 256 32.247 26.792 0.938 0.913 Config Log Link
SAGAN 256 29.848 34.726 0.849 0.914 Config Log Link
BigGAN[4] 2048 98.8[8] 8.7 - - - - -
BigGAN-Mod 256 28.633 24.684 0.941 0.921 Config Log Link
BigGAN-Mod 2048 99.705 7.893 0.985 0.989 Config Log Link
ContraGAN 256 25.249 25.161 0.947 0.855 Config Log Link

When evaluating, the statistics of batch normalization layers are calculated in advance (moving average of the previous statistics).

IS, FID, and F_beta values are computed using 50K validation and 50K generated Images.

CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -e -l -sync_bn -c CONFIG_PATH --checkpoint_folder CHECKPOINT_FOLDER --eval_type "valid"

StudioGAN thanks the following Repos for the code sharing

Exponential Moving Average: https://github.com/ajbrock/BigGAN-PyTorch

Synchronized BatchNorm: https://github.com/vacancy/Synchronized-BatchNorm-PyTorch

Self-Attention module: https://github.com/voletiv/self-attention-GAN-pytorch

Implementation Details: https://github.com/ajbrock/BigGAN-PyTorch

Architecture Details: https://github.com/google/compare_gan

DiffAugment: https://github.com/mit-han-lab/data-efficient-gans

Adaptive Discriminator Augmentation: https://github.com/rosinality/stylegan2-pytorch

Tensorflow IS: https://github.com/openai/improved-gan

Tensorflow FID: https://github.com/bioinf-jku/TTUR

Pytorch FID: https://github.com/mseitzer/pytorch-fid

Tensorflow Precision and Recall: https://github.com/msmsajjadi/precision-recall-distributions

torchlars: https://github.com/kakaobrain/torchlars

Citation

StudioGAN is established for the following research project. Please cite our work if you use StudioGAN.

@inproceedings{kang2020ContraGAN,
  title   = {{ContraGAN: Contrastive Learning for Conditional Image Generation}},
  author  = {Minguk Kang and Jaesik Park},
  journal = {Conference on Neural Information Processing Systems (NeurIPS)},
  year    = {2020}
}

[1] Experiments on Tiny ImageNet are conducted using the ResNet architecture instead of CNN.

[2] Our re-implementation of ACGAN (ICML'17) with slight modifications, which bring strong performance enhancement for the experiment using CIFAR10.

[3] Our re-implementation of BigGAN/BigGAN-Deep (ICLR'18) with slight modifications, which bring strong performance enhancement for the experiment using CIFAR10.

[4] BigGAN/BigGAN-Deep (ICLR'18)

[5] BigGAN + CR (ICLR'20)

[6] BigGAN + ICR (arXiv'20)

[7] BigGAN + DiffAug (Neurips'20)

[8] IS is computed using Tensorflow official code.

comments powered by Disqus