April 15, 2021




The Compositional Perturbation Autoencoder (CPA) is a deep generative framework for learning the effects of perturbations at the single-cell level. CPA makes out-of-distribution (OOD) predictions for unseen combinations of drugs, learns interpretable embeddings, estimates dose-response curves, and provides uncertainty estimates.

repo name facebookresearch/CPA
repo link https://github.com/facebookresearch/CPA
language Jupyter Notebook
size (curr.) 46733 kB
stars (curr.) 54
created 2021-04-08
license MIT License

CPA - Compositional Perturbation Autoencoder

CPA is a collaborative research project from Facebook AI Research (FAIR) and the computational biology group of Prof. Fabian Theis (https://github.com/theislab) from Helmholtz Zentrum München.

What is CPA?


CPA is a deep generative framework for learning the effects of perturbations at the single-cell level. CPA encodes and learns phenotypic drug response across different cell types, doses and drug combinations. CPA allows you to:

  • Make out-of-distribution predictions for unseen drug combinations at various doses and across different cell types.
  • Learn interpretable drug and cell-type latent spaces.
  • Estimate the dose-response curve for each perturbation and for combinations of perturbations.
  • Assess the uncertainty of the model's estimates.
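
To make the dose-response idea in the list above concrete, here is a minimal sketch of a generic Hill-type (sigmoid in log-dose) curve. It is an illustration only, not CPA's actual dose encoder: the function name hill_response and the parameters slope and midpoint are hypothetical, and the README does not describe the functional form the model uses.

```python
def hill_response(dose, slope=1.0, midpoint=1.0):
    """Generic Hill curve: response rises from 0 toward 1 as dose increases.

    `midpoint` is the dose that yields a response of 0.5 and `slope`
    controls steepness. Both are illustrative parameters (assumptions),
    not values or names taken from the compert package.
    """
    if dose < 0:
        raise ValueError("dose must be non-negative")
    scaled = dose ** slope
    return scaled / (midpoint ** slope + scaled)


# Evaluate the curve on a few doses to see the saturating shape.
for dose in (0.0, 0.1, 1.0, 10.0):
    print(f"dose={dose:>5}: response={hill_response(dose):.3f}")
```

In a learned model, parameters such as the slope and midpoint would be fitted per perturbation instead of being fixed by hand as they are here.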

Package Structure

The repository is centered around the compert module:

Additional files and folders:

  • datasets contains both versions of the data: raw and pre-processed.
  • preprocessing contains notebooks to reproduce the dataset pre-processing from raw data.
  • notebooks contains notebooks to reproduce the plots from the paper and a detailed analysis of each dataset.
  • pretrained_models contains the best models selected after the sweeps. These models were used for the analysis and figures in the paper.
  • scripts contains bash files for running the model automatically.


As a first step, download the contents of datasets/ and pretrained_models/ from this tarball.

To learn how to use this repository, check ./notebooks/demo.ipynb, and the following scripts:

Examples and Reproducibility

All the examples and the reproducibility notebooks for the plots in the paper can be found in the notebooks/ folder.

Training a model

There are two ways to train a compert model:

  • Using the command line, e.g.: python -m compert.train --dataset_path datasets/GSM_new.h5ad --save_dir /tmp --max_epochs 1 --doser_type sigm
  • From a Jupyter notebook: see the example in ./notebooks/demo.ipynb
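
The command-line route can also be driven from Python. The sketch below only assembles the same command shown in the first bullet, using the four flags that appear there; the helper name build_train_command is hypothetical, and no other flags of compert.train are assumed. Launching the command requires the compert package and the dataset to be available, so that step is left commented out.

```python
import shlex
import sys


def build_train_command(dataset_path, save_dir, max_epochs=1, doser_type="sigm"):
    """Return the argument list for `python -m compert.train`.

    Only the flags from the README example are used; the helper itself
    is an illustrative assumption, not part of the compert package.
    """
    return [
        sys.executable, "-m", "compert.train",
        "--dataset_path", str(dataset_path),
        "--save_dir", str(save_dir),
        "--max_epochs", str(max_epochs),
        "--doser_type", doser_type,
    ]


cmd = build_train_command("datasets/GSM_new.h5ad", "/tmp")
print(shlex.join(cmd))  # shows the exact command that would be run

# To actually start training (needs compert installed and the dataset present):
# import subprocess
# subprocess.run(cmd, check=True)
```

Building the command as a list, instead of a single string, avoids shell-quoting problems when paths contain spaces.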


Run bash ./scripts/run_one_epoch.sh to perform automatic testing for one epoch on all the datasets used in the study.




Currently, you can access the documentation via the help function in IPython. For example:

from compert.api import ComPertAPI
help(ComPertAPI)

from compert.plotting import CompertVisuals
help(CompertVisuals)


A separate page with the documentation is coming soon.
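
Until that page exists, the docstrings can also be read outside an interactive session. The following is a minimal sketch using only the standard library; the helper name get_docstring is hypothetical, and it returns None instead of failing when compert is not installed in the current environment.

```python
import importlib
import inspect


def get_docstring(module_name, attr_name):
    """Return the cleaned docstring of module_name.attr_name, or None.

    None is returned when the module cannot be imported or the
    attribute does not exist (for example, if compert is not installed).
    """
    try:
        module = importlib.import_module(module_name)
        obj = getattr(module, attr_name)
    except (ImportError, AttributeError):
        return None
    return inspect.getdoc(obj)


# The two classes named in the README; both lookups degrade gracefully.
for module_name, attr_name in [
    ("compert.api", "ComPertAPI"),
    ("compert.plotting", "CompertVisuals"),
]:
    doc = get_docstring(module_name, attr_name)
    print(f"{module_name}.{attr_name}:")
    print(doc if doc else "  (not available in this environment)")
```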

Support and contribute

If you have a question or notice a problem, you can post an issue.





This source code is released under the MIT license, included here.
