December 12, 2019




Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

repo name pytorch/fairseq
repo link
language Python
size (curr.) 6663 kB
stars (curr.) 7113
created 2017-08-29
license MIT License

Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling and other text generation tasks.

What’s New:


Fairseq provides reference implementations of various sequence-to-sequence models.

Additional features:

  • multi-GPU (distributed) training on one machine or across multiple machines
  • fast generation on both CPU and GPU with multiple search algorithms implemented
  • large mini-batch training even on a single GPU via delayed updates
  • mixed precision training (trains faster with less GPU memory on NVIDIA tensor cores)
  • extensible: easily register new models, criterions, tasks, optimizers and learning rate schedulers
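
The "delayed updates" item in the list above refers to accumulating gradients over several small batches before applying a single optimizer step, which simulates a larger mini-batch on limited hardware. The following is a minimal pure-Python sketch of that idea on a toy one-parameter model. It is not fairseq's implementation, and the function names and data are assumptions made for illustration only.

```python
# Toy illustration of delayed updates (gradient accumulation).
# Model: y_hat = w * x, loss: mean squared error over the whole batch.
# All names here are hypothetical and unrelated to fairseq's internals.

def grad_sum(w, batch):
    """Sum (not mean) of d/dw (w*x - y)^2 over the examples in `batch`."""
    return sum(2.0 * (w * x - y) * x for x, y in batch)

def full_batch_step(w, data, lr):
    """One gradient-descent step that processes the whole batch at once."""
    return w - lr * grad_sum(w, data) / len(data)

def delayed_update_step(w, data, lr, micro_batch_size):
    """Same step, but gradients are accumulated over small micro-batches
    and the weight is updated only once, after the last micro-batch."""
    accumulated = 0.0
    for start in range(0, len(data), micro_batch_size):
        micro_batch = data[start:start + micro_batch_size]
        accumulated += grad_sum(w, micro_batch)  # no weight update yet
    return w - lr * accumulated / len(data)

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w_full = full_batch_step(0.0, data, lr=0.01)
w_delayed = delayed_update_step(0.0, data, lr=0.01, micro_batch_size=2)
print(w_full, w_delayed)
```

Both calls produce the same updated weight, because summing per-example gradients across micro-batches gives the same total as processing the full batch in one pass. Only the peak memory needed per forward/backward pass differs.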

We also provide pre-trained models for translation and language modeling with a convenient torch.hub interface:

import torch

# Download the pre-trained WMT'19 English-German transformer via torch.hub
en2de = torch.hub.load('pytorch/fairseq', 'transformer.wmt19.en-de.single_model')
en2de.translate('Hello world', beam=5)
# 'Hallo Welt'

See the PyTorch Hub tutorials for translation and RoBERTa for more examples.
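
The beam=5 argument in the example above sets the beam width used during generation. As a rough illustration of why a wider beam can matter, here is a toy beam search over a hand-written probability table. The table, the function name and its signature are all assumptions for this sketch, and it does not reflect how fairseq implements search internally.

```python
import math

# Hypothetical toy "model": next-token probabilities given the prefix so far.
TOY_MODEL = {
    (): {"A": 0.6, "B": 0.4},
    ("A",): {"x": 0.5, "y": 0.5},
    ("B",): {"x": 0.9, "y": 0.1},
}

def beam_search(model, steps, beam_size):
    """Return up to `beam_size` hypotheses as (tokens, log_prob), best first."""
    beams = [((), 0.0)]  # start with the empty prefix and log-prob 0
    for _ in range(steps):
        candidates = []
        for prefix, score in beams:
            # Extend every surviving hypothesis by every possible next token.
            for token, prob in model[prefix].items():
                candidates.append((prefix + (token,), score + math.log(prob)))
        # Keep only the `beam_size` highest-scoring extensions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    return beams

greedy = beam_search(TOY_MODEL, steps=2, beam_size=1)[0][0]
best = beam_search(TOY_MODEL, steps=2, beam_size=2)[0][0]
print(greedy, best)
```

With beam_size=1 the search commits to "A" after the first step and cannot recover, while beam_size=2 keeps "B" alive long enough to find the more probable sequence ("B", "x").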


Requirements and Installation

  • PyTorch version >= 1.2.0
  • Python version >= 3.6
  • For training new models, you’ll also need an NVIDIA GPU and NCCL
  • For faster training, install NVIDIA’s apex library with the --cuda_ext and --deprecated_fused_adam options

To install fairseq:

pip install fairseq

On macOS:

CFLAGS="-stdlib=libc++" pip install fairseq

If you use Docker, make sure to increase the shared memory size by passing either --ipc=host or --shm-size as a command-line option to nvidia-docker run.

Installing from source

To install fairseq from source and develop locally:

git clone
cd fairseq
pip install --editable .

Getting Started

The full documentation contains instructions for getting started, training new models and extending fairseq with new model types and tasks.
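
Extending a toolkit with new model types usually relies on a registry: a decorator records each class under a string name so that it can later be selected by that name, for example from a command-line argument. The standalone sketch below shows the general pattern only. REGISTRY, register and build are hypothetical names chosen for illustration, so consult the full documentation for fairseq's actual extension API.

```python
# Hypothetical registry pattern; these names are illustrative only
# and are not fairseq's actual API.
REGISTRY = {}

def register(name):
    """Class decorator that records `cls` under `name` in REGISTRY."""
    def decorator(cls):
        if name in REGISTRY:
            raise ValueError(f"Cannot register duplicate component: {name}")
        REGISTRY[name] = cls
        return cls  # return the class unchanged so it stays usable directly
    return decorator

@register("toy_lstm")
class ToyLSTM:
    def __init__(self, hidden_size=16):
        self.hidden_size = hidden_size

def build(name, **kwargs):
    """Look up a registered class by name and instantiate it."""
    return REGISTRY[name](**kwargs)

model = build("toy_lstm", hidden_size=32)
print(type(model).__name__, model.hidden_size)
```

Because registration happens at import time, adding a new component in this style only requires defining the class and decorating it. No central list needs to be edited.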

Pre-trained models and examples

We provide pre-trained models and pre-processed, binarized test sets for several tasks listed below, as well as example training and evaluation commands.

  • Translation: convolutional and transformer models are available
  • Language Modeling: convolutional and transformer models are available
  • wav2vec: wav2vec large model is available

We also have more detailed READMEs to reproduce results from specific papers:

Join the fairseq community


fairseq(-py) is MIT-licensed. The license applies to the pre-trained models as well.


Please cite as:

@inproceedings{ott2019fairseq,
  title = {fairseq: A Fast, Extensible Toolkit for Sequence Modeling},
  author = {Myle Ott and Sergey Edunov and Alexei Baevski and Angela Fan and Sam Gross and Nathan Ng and David Grangier and Michael Auli},
  booktitle = {Proceedings of NAACL-HLT 2019: Demonstrations},
  year = {2019},
}