graykode/nlp-tutorial
Natural Language Processing Tutorial for Deep Learning Researchers
Field | Value |
---|---|
repo name | graykode/nlp-tutorial |
repo link | https://github.com/graykode/nlp-tutorial |
homepage | https://www.reddit.com/r/MachineLearning/comments/amfinl/project_nlptutoral_repository_who_is_studying/ |
language | Jupyter Notebook |
size (curr.) | 315 kB |
stars (curr.) | 5340 |
created | 2019-01-09 |
license | MIT License |
nlp-tutorial
nlp-tutorial is a tutorial for those studying NLP (Natural Language Processing) with TensorFlow and PyTorch. Most of the models are implemented in fewer than 100 lines of code (excluding comments and blank lines).
Curriculum - (Example Purpose)
1. Basic Embedding Model
   - 1-1. NNLM (Neural Network Language Model) - Predict Next Word
     - Paper - A Neural Probabilistic Language Model (2003)
     - Colab - NNLM_Tensor.ipynb, NNLM_Torch.ipynb
   - 1-2. Word2Vec (Skip-gram) - Embedding Words and Show Graph
   - 1-3. FastText (Application Level) - Sentence Classification
     - Paper - Bag of Tricks for Efficient Text Classification (2016)
     - Colab - FastText.ipynb
2. CNN (Convolutional Neural Network)
   - 2-1. TextCNN - Binary Sentiment Classification
   - 2-2. DCNN (Dynamic Convolutional Neural Network)
3. RNN (Recurrent Neural Network)
   - 3-1. TextRNN - Predict Next Step
     - Paper - Finding Structure in Time (1990)
     - Colab - TextRNN_Tensor.ipynb, TextRNN_Torch.ipynb
   - 3-2. TextLSTM - Autocomplete
     - Paper - Long Short-Term Memory (1997)
     - Colab - TextLSTM_Tensor.ipynb, TextLSTM_Torch.ipynb
   - 3-3. Bi-LSTM - Predict Next Word in Long Sentence
     - Colab - Bi_LSTM_Tensor.ipynb, Bi_LSTM_Torch.ipynb
4. Attention Mechanism
   - 4-1. Seq2Seq - Change Word
   - 4-2. Seq2Seq with Attention - Translate
   - 4-3. Bi-LSTM with Attention - Binary Sentiment Classification
5. Model based on Transformer
   - 5-1. The Transformer - Translate
   - 5-2. BERT - Next Sentence Classification & Masked Token Prediction
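To give a feel for the Word2Vec (Skip-gram) entry in the curriculum, here is a minimal sketch of how (center, context) training pairs can be generated from a tokenized sentence. It is plain Python and is not taken from the notebooks; the function name `skipgram_pairs`, the `window` parameter, and the toy sentence are assumptions made for illustration.

```python
def skipgram_pairs(tokens, window=1):
    """Return (center, context) pairs for tokens within `window` positions.

    Hypothetical helper for illustration; not the repository's code.
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)                # clip the window at the start
        hi = min(len(tokens), i + window + 1)  # clip the window at the end
        for j in range(lo, hi):
            if j != i:                         # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


sentence = "i like dog".split()
pairs = skipgram_pairs(sentence, window=1)
print(pairs)
# [('i', 'like'), ('like', 'i'), ('like', 'dog'), ('dog', 'like')]
```

In a skip-gram model, each pair would then become one training example: the center word is the input and the context word is the prediction target.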
Model | Example | Framework | Lines(torch/tensor) |
---|---|---|---|
NNLM | Predict Next Word | Torch, Tensor | 67/83 |
Word2Vec(Softmax) | Embedding Words and Show Graph | Torch, Tensor | 77/94 |
TextCNN | Sentence Classification | Torch, Tensor | 94/99 |
TextRNN | Predict Next Step | Torch, Tensor | 70/88 |
TextLSTM | Autocomplete | Torch, Tensor | 73/78 |
Bi-LSTM | Predict Next Word in Long Sentence | Torch, Tensor | 73/78 |
Seq2Seq | Change Word | Torch, Tensor | 93/111 |
Seq2Seq with Attention | Translate | Torch, Tensor | 108/118 |
Bi-LSTM with Attention | Binary Sentiment Classification | Torch, Tensor | 92/104 |
Transformer | Translate | Torch | 222/0 |
Greedy Decoder Transformer | Translate | Torch | 246/0 |
BERT | How to Train | Torch | 242/0 |
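The attention-based entries in the table (Seq2Seq with Attention, Bi-LSTM with Attention, Transformer) share one core step: score each encoder state against a query, normalize the scores with a softmax, and take the weighted sum. The sketch below shows that step in plain Python, assuming dot-product scoring. The names `softmax` and `dot_attention` and the toy vectors are hypothetical, and the notebooks themselves use framework tensors rather than lists.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_attention(query, keys, values):
    """Dot-product attention for a single query (illustrative only).

    Returns the attention weights and the weighted sum of `values`.
    """
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, context


query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
weights, context = dot_attention(query, keys, values)
print(weights)  # the first key matches the query, so it gets the larger weight
print(context)
```

The Transformer additionally scales the scores by the square root of the key dimension and runs several such heads in parallel; this sketch leaves both out for brevity.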
Dependencies
- Python 3.5+
- TensorFlow 1.12.0+
- PyTorch 0.4.1+
- A Keras version is planned
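One way to confirm that an environment meets these minimums is a small check script. The sketch below is an assumption-laden example, not part of the repository: it assumes the packages are installed under the distribution names `tensorflow` and `torch` (some installs use other names, such as `tensorflow-gpu`), and it needs Python 3.8+ for `importlib.metadata`. The helper names `parse_version` and `check_dependencies` are hypothetical.

```python
import sys
from importlib import metadata

# Minimum versions listed above; distribution names are assumptions.
MINIMUMS = {"tensorflow": (1, 12, 0), "torch": (0, 4, 1)}


def parse_version(text):
    """Turn a string like '2.1.0+cu118' into a (major, minor, patch) tuple."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'rc1' or '+cu118'
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def check_dependencies():
    """Return a dict mapping each requirement to True or False."""
    report = {"python": sys.version_info[:2] >= (3, 5)}
    for package, minimum in MINIMUMS.items():
        try:
            installed = parse_version(metadata.version(package))
        except metadata.PackageNotFoundError:
            report[package] = False  # not installed under this name
            continue
        report[package] = installed >= minimum
    return report


print(check_dependencies())
```

A `False` entry means the package is either missing or older than the listed minimum. Because the notebooks target TensorFlow 1.x, a much newer TensorFlow may pass this check and still need code changes.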
Author
- Tae Hwan Jung(Jeff Jung) @graykode
- Author Email: nlkey2022@gmail.com
- Acknowledgements to mojitok for the NLP Research Internship.