April 10, 2020




Public runnable examples of using John Snow Labs' NLP for Apache Spark.

repo name JohnSnowLabs/spark-nlp-workshop
repo link https://github.com/JohnSnowLabs/spark-nlp-workshop
language Jupyter Notebook
size (curr.) 280371 kB
stars (curr.) 155
created 2018-08-20
license Apache License 2.0

Spark NLP Workshop


Notebooks and code samples showing how to use Spark NLP in Python and Scala.


Python Setup

python3.7 -m virtualenv spark-nlp-2-4-4

Docker setup

If you want to experience Spark NLP and run Jupyter examples without installing anything, you can simply use our Docker image:

1- Get the docker image for spark-nlp-workshop:

docker pull johnsnowlabs/spark-nlp-workshop

2- Run the image locally with port binding (8888 for Jupyter, 4040 for the Spark web UI).

docker run -it --rm -p 8888:8888 -p 4040:4040 johnsnowlabs/spark-nlp-workshop

3- Open Jupyter notebooks inside your browser by using the token printed on the console.

  • The password to Jupyter notebook is sparknlp
  • The size of the image grows every time you download a pretrained model or a pretrained pipeline. You can clean up ~/cache_pretrained if you don’t need them.
  • This docker image is only meant for testing/learning purposes and should not be used in production environments. Please install Spark NLP natively.
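The cache cleanup mentioned in the notes above could be sketched as a small shell function. This is only an illustration: the function name clean_pretrained_cache and the CACHE_DIR variable are hypothetical, and the default path is the ~/cache_pretrained directory named above, wherever that directory lives in your setup.

```shell
# Hypothetical helper (not part of Spark NLP): report the size of the
# pretrained cache, then empty it while keeping the directory itself.
# The path can be passed as an argument or via CACHE_DIR; it defaults
# to the ~/cache_pretrained location mentioned in the notes above.
clean_pretrained_cache() {
    cache_dir="${1:-${CACHE_DIR:-$HOME/cache_pretrained}}"
    if [ ! -d "$cache_dir" ]; then
        echo "No cache directory at $cache_dir"
        return 0
    fi
    # Show how much space the cached models and pipelines occupy.
    du -sh "$cache_dir"
    # Delete everything inside the directory, including hidden files.
    find "$cache_dir" -mindepth 1 -delete
    echo "Emptied $cache_dir"
}
```

Calling clean_pretrained_cache with no arguments targets the default path. Any model or pipeline removed this way would have to be downloaded again the next time it is requested.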

Main repository


Project’s website

For user documentation and examples, see the official Spark NLP page: http://nlp.johnsnowlabs.com/

Slack community channel

Join Slack


If you find any example that is no longer working, please create an issue.


Apache License 2.0
