explosion/projects
End-to-end NLP workflows from prototype to production
repo name | explosion/projects |
repo link | https://github.com/explosion/projects |
homepage | https://nightly.spacy.io/usage/projects |
language | Jupyter Notebook |
size (curr.) | 9884 kB |
stars (curr.) | 358 |
created | 2019-11-21 |
license | MIT License |
🪐 Project Templates
spaCy projects let you manage and share end-to-end spaCy workflows for different use cases and domains, and orchestrate training, packaging and serving your custom pipelines. You can start off by cloning a pre-defined project template, adjust it to fit your needs, load in your data, train a pipeline, export it as a Python package, upload your outputs to a remote storage and share your results with your team.
⚠️ spaCy project templates require the new spaCy v3.0, which is currently available as a nightly pre-release. You can install it from pip as `spacy-nightly`: `pip install spacy-nightly`. Make sure to use a fresh virtual environment. See the `master` branch for the previous version of this repo.
🗃 Categories
Name | Description |
---|---|
pipelines | Templates for training NLP pipelines with different components on different corpora. |
tutorials | Templates that work through a specific NLP use case end-to-end. |
integrations | Templates showing integrations with third-party libraries and tools for managing your data and experiments, iterating on demos and prototypes and shipping your models into production. |
benchmarks | Templates to reproduce our benchmarks and produce quantifiable results that are easy to compare against other systems or versions of spaCy. |
experimental | Experimental workflows and other cutting-edge stuff to use at your own risk. |
🚀 Quickstart
Projects can be used via the new `spacy project` CLI. To find out more about a command, add `--help`. For detailed instructions, see the usage guide.
- Clone the project template you want to use.
python -m spacy project clone tutorials/ner_fashion_brands
- Fetch assets (data, weights) defined in the `project.yml`.
cd ner_fashion_brands
python -m spacy project assets
- Run a command defined in the `project.yml`.
python -m spacy project run preprocess
- Run a workflow of multiple steps in order.
python -m spacy project run all
- Adjust the template for your specific use case, load in your own data, adjust the settings and model and share the result with your team.
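The quickstart steps above can also be scripted. The following is a minimal sketch, not part of this repo: it only assembles the same `spacy project` commands shown in the list and prints them unless told to execute. The helper names `quickstart_steps` and `run_steps` are hypothetical, and the command names `preprocess` and `all` are the ones used by the `ner_fashion_brands` example, so other templates may define different ones in their `project.yml`.

```python
import subprocess
import sys
from pathlib import Path
from typing import List, Optional, Tuple

# A step is (working directory, or None for the current one; argv list).
Step = Tuple[Optional[Path], List[str]]


def quickstart_steps(template: str = "tutorials/ner_fashion_brands") -> List[Step]:
    """Build the quickstart commands for a template path like 'category/name'."""
    # Assumption: the clone lands in a directory named after the last path
    # part, as the `cd ner_fashion_brands` step in the quickstart suggests.
    project_dir = Path(template.rstrip("/").split("/")[-1])
    spacy_project = [sys.executable, "-m", "spacy", "project"]
    return [
        (None, spacy_project + ["clone", template]),
        (project_dir, spacy_project + ["assets"]),
        # `preprocess` and `all` are taken from the example template.
        (project_dir, spacy_project + ["run", "preprocess"]),
        (project_dir, spacy_project + ["run", "all"]),
    ]


def run_steps(steps: List[Step], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cwd, argv in steps:
        print(f"[{cwd or '.'}] {' '.join(argv)}")
        if not dry_run:
            # check=True stops at the first failing step.
            subprocess.run(argv, cwd=cwd, check=True)
```

Calling `run_steps(quickstart_steps())` prints the four commands without running anything; passing `dry_run=False` would execute them in order, which requires `spacy-nightly` to be installed in the active environment.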
👷‍♀️ Repository maintenance
To keep the project templates and their documentation up to date, this repo contains several scripts:
Script | Description |
---|---|
`update_docs.py` | Update all auto-generated docs in the given root. Calls into `spacy project document` and only replaces the auto-generated sections, not any custom content before or after. |
`update_category_docs.py` | Update the auto-generated `README.md` in the category directories listing the available project templates. |
`update_configs.py` | Update and auto-fill all `config.cfg` files included in the repo, similar to `spacy init fill-config`. Can be used to keep the configs up to date with changes in spaCy. |
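To illustrate what "only replaces the auto-generated sections" can mean in practice, here is a rough sketch of marker-based replacement. It is not the code in `update_docs.py` or in spaCy: the marker strings `START` and `END` and the function `replace_generated` are hypothetical, and the markers actually written by `spacy project document` may differ.

```python
# Hypothetical marker strings, chosen for illustration only; the markers
# actually written by `spacy project document` may differ.
START = "<!-- AUTO-GENERATED DOCS START -->"
END = "<!-- AUTO-GENERATED DOCS END -->"


def replace_generated(existing: str, generated: str) -> str:
    """Swap the marked section of `existing` for `generated`, keeping the rest."""
    block = f"{START}\n{generated.strip()}\n{END}"
    start = existing.find(START)
    end = existing.find(END, start + len(START)) if start != -1 else -1
    if start == -1 or end == -1:
        # No complete marked section yet: append one after the existing content.
        if existing.strip():
            return f"{existing.rstrip()}\n\n{block}\n"
        return f"{block}\n"
    # Keep everything before the start marker and after the end marker.
    return existing[:start] + block + existing[end + len(END):]
```

Because custom text outside the markers is copied through unchanged, running the replacement twice with the same generated docs yields the same file, which makes a script of this kind safe to re-run.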