snorkel-team/snorkel
A system for quickly generating training data with weak supervision
repo name | snorkel-team/snorkel |
repo link | https://github.com/snorkel-team/snorkel |
homepage | https://snorkel.org |
language | Python |
size (curr.) | 292689 kB |
stars (curr.) | 3681 |
created | 2016-02-26 |
license | Apache License 2.0 |
Programmatically Build and Manage Training Data
Quick Links
- Snorkel website
- Snorkel tutorials
- Snorkel documentation
- Snorkel community forum
- Snorkel mailing list
- Snorkel Twitter
Getting Started
The quickest way to familiarize yourself with the Snorkel library is to walk through the Get Started page on the Snorkel website, followed by the full-length tutorials in the Snorkel tutorials repository. These tutorials demonstrate a variety of tasks, domains, labeling techniques, and integrations that can serve as templates as you apply Snorkel to your own applications.
Installation
Snorkel requires Python 3.6 or later. To install Snorkel, we recommend using pip:
pip install snorkel
or conda:
conda install snorkel -c conda-forge
For information on installing from source and contributing to Snorkel, see our contributing guidelines.
The following example commands give some more color on installing with conda. These commands assume that your conda installation is Python 3.6, and that you want to use a virtual environment called snorkel-env.
# [OPTIONAL] Create and activate a virtual environment called "snorkel-env"
conda create --yes -n snorkel-env python=3.6
conda activate snorkel-env
# We specify PyTorch here to ensure compatibility, but it may not be necessary.
conda install pytorch==1.1.0 -c pytorch
conda install snorkel==0.9.0 -c conda-forge
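After installing, a quick sanity check can confirm that the interpreter meets the Python 3.6 requirement stated above and that the snorkel package is visible to it. The following is a minimal sketch using only the Python standard library; the helper names python_is_supported and package_is_installed are hypothetical and are not part of Snorkel.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 6)  # minimum version stated in this README


def python_is_supported(version_info=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


def package_is_installed(name="snorkel"):
    """Return True if a top-level package can be located, without importing it."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print("Python version supported:", python_is_supported())
    print("snorkel found:", package_is_installed("snorkel"))
```

If the second check reports False inside a conda environment, the environment was probably not activated before running the install commands.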
If you’re using Windows, we highly recommend using Docker (you can find an example in our tutorials repo) or the Linux subsystem. We’ve done limited testing on Windows, so if you want to contribute instructions or improvements, feel free to open a PR!
Discussion
Issues
We use GitHub Issues for posting bugs and feature requests — anything code-related. Just make sure you search for related issues first and use our Issues templates. We may ask for contributions if a prompt fix doesn’t fit into the immediate roadmap of the core development team.
Contributions
We welcome contributions from the Snorkel community! This is likely the fastest way to get a change you’d like to see into the library.
- Small contributions can be made directly in a pull request (PR).
- If you would like to contribute a larger feature, we recommend first creating an issue with a proposed design for discussion.
- For ideas about what to work on, we’ve labeled specific issues as help wanted.
To set up a development environment for contributing back to Snorkel, see our contributing guidelines. All PRs must pass the continuous integration tests and receive approval from a member of the Snorkel development team before they will be merged.
Community Forum
For broader Q&A, discussions about using Snorkel, tutorial requests, etc., use the Snorkel community forum hosted on Spectrum. We hope this will be a venue for you to interact with other Snorkel users — please don’t be shy about posting!
Announcements
To stay up-to-date on Snorkel-related announcements (e.g. version releases, upcoming workshops), subscribe to the Snorkel mailing list. We promise to respect your inboxes — communication will be sparse!
Follow us on Twitter @SnorkelML.