June 21, 2020

trekhleb/machine-learning-experiments

Interactive Machine Learning experiments: models training + models demo

repo name: trekhleb/machine-learning-experiments
repo link: https://github.com/trekhleb/machine-learning-experiments
homepage: https://trekhleb.github.io/machine-learning-experiments/
language: Jupyter Notebook
size (curr.): 798090 kB
stars (curr.): 498
created: 2019-11-14
license: MIT License

🤖 Interactive Machine Learning Experiments

This is a collection of interactive machine-learning experiments. Each experiment consists of a 🏋️ Jupyter/Colab notebook (to see how a model was trained) and a 🎨 demo page (to see a model in action right in your browser).

⚠️ This repository contains machine learning experiments, not production-ready, reusable, optimised and fine-tuned code and models. It is rather a sandbox or playground for learning and trying different machine learning approaches, algorithms and data-sets. Models might not perform well, and they may overfit or underfit.

Experiments

Most of the models in these experiments were trained using TensorFlow 2 with Keras support.

Supervised Machine Learning

Supervised learning is when you have input variables X and an output variable Y and you use an algorithm to learn the mapping function from the input to the output: Y = f(X). The goal is to approximate the mapping function so well that, given new input data X, you can predict the output variable Y for that data. It is called supervised learning because the process of an algorithm learning from the training dataset can be thought of as a teacher supervising the learning process.
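
As a minimal sketch of learning Y = f(X) from examples (this is not taken from the repository's notebooks; the data and the `f_hat` name are made up for illustration), a least-squares line fit with NumPy can recover a simple mapping and then predict the output for an unseen input:

```python
import numpy as np

# Hypothetical training data: the "teacher" provides the correct Y for each X.
# Here the true mapping is Y = 2 * X + 1.
X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
Y = 2.0 * X + 1.0

# Learn an approximation of f by fitting a straight line with least squares.
# np.polyfit returns coefficients from the highest degree down.
slope, intercept = np.polyfit(X, Y, deg=1)


def f_hat(x):
    """Learned approximation of the mapping Y = f(X)."""
    return slope * x + intercept


# Predict the output for an input that was not part of the training data.
prediction = f_hat(10.0)
print(prediction)
```

The experiments in this repository use neural networks rather than a line fit, but the setup is the same: known (X, Y) pairs go in, and an approximate f comes out.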

Multilayer Perceptron (MLP) or simple Neural Network (NN)

A multilayer perceptron (MLP) is a class of feedforward artificial neural network (ANN). Multilayer perceptrons are sometimes referred to as “vanilla” neural networks (composed of multiple layers of perceptrons), especially when they have a single hidden layer. An MLP can distinguish data that is not linearly separable.
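
To illustrate the point about non-linearly separable data, here is a small sketch of a one-hidden-layer MLP that computes XOR, a classic function that no single linear boundary can separate. The weights are hand-picked for illustration (an assumption, not trained values from the repository), and the `mlp_xor` name is hypothetical:

```python
import numpy as np


def relu(z):
    """Rectified linear unit, applied element-wise."""
    return np.maximum(z, 0.0)


# Hand-picked (not trained) weights for a 2-2-1 network.
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])   # shape: (inputs, hidden units)
b1 = np.array([0.0, -1.0])
W2 = np.array([1.0, -2.0])    # hidden units -> single output
b2 = 0.0


def mlp_xor(x):
    """Forward pass: a hidden layer with ReLU followed by a linear output."""
    hidden = relu(x @ W1 + b1)
    return hidden @ W2 + b2


inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
outputs = mlp_xor(inputs)
print(outputs)  # XOR of each input pair: 0, 1, 1, 0
```

Without the hidden layer and its non-linearity, the model would reduce to a single linear function, which cannot produce this output pattern.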

Convolutional Neural Networks (CNN)

A convolutional neural network (CNN, or ConvNet) is a class of deep neural networks, most commonly applied to analyzing visual imagery (photos, videos). CNNs are used for detecting and classifying objects in photos and videos, style transfer, face recognition, pose estimation, etc.
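
The core operation of a CNN layer can be sketched in plain NumPy. The example below slides a small kernel over an image (strictly speaking a cross-correlation, which is what deep learning libraries usually call "convolution"). The image, the kernel values, and the `conv2d_valid` helper are illustrative assumptions; in a real CNN the kernel values are learned during training:

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(patch * kernel)
    return out


# A 4x4 "image": dark on the left half, bright on the right half.
image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)

# A hand-picked kernel that responds to dark-to-bright vertical edges.
kernel = np.array([[-1, 1],
                   [-1, 1]], dtype=float)

feature_map = conv2d_valid(image, kernel)
print(feature_map)
```

The resulting feature map is non-zero only in the middle column, where the edge sits, which is the kind of local pattern detection that stacked convolutional layers build on.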

Recurrent Neural Networks (RNN)

A recurrent neural network (RNN) is a class of deep neural networks, most commonly applied to sequence-based data like speech, voice, text or music. RNNs are used for machine translation, speech recognition, voice synthesis, etc.
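
What makes a network "recurrent" is that the hidden state from one step feeds into the next. A minimal sketch of a simple (Elman-style) RNN forward pass follows; the sizes, the random untrained weights, and the `rnn_forward` name are assumptions made for illustration, not code from the experiments:

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size = 2, 3

# Randomly initialised (untrained) parameters, for illustration only.
W_x = rng.normal(scale=0.5, size=(input_size, hidden_size))
W_h = rng.normal(scale=0.5, size=(hidden_size, hidden_size))
b = np.zeros(hidden_size)


def rnn_forward(sequence):
    """Process a sequence step by step, carrying the hidden state forward."""
    h = np.zeros(hidden_size)
    for x_t in sequence:
        # The new state depends on the current input AND the previous state.
        h = np.tanh(x_t @ W_x + h @ W_h + b)
    return h


# A hypothetical sequence of three 2-dimensional inputs.
sequence = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
final_state = rnn_forward(sequence)
print(final_state.shape)
```

The final hidden state summarises the whole sequence and would typically be passed to an output layer. Practical models usually rely on gated variants such as LSTM or GRU.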

Unsupervised Machine Learning

Unsupervised learning is when you only have input data X and no corresponding output variables. The goal of unsupervised learning is to model the underlying structure or distribution in the data in order to learn more about the data. It is called unsupervised learning because, unlike supervised learning above, there are no correct answers and there is no teacher. Algorithms are left to their own devices to discover and present the interesting structure in the data.

Generative Adversarial Networks (GANs)

A generative adversarial network (GAN) is a class of machine learning frameworks where two neural networks contest with each other in a game. Two models are trained simultaneously by an adversarial process. For example, a generator (“the artist”) learns to create images that look real, while a discriminator (“the art critic”) learns to tell real images apart from fakes.
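
The adversarial game can be made concrete through the two loss functions. The sketch below assumes the common binary cross-entropy formulation (the repository's notebooks may differ) and uses made-up discriminator outputs; the function names are hypothetical:

```python
import numpy as np


def bce(probs, targets):
    """Binary cross-entropy between predicted probabilities and 0/1 targets."""
    probs = np.clip(probs, 1e-7, 1 - 1e-7)  # avoid log(0)
    losses = -(targets * np.log(probs) + (1 - targets) * np.log(1 - probs))
    return float(np.mean(losses))


def discriminator_loss(real_probs, fake_probs):
    """The critic wants to output 1 for real images and 0 for fakes."""
    return (bce(real_probs, np.ones_like(real_probs))
            + bce(fake_probs, np.zeros_like(fake_probs)))


def generator_loss(fake_probs):
    """The artist wants the critic to output 1 for its fakes."""
    return bce(fake_probs, np.ones_like(fake_probs))


# Made-up discriminator outputs: P(image is real) for two real and two fake images.
real_probs = np.array([0.9, 0.8])
fake_probs = np.array([0.1, 0.2])

print(discriminator_loss(real_probs, fake_probs))  # low: the critic is winning
print(generator_loss(fake_probs))                  # high: the artist is losing
```

During training, each network takes gradient steps to reduce its own loss, so an improvement for one side raises the loss of the other.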

How to use this repository locally

Set up virtual environment for Experiments

# Create "experiments" environment (from the project root folder).
python3 -m venv .virtualenvs/experiments

# Activate environment.
source .virtualenvs/experiments/bin/activate
# or if you use Fish...
source .virtualenvs/experiments/bin/activate.fish

To quit an environment, run deactivate.

Install dependencies

# Upgrade pip and setuptools to the latest versions.
pip install --upgrade pip setuptools

# Install packages
pip install -r requirements.txt

To install new packages, run pip install package-name. To add new packages to the requirements, run pip freeze > requirements.txt.

Launch Jupyter locally

In order to play around with Jupyter notebooks and see how models were trained you need to launch a Jupyter Notebook server.

# Launch Jupyter server.
jupyter notebook

Jupyter will be available locally at http://localhost:8888/. Notebooks with experiments can be found in the experiments folder.

Launch demos locally

The demo application is built with React using create-react-app.

# Switch to demos folder from project root.
cd demos

# Install all dependencies.
yarn install

# Start demo server on http. 
yarn start

# Or start demo server on https (for camera access in browser to work on localhost).
yarn start-https

Demos will be available locally at http://localhost:3000/ or at https://localhost:3000/.

Convert models

The converter environment is used to convert the models trained during the experiments from the Keras .h5 format into formats that JavaScript can load (tfjs_layers_model or tfjs_graph_model, consisting of .json and .bin files) for use with TensorFlow.js in the demo application.

# Create "converter" environment (from the project root folder).
python3 -m venv .virtualenvs/converter

# Activate "converter" environment.
source .virtualenvs/converter/bin/activate
# or if you use Fish...
source .virtualenvs/converter/bin/activate.fish

# Install converter requirements.
pip install -r requirements.converter.txt

The conversion of Keras models to the tfjs_layers_model/tfjs_graph_model formats is done by tfjs-converter. For example:

tensorflowjs_converter --input_format keras \
  ./experiments/digits_recognition_mlp/digits_recognition_mlp.h5 \
  ./demos/public/models/digits_recognition_mlp
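
If several models need converting, the same command can be driven from Python. The sketch below is an assumption about how one might script it, not something the repository provides: `build_converter_command` and `convert` are hypothetical helpers that wrap the `tensorflowjs_converter` CLI installed in the converter environment.

```python
import subprocess


def build_converter_command(h5_path, output_dir, output_format="tfjs_layers_model"):
    """Build the argument list for the tensorflowjs_converter CLI (hypothetical helper)."""
    return [
        "tensorflowjs_converter",
        "--input_format", "keras",
        "--output_format", output_format,
        h5_path,
        output_dir,
    ]


def convert(h5_path, output_dir):
    """Run the conversion; requires the "converter" environment to be active."""
    subprocess.run(build_converter_command(h5_path, output_dir), check=True)


# Paths taken from the example above.
command = build_converter_command(
    "./experiments/digits_recognition_mlp/digits_recognition_mlp.h5",
    "./demos/public/models/digits_recognition_mlp",
)
print(" ".join(command))
```

Calling `convert(...)` with the same two paths would run the conversion; only the command is built and printed here so the snippet has no side effects.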

⚠️ Converting the models to formats that JavaScript can load and loading them into the browser directly might not be good practice, since the user might need to download tens or hundreds of megabytes of data, which is inefficient. Normally the model is served from the back-end (e.g. TensorFlow Extended), and instead of loading it all into the browser the user makes a lightweight HTTP request to get a prediction. But since the Demo App is just an experiment and not a production-ready app, and for the sake of simplicity (to avoid needing an up-and-running back-end), we’re converting the models and loading them directly into the browser.

Requirements

Recommended versions:

  • Python: > 3.7.3.
  • Node: >= 12.4.0.
  • Yarn: >= 1.13.0.

If you have Python version 3.7.3, you might experience a RuntimeError: dictionary changed size during iteration error when trying to import tensorflow (see the issue).
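
A quick way to check the Python requirement before installing anything is a small version guard. This is a sketch, and `python_version_ok` is a hypothetical helper, not part of the repository:

```python
import sys


def python_version_ok(version_info=sys.version_info):
    """Return True if the interpreter is newer than 3.7.3 (the recommended minimum)."""
    return tuple(version_info[:3]) > (3, 7, 3)


if not python_version_ok():
    print("Python > 3.7.3 is recommended; importing tensorflow may fail on 3.7.3.")
```

Tuple comparison proceeds element by element, so 3.7.4 and 3.8.0 both pass while 3.7.3 itself does not.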

You might also be interested in

  • Homemade Machine Learning - Python examples of popular machine learning algorithms with interactive Jupyter demos and math being explained.
  • NanoNeuron - 7 simple JavaScript functions that will give you a feeling of how machines can actually “learn”.
  • Playground and Cheatsheet for Learning Python - Collection of Python scripts that are split by topics and contain code examples with explanations.
