March 10, 2021


PaddlePaddle/PaddleHub

Awesome pre-trained models toolkit based on PaddlePaddle. 270+ models covering Image, Text, Audio, and Video, with easy inference and Serving deployment.


repo name	PaddlePaddle/PaddleHub
repo link	https://github.com/PaddlePaddle/PaddleHub
homepage	https://www.paddlepaddle.org.cn/hub
language	Python
size (curr.)	169014 kB
stars (curr.)	4794
created	2018-12-21
license	Apache License 2.0


AI Creation Camp·PaddleHub Creative Competition (Hot Recruitment)🔥🔥

Developers are invited to take part in the first phase of the AI Creation Camp·PaddleHub Creative Competition and build creative AI projects with PaddleHub. Novel ideas can win generous prizes (10,000 RMB, Switch game console & fitness ring, mechanical keyboard, Xiaodu headphones, Xiaodu toy bear). To register, scan the QR code or click Register Now. The competition runs from March 1, 2021 to March 31, 2021.

Introduction

PaddleHub aims to provide developers with rich, high-quality, and directly usable pre-trained models.
No deep learning background is needed: you can use AI models quickly and enjoy the dividends of the artificial intelligence era.
It covers 4 major categories (Image, Text, Audio, and Video) and supports one-click prediction, easy service deployment, and transfer learning.
All models are OPEN SOURCE and FREE to download and use in offline scenarios.

Recent updates

2021.02.18: Released v2.0.0, which makes model development and debugging easier and the fine-tune task more flexible and easier to use. Transfer learning for visual tasks is fully upgraded and supports tasks such as image classification, image colorization, and style transfer. Transformer models such as BERT, ERNIE, and RoBERTa are upgraded to dynamic graphs and support fine-tuning for text classification and sequence labeling. Serving is optimized with multi-card prediction and automatic load balancing, greatly improving performance. The new automatic data augmentation capability, Auto Augment, can efficiently search for combinations of data augmentation strategies suited to a dataset. Added 61 word vector models (51 Chinese and 10 English), 4 image segmentation models, 2 depth models, 7 image generation models, and 3 text generation models; the total number of pre-trained models reaches 【274】.
2020.12.1: Released 2.0-beta1, migrating ERNIE, RoBERTa, and BERT to dynamic graph mode. Added a text classification fine-tune task based on large-scale pre-trained models.
2020.11.20: Released 2.0-beta, fully migrating to the dynamic graph programming mode and upgrading the Serving deployment capability; added 1 hand key point detection model, 12 image cartoonization models, 3 image editing models, 3 speech synthesis models, and 1 syntactic analysis model; the total number of pre-trained models reaches 【182】.
2020.10.09: Added 4 new multilingual OCR models and 4 image editing models; the total number of pre-trained models reaches 【162】.
2020.09.27: Added 6 new text generation models and 1 image segmentation model; the total number of pre-trained models reaches 【154】.
2020.08.13: Released v1.8.1, adding a segmentation model and support for EMNLP2019-Sentence-BERT as a text matching task network; the total number of pre-trained models reaches 【147】.
2020.07.29: Released v1.8.0, adding AI couplets and AI poem writing, jieba word segmentation, an LDA topic model, semantic similarity calculation, new object detection and short video classification models, ultra-lightweight Chinese and English OCR, and new industrial-grade models for pedestrian detection, vehicle detection, and animal recognition; added support for VisualDL training visualization; the total number of pre-trained models reaches 【135】.

Features

Abundant Pre-trained Models: 270+ pre-trained models covering the 4 major categories of Image, Text, Audio, and Video, all open source and free to download and use offline.
Quick Model Prediction: Run model prediction with a few lines of script to quickly try out a model.
Model As Service: A one-line command deploys a deep learning model as an API service.
Easy-to-use Transfer Learning: With just a few lines of code, you can complete transfer learning tasks such as image classification and text classification based on high-quality pre-trained models.
Cross-platform: Runs on Linux, Windows, macOS, and other operating systems.
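As a rough illustration of the Model As Service item above: assuming a server was started locally with `hub serving start -m lac` and listens on port 8866, a client might call it as sketched below. The `/predict/<module>` route, the `text` payload key, and the helper names `build_predict_request` and `predict` are assumptions made for illustration and are not confirmed by this page; check the documentation of the module you deploy before relying on them.

```python
import json
import urllib.request


def build_predict_request(module_name, payload, host="127.0.0.1", port=8866):
    """Build (but do not send) a JSON POST request for a serving endpoint.

    Assumption: the server exposes each module at /predict/<module_name>.
    """
    url = f"http://{host}:{port}/predict/{module_name}"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(module_name, payload, timeout=10.0, **kwargs):
    """Send the request and decode the JSON reply (needs a running server)."""
    request = build_predict_request(module_name, payload, **kwargs)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running, `predict("lac", {"text": ["今天是个好日子"]})` would return the decoded JSON reply. Building the request separately from sending it keeps the URL and payload easy to inspect without a live server.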

Visualization Demo

Text Recognition

Includes ultra-lightweight Chinese and English OCR models, high-precision Chinese and English OCR models, and multilingual OCR models for German, French, Japanese, and Korean.
Many thanks to CopyRight@PaddleOCR for the pre-trained models; you can try training your own models with PaddleOCR.

Face Detection

Includes face detection and masked-face detection, with multiple algorithms to choose from.
Many thanks to CopyRight@PaddleDetection for the pre-trained models; you can try training your own models with PaddleDetection.

Image Editing

4x super-resolution, with multiple super-resolution models to choose from.
Colorization models can be used to repair old grayscale photos.
Many thanks to CopyRight@PaddleGAN for the pre-trained models; you can try training your own models with PaddleGAN.

Image Generation

Includes portrait cartoonization, street scene cartoonization, and style transfer.
Many thanks to CopyRight@PaddleGAN and CopyRight@AnimeGAN for the pre-trained models.

Object Detection

Pedestrian detection, vehicle detection, and more industrial-grade ultra-large-scale pretrained models are provided.
Many thanks to CopyRight@PaddleDetection for the pre-trained models; you can try training your own models with PaddleDetection.

Key Point Detection

Supports body, face, and hand key point detection for one or more people.
Many thanks to CopyRight@openpose for the pre-trained models.

Image Segmentation

Provides a high-quality pixel-level portrait cutout model, the ACE2P human body analysis world-champion model, and dynamic sky replacement and harmonization.
Many thanks to CopyRight@PaddleSeg and CopyRight@Zhengxia Zou for the pre-trained models; you can try retraining your own models with PaddleSeg or the sky matting model.

(The second gif comes from CopyRight@jiupinjia/SkyAR)

Image Classification

Various models such as animal classification, dish classification, and wild animal product classification are available.
Many thanks to CopyRight@PaddleClas for the pre-trained models; you can try training your own models with PaddleClas.

Text Generation

AI poem writing, AI couplet, and AI love-words generation models are available.
Many thanks to CopyRight@ERNIE for the pre-trained models; you can try training your own models with ERNIE.

Lexical Analysis

Excellent Chinese word segmentation, part-of-speech tagging, and named entity recognition models are provided by LAC@Baidu NLP.
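A minimal sketch of what lexical analysis with LAC might look like, assuming PaddleHub and PaddlePaddle are installed and that the module is published under the name `lac` with a `cut` method that returns one dictionary per input text with parallel `word` and `tag` lists. The helper names `segment` and `pair_words_with_tags` are hypothetical, and the sample tags in the comment are illustrative rather than real model output.

```python
def segment(texts, use_gpu=False):
    """Run LAC on a list of Chinese sentences.

    paddlehub is imported lazily so the helper below stays usable
    (and testable) on machines where it is not installed.
    """
    import paddlehub as hub

    lac = hub.Module(name="lac")  # downloads the model on first use
    return lac.cut(text=texts, use_gpu=use_gpu, batch_size=1, return_tag=True)


def pair_words_with_tags(result):
    """Zip one result's parallel 'word' and 'tag' lists into (word, tag) pairs.

    Assumed result shape: {"word": ["今天", ...], "tag": ["TIME", ...]}
    """
    return list(zip(result["word"], result["tag"]))
```

A typical call would be `for r in segment(["今天是个好日子"]): print(pair_words_with_tags(r))`.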

Syntactic Analysis

A leading Chinese syntactic analysis model is provided by DDParser@Baidu NLP.

Sentiment Analysis

All SOTA Chinese sentiment analysis models released by Baidu NLP can be used with just one line of code.
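To make the claim above concrete, here is a hedged sketch using one sentiment module. It assumes PaddleHub is installed, that a module named `senta_bilstm` is available, and that its `sentiment_classify` method returns one dictionary per text with `text`, `sentiment_key`, and `positive_probs` fields. The wrapper `classify_sentiment` and the helper `summarize` are hypothetical names introduced for illustration.

```python
def classify_sentiment(texts, use_gpu=False):
    """Classify a list of Chinese sentences with a Senta module.

    paddlehub is imported lazily so summarize() works without it.
    """
    import paddlehub as hub

    senta = hub.Module(name="senta_bilstm")  # downloads the model on first use
    return senta.sentiment_classify(texts=texts, use_gpu=use_gpu, batch_size=1)


def summarize(results):
    """Reduce each result to (text, label, positive probability to 2 places).

    Assumed result fields: 'text', 'sentiment_key', 'positive_probs'.
    """
    return [
        (r["text"], r["sentiment_key"], round(r["positive_probs"], 2))
        for r in results
    ]
```

For example, `summarize(classify_sentiment(["这家餐厅很好吃"]))` would yield one tuple per input sentence.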

Text Review

A review model for detecting Chinese pornographic text is available.

Speech Synthesis

TTS speech synthesis, with multiple algorithms available.
Many thanks to CopyRight@Parakeet for the pre-trained models; you can try training your own models with Parakeet.
Input: Life was like a box of chocolates, you never know what you're gonna get.

Video Classification

Short video classification trained on large-scale video datasets; supports prediction of 3000+ tag types for short-form videos.
Many thanks to CopyRight@PaddleVideo for the pre-trained model; you can try training your own models with PaddleVideo.
Example: given a short video of swimming, the algorithm outputs "swimming".

===Key Points===

All of the pre-trained models above are open source and free, and the number of models is continuously growing. You are welcome to ⭐Star⭐ the project to follow updates.

Welcome to join PaddleHub technical group

If you have any questions while using the models, you can join the official WeChat group to get answers more efficiently and exchange ideas with developers from all walks of life. We look forward to having you.

Documentation Tutorial

PIP Installation
Quick Start
Rich Pre-trained Models 274
- Boutique Featured Models
- Computer Vision 141
- Natural Language Processing 122
- Audio 3
  - Speech Synthesis 3
- Video 8
  - Video Classification 5
  - Video Repair 3
Deploy
Advanced documentation
- Command Line Interface Usage
- How to Load Customized Dataset
Community
License
Contribution

License

This project is released under the Apache 2.0 license.

Contribution

We welcome you to contribute code to PaddleHub, and thank you for your feedback.

Many thanks to 肖培楷 for contributing street scene cartoonization, portrait cartoonization, gesture key point recognition, sky replacement, depth estimation, portrait segmentation, and other modules
Many thanks to Austendeng for fixing the SequenceLabelReader
Many thanks to cclauss for optimizing the travis-ci check
Many thanks to 奇想天外 for contributing a demo of mask detection
Many thanks to mhlwsk for contributing a fix to the sequence labeling prediction demo
Many thanks to zbp-xxxp for contributing modules for viewing pictures and writing poems
Many thanks to zbp-xxxp and 七年期限 for jointly contributing the Mid-Autumn Festival Special Edition Module
Many thanks to livingbody for contributing models for style transfer based on PaddleHub's capabilities and a Mid-Autumn Festival WeChat Mini Program
Many thanks to BurrowsWang for fixing the Markdown table display problem