BenAAndrew/Voice-Cloning-App
A Python/PyTorch app for easily synthesising human voices
repo name | BenAAndrew/Voice-Cloning-App |
repo link | https://github.com/BenAAndrew/Voice-Cloning-App |
language | Jupyter Notebook |
size (curr.) | 1605 kB |
stars (curr.) | 181 |
created | 2021-03-10 |
license | BSD 3-Clause “New” or “Revised” License |
Voice Cloning App
System Requirements
- Windows 10 or Ubuntu 20.04+ operating system
- NVIDIA GPU with at least 4GB of memory
- Up-to-date NVIDIA driver (version 450.36+)
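The requirements above can be checked before installing. The following is a minimal sketch, not part of the app itself: the helper names (`parse_version`, `meets_requirements`, `query_gpus`) are hypothetical, and it assumes "4GB" means 4096 MiB as reported by `nvidia-smi`. It only covers the driver version and GPU memory, not the operating system.

```python
import shutil
import subprocess

MIN_DRIVER = (450, 36)      # minimum driver version from the requirements list
MIN_MEMORY_MIB = 4 * 1024   # assumption: "4GB" is read as 4096 MiB


def parse_version(text):
    """Turn a dotted version such as '450.36.06' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_requirements(driver_version, memory_mib):
    """Return True when both the driver and the GPU memory are sufficient."""
    driver_ok = parse_version(driver_version) >= MIN_DRIVER
    memory_ok = memory_mib >= MIN_MEMORY_MIB
    return driver_ok and memory_ok


def query_gpus():
    """Return [(driver_version, memory_mib), ...], or [] if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=driver_version,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return []
    gpus = []
    for line in result.stdout.splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 2:
            continue
        try:
            gpus.append((fields[0], int(fields[1])))
        except ValueError:
            # Skip rows where memory is not reported as a number.
            continue
    return gpus


if __name__ == "__main__":
    found = query_gpus()
    if not found:
        print("No NVIDIA GPU detected (nvidia-smi missing or failed).")
    for driver, memory in found:
        status = "OK" if meets_requirements(driver, memory) else "below requirements"
        print(f"driver {driver}, {memory} MiB: {status}")
```

Running the script on a machine without `nvidia-smi` on the `PATH` reports that no GPU was detected rather than raising an error.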
Key features
- Automatic dataset generation
- Easy train start/stop
- Support for Kindle & Audible as data sources
- Data importing/exporting
- Simplified training & synthesis
- Word replacement suggestion
- Windows & Linux support
Video guide
https://www.youtube.com/playlist?list=PLk5I7EvFL13GjBIDorh5yE1SaPGRG-i2l
Voice Sharing Hub
https://voice-sharing-hub.herokuapp.com/
FAQs
Manual Guides
Future Improvements
- Test pretrained weights for transfer learning
- Add support for alternative models
- Improved batch size estimation
- AMD GPU support
- Additional language support
Acknowledgements
This project uses reworked versions of Tacotron2 & WaveGlow. All rights to these models belong to NVIDIA, and their use follows the requirements of NVIDIA's BSD-3 licence.
Thank you to Dr. John Bustard at Queen’s University Belfast for his support throughout the project.
Also a big thanks to the members of the VocalSynthesis subreddit for their feedback.
Finally, thank you to everyone raising issues and contributing to the project.