OpenVoice, a cutting-edge voice cloning technology, has been unveiled, boasting impressive capabilities in tone color cloning, voice style control, and zero-shot cross-lingual voice cloning. Developed by Zengyi Qin of MIT and MyShell, together with collaborators at Tsinghua University and MyShell, OpenVoice can generate speech in multiple languages and accents without requiring that the language of either the generated or the reference speech appear in its training data.
Since its integration into the myshell.ai platform in May 2023, OpenVoice has seen tens of millions of uses, contributing to significant user growth. The technology allows for nuanced control over voice styles, including emotion, accent, rhythm, pauses, and intonation, offering a new level of customization in voice generation.
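To make the scope of that style control concrete, here is a minimal, self-contained sketch of the kind of interface such a system implies. The `VoiceStyle` parameters and the `clone_voice` function are hypothetical stand-ins for illustration, not the actual OpenVoice API; the stub simply echoes the request so the control surface is visible.

```python
from dataclasses import dataclass

@dataclass
class VoiceStyle:
    # Style attributes the article says OpenVoice can control.
    # All names and defaults here are illustrative assumptions.
    emotion: str = "neutral"   # e.g. "cheerful", "sad"
    accent: str = "default"    # e.g. "british"
    speed: float = 1.0         # rhythm: playback-rate multiplier
    pause_scale: float = 1.0   # lengthen or shorten pauses
    intonation: float = 1.0    # strength of pitch variation

def clone_voice(text: str, reference_audio: str, style: VoiceStyle) -> dict:
    """Hypothetical stand-in: a real system would synthesize audio in the
    reference speaker's tone color. Here we just return the request."""
    return {
        "text": text,
        "reference": reference_audio,
        "style": vars(style),
    }

request = clone_voice(
    "Hello from a cloned voice.",
    "reference_speaker.wav",
    VoiceStyle(emotion="cheerful", speed=0.9),
)
print(request["style"])
```

The point of the sketch is the separation of concerns the article describes: the reference audio supplies the tone color, while style attributes such as emotion, accent, rhythm, pauses, and intonation are controlled independently per request.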
For developers and enthusiasts, a live demo is available, and the community is encouraged to join the Discord channel for discussion and collaboration. While the online version at myshell.ai offers better audio quality and efficiency, the open-source OpenVoice implementation is available for non-commercial use under a Creative Commons license, with installation and usage instructions provided for those who want to experiment with the technology.
The OpenVoice team has outlined a roadmap for the project, which includes the release of the inference code, the tone color converter model, and multi-style base speaker models, among other updates. Acknowledging the contributions of projects such as TTS, VITS, and VITS2, OpenVoice represents a significant step forward in voice synthesis and cloning.
Read more at GitHub…