Text-to-Speech (TTS)
PaddleSpeech
https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/README_cn.md
Developed by Baidu. Converts between speech and text across multiple scenarios and supports streaming.
XTTS (recommended)
https://github.com/coqui-ai/TTS
Decent quality. An integrated toolkit that implements multiple models, including Tacotron2, Bark, and FastSpeech2.
Demo: https://huggingface.co/spaces/coqui/CoquiTTS
ChatTTS
https://github.com/2noise/ChatTTS
EdgeTTS
https://github.com/rany2/edge-tts
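As a rough illustration of how EdgeTTS can be called from Python, the sketch below uses the `edge_tts.Communicate` class from the package linked above. The two voice names and the `pick_voice` helper are assumptions made for this example, not part of the original list; the voices actually available can be checked with `edge-tts --list-voices`.

```python
import asyncio

# Voice names are assumptions for illustration; verify them with `edge-tts --list-voices`.
ZH_VOICE = "zh-CN-XiaoxiaoNeural"
EN_VOICE = "en-US-AriaNeural"


def pick_voice(text: str) -> str:
    """Return the Chinese voice if the text contains any CJK ideograph, else the English one."""
    has_cjk = any("\u4e00" <= ch <= "\u9fff" for ch in text)
    return ZH_VOICE if has_cjk else EN_VOICE


async def synthesize(text: str, out_path: str) -> None:
    """Send the text to the Edge TTS service and save the returned audio as MP3."""
    import edge_tts  # imported lazily so pick_voice works without the package installed

    communicate = edge_tts.Communicate(text, pick_voice(text))
    await communicate.save(out_path)


# Example call (needs network access and `pip install edge-tts`):
# asyncio.run(synthesize("你好,世界", "hello.mp3"))
```

Picking the voice from the text is only a convenience for mixed Chinese and English input; passing a fixed voice name works just as well.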
CosyVoice
https://github.com/FunAudioLLM/CosyVoice
Vits
https://github.com/jaywalnut310/vits
SpeechT5
https://github.com/microsoft/SpeechT5
Works reasonably well for English.
Demo: https://huggingface.co/spaces/Matthijs/speecht5-tts-demo
Bark
https://github.com/suno-ai/bark
Works reasonably well for English; Chinese output has a foreign accent. Can also be used for music and voice cloning.
Demo: https://huggingface.co/spaces/suno/bark
Real-Time-Voice-Cloning
https://github.com/CorentinJ/Real-Time-Voice-Cloning
TTS-Vue
https://github.com/LokerL/tts-vue
MetaVoiceIO
https://github.com/metavoiceio/metavoice-src
EmotiVoice
https://github.com/netease-youdao/EmotiVoice
Multiple voices with tone and emotion. Developed by NetEase.
Documentation: https://github.com/netease-youdao/EmotiVoice/blob/main/README.zh.md
Beginner installation guide: https://github.com/netease-youdao/EmotiVoice/blob/main/README_%E5%B0%8F%E7%99%BD%E5%AE%89%E8%A3%85%E6%95%99%E7%A8%8B.md
Voice list: https://github.com/netease-youdao/EmotiVoice/tree/main/data/youdao/text
EmotiVoice-Plus
https://aiyy.info/emotivoice-plus/
A multi-speaker text-to-speech tool built on EmotiVoice.
MockingBird
https://github.com/babysor/MockingBird
Sambert (recommended)
https://modelscope.cn/models/speech_tts/speech_sambert-hifigan_tts_zh-cn_multisp_pretrain_16k/summary
Supports Chinese and English with decent quality. Demo: https://modelscope.cn/studios/damo/personal_tts/summary
GPT-SoVITS
https://github.com/RVC-Boss/GPT-SoVITS
Speech-to-Text (STT)
PaddleSpeech
https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/README_cn.md
Developed by Baidu. Converts between speech and text across multiple scenarios and supports streaming.
Whisper
https://github.com/openai/whisper
Supports multiple languages.
Demo:
https://huggingface.co/spaces/innev/whisper-Base
https://huggingface.co/jonatasgrosman/whisper-large-zh-cv11
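For readers who want to try Whisper locally, the following is a minimal sketch that transcribes a file with the `openai-whisper` package and formats the result as SRT subtitles. It relies on `whisper.load_model` and `model.transcribe`, whose result dictionary contains a `segments` list with `start`, `end`, and `text` keys. The helper names (`format_timestamp`, `segments_to_srt`, `transcribe_to_srt`) and the default `base` model size are assumptions chosen for this example.

```python
def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp format HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Render Whisper-style segments (dicts with start, end, text) as SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


def transcribe_to_srt(audio_path: str, model_name: str = "base") -> str:
    """Transcribe an audio file and return SRT subtitles (hypothetical helper)."""
    import whisper  # imported lazily; requires `pip install openai-whisper` and ffmpeg

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)
    return segments_to_srt(result["segments"])


# Example call, with "speech.wav" as a placeholder path:
# print(transcribe_to_srt("speech.wav"))
```

The formatting helpers are kept separate from the model call so they can be reused with any tool that yields the same start, end, and text fields.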
SenseVoice
https://github.com/FunAudioLLM/SenseVoice
Gives better results than Whisper.
Faster-Whisper (recommended)
https://github.com/guillaumekln/faster-whisper
Whisper.cpp
https://github.com/ggerganov/whisper.cpp
A C/C++ port that speeds up Whisper inference.
DeepSpeech
https://github.com/mozilla/DeepSpeech
https://github.com/SeanNaren/deepspeech.pytorch
Espnet
https://github.com/espnet/espnet
Voice Cloning
The MockingBird, Sambert, and GPT-SoVITS entries in this section also appear in the text-to-speech list above.
Bert-VITS2 (recommended)
https://github.com/fishaudio/Bert-VITS2
Fish-speech
https://github.com/fishaudio/fish-speech
Vits
https://github.com/Plachtaa/VITS-fast-fine-tuning
Retrieval-based-Voice-Conversion-WebUI
https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI
Speech-to-speech conversion. Supports voice cloning and fine-tuning, and can change voices in real time as a voice changer.
MockingBird
https://github.com/babysor/MockingBird
Sambert (recommended)
https://modelscope.cn/models/speech_tts/speech_sambert-hifigan_tts_zh-cn_multisp_pretrain_16k/summary
GPT-SoVITS
https://github.com/RVC-Boss/GPT-SoVITS
Tutorial reference: GPT-SoVITS audio processing and voice cloning
GPT-SoVITS voice cloning tutorial: https://www.bilibili.com/video/BV1P541117yn/?spm_id_from=333.337.search-card.all.click&vd_source=b97a538c390a5dab96d947934fc1119a
BarkVoiceCloning
A voice-cloning adaptation of Bark.
https://github.com/KevinWang676/Bark-Voice-Cloning
AI Singing
So-vits-svc
https://github.com/svc-develop-team/so-vits-svc
Archived and no longer updated; the latest version is 4.1.
So-vits-svc-5.0 (recommended)
https://github.com/PlayVoice/so-vits-svc-5.0
Music Generation
Muzic
https://github.com/microsoft/muzic
AudioCraft
https://github.com/facebookresearch/audiocraft
Audio Tuning
SoundTouch
https://github.com/imtaotao/sound-touch