I trained the singing voice clone AI on voice clips of the Sentry Bot from Fallout 4. I used the default training settings, meaning 10000 epochs, though given the simplicity of the Sentry Bot's voice that was probably overkill.
It works pretty well: the AI holds on to the "audio detail" of the Sentry Bot's voice, and where it messes up it still sounds believable (since the Sentry Bot's voice is already "noisy" and "imprecise" in a way). However, if you want the pitch changes found in the Sentry Bot's voice, they need to be present in the input audio. As for output quality, slow, clear speech is recommended for the input audio, since that's how the Sentry Bot speaks; its voice is hard to understand otherwise.
As recommended in the comments, here is a link to a good repo that can run this model: https://github.com/voicepaw/so-vits-svc-fork
You can either install it from source or use the pip commands that the README specifies.
It has a GUI where you can specify the weights, their associated config file, and the input audio you want to transform.
If you can run Stable Diffusion, then this AI should run fine with under 5 minutes of input audio. Longer audio needs more VRAM (though you can just cut up longer clips).
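If you'd rather not cut up long clips by hand, here is a rough sketch of one way to do it in Python with only the standard library. It assumes the input is an uncompressed WAV file; the function name `split_wav`, the output file naming, and the 300-second default (taken from the 5-minute figure above) are my own choices, not anything the repo provides.

```python
import wave
from pathlib import Path


def split_wav(src, out_dir, max_seconds=300):
    """Split a WAV file into consecutive chunks of at most max_seconds each.

    Hypothetical helper, not part of so-vits-svc-fork.
    Returns the list of chunk paths in playback order.
    """
    src = Path(src)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []

    with wave.open(str(src), "rb") as reader:
        params = reader.getparams()
        frames_per_chunk = int(reader.getframerate() * max_seconds)
        if frames_per_chunk < 1:
            raise ValueError("max_seconds is too small for this sample rate")

        index = 0
        while True:
            frames = reader.readframes(frames_per_chunk)
            if not frames:
                break  # reached the end of the input file
            chunk_path = out_dir / f"{src.stem}_part{index:03d}.wav"
            with wave.open(str(chunk_path), "wb") as writer:
                # Same channels, sample width and sample rate as the input;
                # the frame count in the header is corrected on close.
                writer.setparams(params)
                writer.writeframes(frames)
            written.append(chunk_path)
            index += 1

    return written
```

For example, `split_wav("my_voice.wav", "chunks")` (hypothetical file names) would write `chunks/my_voice_part000.wav`, `chunks/my_voice_part001.wav`, and so on. The cuts land at fixed times, so a word can get split across two chunks; picking the cut points in a pause by hand will probably sound cleaner.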
Here is the source:
Image source: https://www.nexusmods.com/fallout4/mods/56150
