Arabic Text-to-Speech Voice Cloning Tool in Python

Reverse-engineered prompt

Build me a simple Arabic text-to-speech voice cloning tool in Python. I want to type Modern Standard Arabic text, press generate, and get a natural-sounding WAV file back.

It should automatically add tashkeel when the text is plain Arabic, but also let me turn that off when I paste text that already has full diacritics. Let me choose a reference voice from a voices folder, and also upload or add my own short voice clip, around 5 to 15 seconds, so the speech can copy that speaker. Include a speed control so I can make the result slower or faster.
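To make the input handling above concrete, here is a minimal sketch of the three checks it implies: deciding whether text already carries tashkeel, keeping the speed value in a sane range, and checking that a reference clip is roughly 5 to 15 seconds long. Everything here is an assumption for illustration: the function names, the 0.3 diacritic threshold, and the 0.5 to 2.0 speed range are hypothetical and do not come from CosyVoice3 or any existing project. The sketch uses only the Python standard library and does not do the diacritization itself.

```python
import wave

# Arabic tashkeel marks: fathatan through sukun (U+064B-U+0652) plus dagger alef (U+0670).
TASHKEEL = {chr(c) for c in range(0x064B, 0x0653)} | {"\u0670"}
TATWEEL = "\u0640"  # elongation character; sits in the letter range but is not a letter


def diacritic_ratio(text: str) -> float:
    """Return tashkeel marks per Arabic letter (0.0 if the text has no Arabic letters)."""
    letters = marks = 0
    for ch in text:
        if ch in TASHKEEL:
            marks += 1
        elif "\u0621" <= ch <= "\u064A" and ch != TATWEEL:
            letters += 1
    return marks / letters if letters else 0.0


def needs_tashkeel(text: str, threshold: float = 0.3) -> bool:
    """Guess whether the text is plain Arabic. The threshold is an assumed heuristic."""
    return diacritic_ratio(text) < threshold


def clamp_speed(speed: float, lo: float = 0.5, hi: float = 2.0) -> float:
    """Keep the speed control inside an assumed safe range."""
    return max(lo, min(hi, speed))


def wav_duration_seconds(path: str) -> float:
    """Duration of an uncompressed PCM WAV file, read from its header."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def clip_length_ok(seconds: float, lo: float = 5.0, hi: float = 15.0) -> bool:
    """True when a reference clip falls in the 5 to 15 second window."""
    return lo <= seconds <= hi
```

A heuristic like this would only set the default for the tashkeel switch. The manual on/off toggle should still win, since partially diacritized text can land on either side of the threshold.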

Please make a clean Gradio web interface for normal users, plus a simple command line option for people who want to run one command. Include example voices and sample generation if possible. Use CosyVoice3 with the Arabic LoRA checkpoint, and add a setup script that downloads the needed model files from Hugging Face and verifies them. Look up current docs online if you need to.
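As a rough illustration of the command line option and the verification half of the setup script, the sketch below builds an argument parser and checks downloaded files against SHA-256 hashes. It is a sketch under assumptions, not the actual tool: the flag names, the default output path, and the idea of a hash manifest are all hypothetical, and the download step itself (for example through the `huggingface_hub` package) is deliberately left out so the code runs with the standard library alone.

```python
import argparse
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model files are never loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_files(model_dir: Path, expected: dict) -> list:
    """Return the names of files that are missing or whose hash does not match.

    `expected` maps a relative file name to its SHA-256 hex digest
    (a hypothetical manifest; the real checkpoint may publish hashes differently).
    """
    bad = []
    for name, want in expected.items():
        path = model_dir / name
        if not path.is_file() or sha256_of(path) != want:
            bad.append(name)
    return bad


def build_parser() -> argparse.ArgumentParser:
    """One-command interface; every flag name here is an assumption."""
    parser = argparse.ArgumentParser(description="Arabic TTS with voice cloning (sketch)")
    parser.add_argument("text", help="Modern Standard Arabic text to synthesize")
    parser.add_argument("--voice", default=None, help="path to a reference clip in the voices folder")
    parser.add_argument("--speed", type=float, default=1.0, help="playback speed multiplier")
    parser.add_argument("--no-tashkeel", action="store_true", help="skip automatic diacritization")
    parser.add_argument("--output", default="output.wav", help="where to write the WAV file")
    return parser
```

With this shape, the setup script could call `verify_files` after downloading and stop with a clear message if the returned list is not empty, while the Gradio interface and the command line would share the same generation function.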

Ramendan/BayanSynthTTS — reverse-engineered prompt
