Speech-to-Speech Desktop App Design

Reverse-engineered prompt

Build me a simple desktop app that lets me hold a Record button and speak into my mic, transcribes that speech to text, then immediately sends the text to a text-to-speech service and plays the generated audio through whatever output device I picked.
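
The flow described above could be sketched roughly as follows. This is only an illustration: `run_push_to_talk` and its four callables are hypothetical names, standing in for whatever recording, Whisper, provider, and playback code the app actually uses.

```python
from typing import Callable


def run_push_to_talk(
    record: Callable[[], bytes],
    transcribe: Callable[[bytes], str],
    synthesize: Callable[[str], bytes],
    play: Callable[[bytes], None],
) -> str:
    """Run one record -> transcribe -> synthesize -> play cycle.

    All four callables are hypothetical stand-ins injected by the caller.
    Returns the transcript, or "" when nothing was recognized.
    """
    audio_in = record()                  # audio captured while Record was held
    text = transcribe(audio_in).strip()  # speech-to-text step
    if not text:
        return ""                        # nothing to say, so skip the TTS request
    play(synthesize(text))               # TTS request, then playback on the chosen output
    return text
```

Passing the stages in as functions keeps the button handler independent of the selected provider and audio devices.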

I want it to feel easy to use, with fields for an API key, a provider chooser, a voice chooser, and input and output device selection. Please support ElevenLabs as the main option, and also support 60db as an alternate provider. For transcription, use Whisper with a faster default model, but let me switch to a better medium model when the computer supports CUDA. If the machine can use CUDA, use it automatically.
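
One way the CUDA rule above might look in code, as a minimal sketch. It assumes PyTorch is what reports CUDA support, that the unnamed "faster default" is Whisper's `base` model, and that "use it automatically" refers to the CUDA device; `WhisperConfig` and `choose_whisper_config` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WhisperConfig:
    model_name: str
    device: str


def cuda_available() -> bool:
    """Report CUDA support, assuming PyTorch is the detection mechanism."""
    try:
        import torch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


def choose_whisper_config(
    has_cuda: bool, want_medium: bool, fast_model: str = "base"
) -> WhisperConfig:
    """Pick model and device; "base" as the fast default is an assumption."""
    if has_cuda:
        # CUDA is used automatically; the medium model is offered only here.
        return WhisperConfig("medium" if want_medium else fast_model, "cuda")
    # Without CUDA, ignore the medium request and stay on the fast model.
    return WhisperConfig(fast_model, "cpu")
```

A caller would typically run `choose_whisper_config(cuda_available(), want_medium)` once at startup and again whenever the model toggle changes.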

Please remember the selected provider, each provider’s API key, and the 60db voice setting between launches. Make sure the ElevenLabs voice list loads from the account, and for 60db just use the configured voice value. If you need anything current, look up the latest docs online.
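
The persistence and voice rules above could be sketched like this. The JSON layout, the key names (`provider`, `api_keys`, `voice_60db`), and the function names are all assumptions made for illustration. The ElevenLabs request itself is left to a caller-supplied `fetch_elevenlabs_voices` function rather than guessed at here.

```python
import json
from pathlib import Path
from typing import Callable

# Hypothetical settings layout; the key names are assumptions.
DEFAULT_SETTINGS = {
    "provider": "elevenlabs",
    "api_keys": {"elevenlabs": "", "60db": ""},
    "voice_60db": "",
}


def load_settings(path: Path) -> dict:
    """Load saved settings, falling back to defaults for missing or bad data."""
    settings = json.loads(json.dumps(DEFAULT_SETTINGS))  # deep copy of defaults
    try:
        stored = json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return settings
    if not isinstance(stored, dict):
        return settings
    if stored.get("provider") in settings["api_keys"]:
        settings["provider"] = stored["provider"]
    keys = stored.get("api_keys")
    if isinstance(keys, dict):
        for name in settings["api_keys"]:
            if isinstance(keys.get(name), str):
                settings["api_keys"][name] = keys[name]  # one key per provider
    if isinstance(stored.get("voice_60db"), str):
        settings["voice_60db"] = stored["voice_60db"]
    return settings


def save_settings(path: Path, settings: dict) -> None:
    """Write settings as JSON, creating the parent directory if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(settings, indent=2), encoding="utf-8")


def voice_choices(
    settings: dict, fetch_elevenlabs_voices: Callable[[str], list]
) -> list:
    """Return voices for the active provider.

    60db uses the configured value as-is; ElevenLabs voices come from the
    account via the injected (hypothetical) fetch function.
    """
    if settings["provider"] == "60db":
        return [settings["voice_60db"]]
    return fetch_elevenlabs_voices(settings["api_keys"]["elevenlabs"])
```

This sketch stores API keys in plain text for simplicity; a real app might prefer the operating system's credential store.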


CyR1en/ElevenLabsS4TS — reverse-engineered prompt
