Apple Silicon Parakeet Speech-to-Text Tool

Reverse-engineered prompt

Build me a simple Apple Silicon speech-to-text tool around NVIDIA Parakeet models, so that on a Mac I can transcribe audio files quickly and get clean text plus subtitle files. I want a command-line app where I can pass one or more audio files, pick a model, choose where outputs go, and save results as txt, srt, vtt, json, or all of them. It should support sentence timestamps and optional word-level highlighting in subtitles, and it should work well on long recordings through chunking and overlap settings.

Also give me a small, easy-to-use Python interface where I can load a pretrained model, transcribe a file, and read back the full text, sentence timings, and token or word timings. If possible, include the decoding options shown in the docs, sentence-splitting controls, cache location support, and a streaming transcription mode for near-real-time updates. Please make it practical to run locally on Apple Silicon, and look up current docs online if you need to fill in gaps.
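
As an illustration of two pieces the prompt asks for, here is a minimal sketch of overlapping chunk windows for long recordings and of turning sentence timings into SRT text. It is not taken from parakeet-mlx: the names Sentence, chunk_windows, srt_timestamp, and to_srt are hypothetical, and the fixed-step windowing is only one plausible reading of "chunking and overlap settings".

```python
from dataclasses import dataclass


@dataclass
class Sentence:
    # Hypothetical container for one timed sentence; not the library's own type.
    text: str
    start: float  # seconds
    end: float    # seconds


def chunk_windows(total: float, chunk: float, overlap: float) -> list[tuple[float, float]]:
    """Split [0, total] seconds into windows of length `chunk` that overlap by `overlap`."""
    if chunk <= 0 or not 0 <= overlap < chunk:
        raise ValueError("need chunk > 0 and 0 <= overlap < chunk")
    windows = []
    start = 0.0
    step = chunk - overlap  # each new window starts `overlap` seconds before the previous one ends
    while start < total:
        end = min(start + chunk, total)
        windows.append((start, end))
        if end >= total:
            break
        start += step
    return windows


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm form used in SRT cue timings."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(sentences: list[Sentence]) -> str:
    """Render timed sentences as numbered SRT cues separated by blank lines."""
    blocks = []
    for index, sentence in enumerate(sentences, start=1):
        timing = f"{srt_timestamp(sentence.start)} --> {srt_timestamp(sentence.end)}"
        blocks.append(f"{index}\n{timing}\n{sentence.text.strip()}\n")
    return "\n".join(blocks)
```

With a 300-second recording, 120-second chunks, and 15 seconds of overlap, chunk_windows yields three windows starting at 0, 105, and 210 seconds. A real tool would still need to merge the duplicated text in the overlapping regions, which this sketch leaves out.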


senstella/parakeet-mlx — reverse-engineered prompt
