adithya-s-k/omniparse — reverse-engineered prompt
Reverse engineered prompt
Build me a local Python app that can take messy files and web pages and turn them into clean structured markdown that I can feed into AI tools later.
I want to upload PDFs, Word docs, PowerPoints, images, audio, and videos, or paste a website URL. The app should extract text, tables, image captions, OCR from images, audio and video transcripts, and crawled web page content. Keep it fully local with no outside APIs. It should have a simple interactive web UI for testing, plus an API server so other apps can send files to it.
Please make it Linux-friendly and runnable with Docker, and let me start it with separate options for document, media, and web parsing so I don’t have to load everything every time. Include clear setup commands, model download commands, and examples for calling the endpoints. Look up current docs online if you need to.
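As a rough idea of what I mean by startup options, here is a minimal sketch using Python's standard `argparse` module. The flag names (`--documents`, `--media`, `--web`), the default host and port, and the helper names `parse_args` and `enabled_parsers` are placeholders of mine, not a required design.

```python
import argparse


def parse_args(argv=None):
    """Parse startup options; each parser group is off unless its flag is given."""
    parser = argparse.ArgumentParser(description="Local parsing server (sketch)")
    # Hypothetical defaults: bind to localhost only, since everything stays local.
    parser.add_argument("--host", default="127.0.0.1")
    parser.add_argument("--port", type=int, default=8000)
    # Hypothetical flags: one switch per group of models to load.
    parser.add_argument("--documents", action="store_true",
                        help="load models for PDFs, Word docs, PowerPoints, images")
    parser.add_argument("--media", action="store_true",
                        help="load models for audio and video transcription")
    parser.add_argument("--web", action="store_true",
                        help="enable the web page crawler")
    return parser.parse_args(argv)


def enabled_parsers(args):
    """Return the names of the parser groups that were switched on."""
    return [name for name in ("documents", "media", "web") if getattr(args, name)]
```

With something like this, starting the server with only `--documents --web` would skip loading the audio and video models entirely.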