rednote-hilab/dots.ocr — reverse-engineered prompt

Build me a simple Python OCR demo using this repo. I want to drop in a scanned page, screenshot, chart, diagram, or document image and have the app read it in many languages, keep the layout as much as possible, and return clean text or markdown that I can copy.

If the model can turn charts or diagrams into SVG, expose that too as an optional output, but don’t fake it if the weights don’t support it. Please wire up the Hugging Face model download or make setup instructions clear, include a small web page for uploading files and seeing results, and add a command line way to process a folder of files.

Make it easy to run locally, preferably with Docker too, and include a short README with install steps, expected GPU or CPU notes, and a few sample commands. Look up the current docs online if you need to.
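As a rough illustration of the "process a folder of files" request, here is a minimal sketch of a batch command-line tool. It does not call the real model: `placeholder_ocr` is a hypothetical stand-in for whatever inference call the repo actually exposes, and the function names, the `--out` flag, and the list of accepted extensions are all assumptions made for this example.

```python
import argparse
from pathlib import Path
from typing import Callable, List

# Extensions treated as OCR inputs (an assumption; adjust to what the model accepts).
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp", ".bmp", ".tif", ".tiff"}


def collect_inputs(folder: Path) -> List[Path]:
    """Return supported image files directly inside `folder`, sorted by path."""
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def placeholder_ocr(path: Path) -> str:
    """Hypothetical stand-in for the real model call; returns markdown text."""
    return f"# {path.name}\n\n(OCR output would appear here)\n"


def process_folder(
    input_dir: Path,
    output_dir: Path,
    ocr_fn: Callable[[Path], str] = placeholder_ocr,
) -> List[Path]:
    """Run `ocr_fn` on every supported file and write one .md file per input."""
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in collect_inputs(input_dir):
        # Keep the full source name so page.png and page.jpg do not collide.
        dst = output_dir / (src.name + ".md")
        dst.write_text(ocr_fn(src), encoding="utf-8")
        written.append(dst)
    return written


def main() -> None:
    parser = argparse.ArgumentParser(description="Batch OCR a folder of images.")
    parser.add_argument("input_dir", type=Path, help="folder containing images")
    parser.add_argument("--out", type=Path, default=Path("ocr_output"),
                        help="folder for the generated markdown files")
    args = parser.parse_args()
    for path in process_folder(args.input_dir, args.out):
        print(path)


if __name__ == "__main__":
    main()
```

Saved under a hypothetical name such as `batch_ocr.py`, it could be run as `python batch_ocr.py ./scans --out ./results`. Passing the OCR step in as a function keeps the folder handling testable without a GPU or downloaded weights; the real model call would replace `placeholder_ocr`.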
