Headroom Local AI Compression Tool

Reverse engineered prompt

Build me a local-first tool called Headroom that sits between AI agents or apps and the model and compresses whatever the model is about to read (tool output, logs, files, RAG chunks, JSON, code, and conversation history) so people get the same answers with far fewer tokens. I want it to work in three easy ways: as a simple Python and TypeScript library, as a local proxy you can point apps at with basically zero code changes, and as a wrapper for common coding agents like Claude, Cursor, Codex, and similar tools. Please also include an MCP server with compression, retrieval, and stats tools; a reversible local cache so originals can be fetched on demand; shared cross-agent memory with deduplication; and a learn command that can mine failed sessions and write corrections into agent instruction files. If possible, also trim overly wordy model replies. Keep it fast, local, and privacy-friendly, and look up current docs online if you need to.
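To make the "reversible local cache" idea in the prompt concrete, here is a minimal sketch of one way it could work. It is not Headroom's actual API: `ReversibleCache`, `compress_log`, and the `[headroom-ref:...]` marker are all hypothetical names chosen for illustration, and the compression step (collapsing runs of identical consecutive lines) is only one simple technique among many a real tool might use.

```python
import hashlib
from itertools import groupby


class ReversibleCache:
    """Hypothetical in-memory store mapping a short content hash to the original text."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}

    def put(self, original: str) -> str:
        # Key the original by a truncated SHA-256 digest of its bytes.
        key = hashlib.sha256(original.encode("utf-8")).hexdigest()[:12]
        self._store[key] = original
        return key

    def get(self, key: str) -> str:
        return self._store[key]


def compress_log(text: str, cache: ReversibleCache) -> str:
    """Collapse runs of identical consecutive lines and prefix a cache reference."""
    key = cache.put(text)
    out = []
    for line, run in groupby(text.splitlines()):
        count = sum(1 for _ in run)
        out.append(line if count == 1 else f"{line}  [repeated {count}x]")
    # The reference marker lets a caller fetch the untouched original later.
    return f"[headroom-ref:{key}]\n" + "\n".join(out)


cache = ReversibleCache()
raw = "connecting\n" + "retry: timeout\n" * 4 + "connected"
compact = compress_log(raw, cache)

# Recover the original on demand from the reference in the first line.
key = compact.split("]")[0].removeprefix("[headroom-ref:")
restored = cache.get(key)
```

In this sketch the model would read `compact`, while a retrieval tool could call `cache.get(key)` whenever the full original is needed. A real implementation would presumably persist the cache on disk and use smarter, content-aware compression.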


headroomlabs-ai/headroom — reverse-engineered prompt
