Private Local AI Chat Tool with Document Integration

Reverse engineered prompt

Build me a private GPT tool that lets people chat with local AI models and their own documents without sending data to cloud AI services. It should connect to a local OpenAI compatible model server, like Ollama or llama.cpp, instead of running the model itself.

I want a clean API that other apps can use, similar to the Claude messages API, with normal chat, streaming replies, model selection, async jobs, token counting, file uploads, document ingestion, search over uploaded files, and citations in answers.

Also include a simple browser workbench at /ui for demos. In the UI I should be able to send messages, pick a model, upload PDFs or documents, turn tools on for a chat, inspect the request and response, and test retrieval. Support useful private tools like web search, fetching a web page, code execution, custom tools, MCP connectors, and querying CSVs or databases when configured.

Make it runnable locally with clear setup steps and environment variables for the model and embedding endpoints. Look up current docs online if needed.

Want more depth? Deep Reverse

zylon-ai/private-gpt — reverse-engineered prompt

Reverse engineered prompt