Modular RAG System Project: URAG V2

Reverse engineered prompt

Build me a Python project for a modular RAG system called URAG V2. I want it to take documents, CSV context rows, or existing FAQ files, turn them into searchable knowledge, and store everything in Milvus with separate document and FAQ collections.

The flow should be easy to run from example scripts. One script should embed data into the database, and another should run inference by searching the indexed knowledge. The document side should load files, split them semantically, optionally improve the chunks with an LLM, embed them, and index them. The FAQ side should generate FAQs from the document chunks, create paraphrased versions of the questions while keeping answers correct, embed them, and index them too.

Please include manager classes that coordinate the full workflow, plus a central LLM kernel where I can choose Gemini, OpenAI, or Ollama using environment variables. Use uv for setup and Docker Compose for Milvus. Add clear config examples and demo data so I can run it locally. Look up current docs online if you need to.

Want more depth? Deep Reverse

waanney/URAG_V2 — reverse-engineered prompt

Reverse engineered prompt