Enterprise RAG Platform Architecture

Reverse engineered prompt

Build me an enterprise-style RAG platform in Python that can ingest documents, index them, and answer user questions through an API. I want it to use an agent-driven flow that can decide when to answer directly, when to rewrite the question, and when to do retrieval. Retrieval should combine vector search and knowledge graph search so the answers are better grounded and the model is less likely to hallucinate.

Please make it scalable on AWS with Kubernetes, with a lightweight control plane for the API and orchestration, and a heavier data plane for embeddings and model inference on GPU workers that can scale up when needed. Include the infrastructure and deployment setup, the core services, the ingestion pipeline, and a reasonable local development path if the repo supports it. I also want basic validation so I can confirm the system works end to end after deployment.

Wire everything together so someone can deploy it and start asking questions against their data. Look up current docs online if you need to.
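
To make the agent-driven flow and the combined retrieval in the prompt concrete, here is a minimal Python sketch. It is not taken from the repository. `Route`, `choose_route`, and `fuse_rankings` are hypothetical names, the keyword heuristic stands in for what would normally be an LLM routing call, and reciprocal rank fusion is just one assumed way to merge vector and knowledge graph results.

```python
from enum import Enum


class Route(Enum):
    """The three decisions named in the prompt (hypothetical labels)."""
    ANSWER = "answer_directly"
    REWRITE = "rewrite_question"
    RETRIEVE = "retrieve"


# Assumption: a tiny greeting list stands in for a real intent classifier.
GREETINGS = {"hi", "hello", "thanks"}


def choose_route(question: str) -> Route:
    """Toy heuristic standing in for an LLM-based routing decision."""
    words = question.lower().split()
    if words and words[0].strip(".,!?") in GREETINGS:
        return Route.ANSWER      # small talk needs no retrieval
    if len(words) < 3:
        return Route.REWRITE     # too short or vague to retrieve against
    return Route.RETRIEVE


def fuse_rankings(vector_hits: list[str], graph_hits: list[str], k: int = 60) -> list[str]:
    """Merge two ranked lists of document IDs with reciprocal rank fusion.

    Each list contributes 1 / (k + rank) per document, so documents found
    by both vector search and graph search rise to the top.
    """
    scores: dict[str, float] = {}
    for hits in (vector_hits, graph_hits):
        for rank, doc_id in enumerate(hits, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score, breaking ties by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    print(choose_route("What is our refund policy for enterprise plans?"))
    print(fuse_rankings(["a", "b", "c"], ["b", "d"]))
```

In a real deployment the router would likely be a model call in the control plane, and the two hit lists would come from the vector store and the graph database; the fusion step itself stays the same regardless of which backends produce the rankings.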


FareedKhan-dev/scalable-rag-pipeline — reverse-engineered prompt
