PDF Document Question-Answering Streamlit Application

Reverse engineered prompt

Build me a Python Streamlit app for asking questions about a folder of PDF documents. I want to drop PDFs into a data folder, have the app read and index them, then let me chat with the documents and get answers with page level citations.

The most important thing is that it should not make things up. If the documents do not clearly contain the answer, it should say “I don’t know based on the provided documents” instead of guessing. Please make retrieval strong for both normal wording and exact details like numbers, names, tables, and version labels. It should use a vector database plus keyword search, rerank the best chunks, score confidence, and verify that each answer sentence has real support from the sources.

Include simple setup with an example env file for API keys, a clean Streamlit interface, caching so repeated questions are faster, and basic logging or metrics so I can see what sources were used. Use Groq for the language model and look up current docs online if needed.

Want more depth? Deep Reverse

nilimsankar123/Advanced_RAG_Pipeline — reverse-engineered prompt

Reverse engineered prompt