a1k7/collusion-interceptor — reverse-engineered prompt

Reverse engineered prompt

GitHub

Build me a local demo called Collusion Interceptor that watches a group of AI agents talking to each other and catches suspicious hidden coordination even when the messages look harmless.

I want to run one Python command, open a browser dashboard, and see the simulation start automatically. The page should be dark mode and simple, with the agent messages, a live suspicion score chart, the current status, and a clear halt moment when the risk passes a threshold. When it halts, save an audit trace as JSON in a traces folder with the final decision, the score, the threshold, and enough details to replay what happened.

Include a small training script that creates a dummy collusion probe so the demo works on a laptop, even if it is not a real production model yet. Please make the README clean enough that someone can install dependencies, train the probe, run the server, and open localhost without confusion. Look up current docs online if you need to.

Want more depth? Deep Reverse