Self-Hosted OpenAI-Compatible AI Gateway

Reverse engineered prompt

Build me a self hosted AI gateway that gives my app one simple way to talk to lots of model providers, using an OpenAI style API so I can switch between OpenAI, Anthropic, Gemini, Bedrock, Azure, Vertex, Cohere, Hugging Face, and similar services without rewriting everything.

I want it to work both as a Python library and as a proxy server my team can point tools at. It should support the common things people use, like chat completions, responses, embeddings, images, audio, and batch style requests when available. Please include the practical team features too, like virtual API keys, usage and cost tracking, logging, guardrails, rate limits, load balancing, basic admin controls, and a simple web dashboard to see what is happening.

Make it feel production ready, easy to run locally with Docker, and easy to deploy to a cloud host. Good docs and example configs would help a lot, especially for making the first request and adding multiple providers. If anything is unclear, look up the current docs online and make sensible choices.

Want more depth? Deep Reverse

BerriAi/litellm — reverse-engineered prompt

Reverse engineered prompt