Fast LLM Proxy Gateway for Multi-Provider API Integration

Reverse engineered prompt

I want a fast LLM proxy gateway that lets my apps call one local API instead of talking directly to each AI provider. Build the server so it accepts the usual OpenAI-style chat and text completion requests and forwards them to configured providers. Start with OpenAI working, and keep the design ready for Anthropic, Azure OpenAI, Bedrock, Cohere, and Vertex later.
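One way to keep the design ready for more providers is a small trait plus a registry, sketched below using only the standard library. Everything named here (`Provider`, `StubProvider`, `Registry`, the `provider/model` naming convention) is a hypothetical illustration, not something taken from an existing codebase, and the stub echoes its input instead of making a real HTTP call.

```rust
use std::collections::HashMap;

/// Minimal request shape; a real gateway would mirror the full OpenAI schema.
#[derive(Debug, Clone)]
pub struct ChatRequest {
    pub model: String,
    pub user_message: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ChatResponse {
    pub provider: String,
    pub model: String,
    pub content: String,
}

/// Hypothetical trait: one implementation per upstream provider.
pub trait Provider {
    fn name(&self) -> &str;
    fn chat(&self, req: &ChatRequest) -> Result<ChatResponse, String>;
}

/// Stand-in provider that echoes the input instead of calling an upstream API.
pub struct StubProvider {
    pub name: String,
}

impl Provider for StubProvider {
    fn name(&self) -> &str {
        &self.name
    }

    fn chat(&self, req: &ChatRequest) -> Result<ChatResponse, String> {
        Ok(ChatResponse {
            provider: self.name.clone(),
            model: req.model.clone(),
            content: format!("echo: {}", req.user_message),
        })
    }
}

/// Registry keyed by provider name; supporting another provider later
/// means one more `register` call.
#[derive(Default)]
pub struct Registry {
    providers: HashMap<String, Box<dyn Provider>>,
}

impl Registry {
    pub fn register(&mut self, provider: Box<dyn Provider>) {
        self.providers.insert(provider.name().to_string(), provider);
    }

    /// Assumed convention: the model field is written as "provider/model".
    pub fn dispatch(&self, req: &ChatRequest) -> Result<ChatResponse, String> {
        let (provider_name, model) = req
            .model
            .split_once('/')
            .ok_or_else(|| format!("expected provider/model, got {}", req.model))?;
        let provider = self
            .providers
            .get(provider_name)
            .ok_or_else(|| format!("unknown provider: {}", provider_name))?;
        // Strip the provider prefix before handing the request upstream.
        let upstream_req = ChatRequest {
            model: model.to_string(),
            user_message: req.user_message.clone(),
        };
        provider.chat(&upstream_req)
    }
}
```

With a stub registered under `openai`, dispatching a request whose model is `openai/gpt-4o-mini` routes to that stub, while an unregistered prefix returns an error rather than panicking.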

It should read API keys and options from a simple config file, and support CORS, timeouts, structured logs, a health check, and Prometheus metrics for request counts, latency, errors, and token usage, labeled by provider and model. Add basic fallback between providers if configured.
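As a rough illustration of the fallback and metrics requirements, the sketch below tries providers in order and counts each attempt, then renders the counters in the Prometheus text exposition format. The metric name `gateway_requests_total`, the `status` label, and the helpers `always_fails` and `echo_reply` are assumptions made for the example; a real gateway would replace the function pointers with HTTP calls and would likely use a metrics crate rather than hand-rendering text.

```rust
use std::collections::BTreeMap;

/// Stand-in for an upstream call; a real gateway would issue an HTTP request.
pub type UpstreamCall = fn(&str) -> Result<String, String>;

/// Counters keyed by (provider, status); BTreeMap keeps the output order stable.
#[derive(Default)]
pub struct Metrics {
    requests: BTreeMap<(String, String), u64>,
}

impl Metrics {
    pub fn record(&mut self, provider: &str, status: &str) {
        let key = (provider.to_string(), status.to_string());
        *self.requests.entry(key).or_insert(0) += 1;
    }

    /// Render the counters in the Prometheus text exposition format.
    pub fn render(&self) -> String {
        let mut out = String::from("# TYPE gateway_requests_total counter\n");
        for ((provider, status), count) in &self.requests {
            out.push_str(&format!(
                "gateway_requests_total{{provider=\"{}\",status=\"{}\"}} {}\n",
                provider, status, count
            ));
        }
        out
    }
}

/// Try each provider in order and return the first success as
/// (provider name, response text). Every attempt is counted.
pub fn call_with_fallback(
    chain: &[(&str, UpstreamCall)],
    prompt: &str,
    metrics: &mut Metrics,
) -> Result<(String, String), String> {
    let mut last_error = String::from("no providers configured");
    for &(name, call) in chain {
        match call(prompt) {
            Ok(text) => {
                metrics.record(name, "ok");
                return Ok((name.to_string(), text));
            }
            Err(e) => {
                metrics.record(name, "error");
                last_error = format!("{}: {}", name, e);
            }
        }
    }
    Err(last_error)
}

/// Hypothetical stub that simulates a provider outage.
pub fn always_fails(_prompt: &str) -> Result<String, String> {
    Err("timeout".to_string())
}

/// Hypothetical stub that simulates a healthy provider.
pub fn echo_reply(prompt: &str) -> Result<String, String> {
    Ok(format!("reply to {}", prompt))
}
```

Passing the chain `[("openai", always_fails), ("anthropic", echo_reply)]` records one error for the first provider and one success for the second, and the rendered text is what a `/metrics` handler could return.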

Include Docker and Compose setup, example requests, and clear setup docs so I can run it locally on port 8000 and test /v1/chat/completions, /v1/text/completions, /healthz, and /metrics. Use Rust for the gateway and keep it lightweight and production-minded. Look up current provider docs online if needed.


riipandi/sorai — reverse-engineered prompt
