cobusgreyling/NVIDIA-Nemotron-3-Super

Reverse-engineered prompt

Build me a clean demo project for NVIDIA Nemotron 3 Super that shows why controllable reasoning is useful. I want a simple web chat where I can enter my NVIDIA API key, ask questions, and switch between full reasoning, low effort, and reasoning off. Show the answer streaming live, show timing, and make it clear when the model is thinking versus giving the final answer.
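One way to make the thinking phase visible while streaming is to label each piece of streamed text before it reaches the UI. The sketch below is a minimal illustration and not the project's actual code. It assumes the reasoning arrives inline in the text stream, wrapped in `<think>` and `</think>` tags; some endpoints return reasoning in a separate response field instead, in which case no tag parsing is needed. The function name `split_phases` is hypothetical.

```python
from typing import Iterable, Iterator, Tuple

# Assumption: reasoning is delimited by these tags inside the streamed text.
OPEN_TAG, CLOSE_TAG = "<think>", "</think>"


def split_phases(chunks: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (phase, text) pairs, where phase is 'thinking' or 'answer'.

    Tags may be split across chunk boundaries, so a possible partial tag
    at the end of the buffer is held back until the next chunk arrives.
    """
    phase = "answer"
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while True:
            tag = OPEN_TAG if phase == "answer" else CLOSE_TAG
            idx = buffer.find(tag)
            if idx != -1:
                # Emit text before the tag, drop the tag, switch phase.
                if idx > 0:
                    yield phase, buffer[:idx]
                buffer = buffer[idx + len(tag):]
                phase = "thinking" if phase == "answer" else "answer"
                continue
            # No complete tag: keep any suffix that could start the tag.
            keep = 0
            for n in range(min(len(tag) - 1, len(buffer)), 0, -1):
                if buffer.endswith(tag[:n]):
                    keep = n
                    break
            emit = buffer[:len(buffer) - keep]
            if emit:
                yield phase, emit
            buffer = buffer[len(buffer) - keep:]
            break
    if buffer:
        # Stream ended with an unfinished partial tag; emit it as plain text.
        yield phase, buffer


if __name__ == "__main__":
    demo = ["<thi", "nk>2+2", " is 4</th", "ink>The answer", " is 4."]
    for phase, text in split_phases(demo):
        print(f"[{phase}] {text!r}")
```

A UI can render the `thinking` pieces in a dimmed or collapsible area and the `answer` pieces in the main chat bubble, and start a separate timer when the phase first switches to `answer`.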

Also add a small tool calling demo where the model can choose a local math tool, run it, and then answer from the result. Include a budget sweep page or script so I can test the same prompt with different reasoning budgets and compare speed, answer length, and token usage. A basic API server and command line version would be great too, so the same features work outside the web UI.
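The local math tool in the tool calling demo could look roughly like the following. This is a sketch under stated assumptions, not the repository's implementation: the tool name `calculate`, the `expression` argument, and the helper `run_tool_call` are all hypothetical, and the model is assumed to return the tool name plus a JSON string of arguments, as in OpenAI-style tool calls. The evaluator walks the parsed expression tree instead of calling `eval`, so only plain arithmetic is accepted.

```python
import ast
import json
import operator

# Arithmetic operators the tool is willing to evaluate.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}


def calculate(expression: str):
    """Evaluate a plain arithmetic expression without using eval()."""

    def ev(node):
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.operand))
        raise ValueError("unsupported expression")

    return ev(ast.parse(expression, mode="eval").body)


# Hypothetical registry mapping tool names to local functions.
TOOLS = {"calculate": calculate}


def run_tool_call(name: str, arguments_json: str) -> str:
    """Run one tool call and return a JSON string to send back to the model."""
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        args = json.loads(arguments_json)
        result = TOOLS[name](**args)
    except (ValueError, SyntaxError, ZeroDivisionError, TypeError) as exc:
        return json.dumps({"error": str(exc)})
    return json.dumps({"result": result})


if __name__ == "__main__":
    print(run_tool_call("calculate", '{"expression": "(3 + 4) * 2"}'))
```

Returning errors as JSON rather than raising lets the model see what went wrong and retry or explain the failure in its final answer. A production version would also want to cap exponent sizes, since `**` on large integers can be slow.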

Please make it easy to run locally with an env file, include clear setup instructions, and look up current NVIDIA NIM docs online if you need to.

