Python Web Browsing Agent with OpenAI Integration

Reverse engineered prompt

Build me a Python web browsing agent that can open a real browser, look at annotated screenshots of the page, and use an OpenAI model to decide what to do next. I want to be able to give it a simple task like “find me the latest news on CNN” and have it navigate from Google, click links, type into search boxes, scroll, wait, and go back when needed.

It should show the browser while it works, draw boxes around clickable or usable parts of the page, and print a clear log of the actions it takes. If my OpenAI API key isn’t set, ask me for it. Please include setup instructions, requirements, and a simple way to run it with python main.py.

Use realistic mouse and keyboard behavior where possible, including human like typing delays. Keep the project simple and readable so I can customize the prompt, tools, and typing speed later.

Want more depth? Deep Reverse

DonGuillotine/langraph_browser_agent — reverse-engineered prompt

Reverse engineered prompt