Emul4nt/byepy-benchmark — reverse-engineered prompt

Build me a self-contained Python experiment that tests an obfuscator against a fresh deobfuscator over multiple rounds.

I want a small corpus of Python programs split into easy, medium, and hard examples, then a runner that obfuscates them, asks a new recovery pass to try to clean them up, and judges the results. The judge should check that the recovered code behaves the same as the original, compare the Python structure with an AST similarity score, and optionally include a simple readability score.
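As a rough illustration of the judge I have in mind, something like the sketch below would do. The function names (`run_and_capture`, `ast_similarity`, `judge`) are placeholders of mine, not requirements. It assumes that "behaves the same" can be approximated by comparing printed output, and it scores structure by comparing the sequence of AST node types, which ignores identifier names. The optional readability score is left out here.

```python
import ast
import contextlib
import difflib
import io


def run_and_capture(source):
    """Execute source in a fresh namespace and return everything it printed.

    Assumption: samples are trusted. A real runner would more likely use a
    subprocess with a timeout instead of an in-process exec.
    """
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(compile(source, "<sample>", "exec"), {"__name__": "__main__"})
    return buffer.getvalue()


def node_types(source):
    """Flatten a program into the list of its AST node type names."""
    return [type(node).__name__ for node in ast.walk(ast.parse(source))]


def ast_similarity(original, recovered):
    """Return a 0..1 score; identifier names do not affect it."""
    matcher = difflib.SequenceMatcher(None, node_types(original), node_types(recovered))
    return matcher.ratio()


def judge(original, recovered):
    """Combine the behavior check and the structural score into one report."""
    return {
        "behavior_match": run_and_capture(original) == run_and_capture(recovered),
        "ast_similarity": ast_similarity(original, recovered),
    }
```

With this shape, a recovered file that only differs from the original in variable names would score a full structural match, while leftover dead code or flattened control flow would pull the score down.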

Please include the obfuscator, the round runner, the judging script, sample corpus files, and saved reports for each round. The obfuscation should get harder over time, with things like renamed variables, encoded strings, dead code, flattened control flow, bootstrap loaders, dispatch tables, and integer constant tables. The recovery side should have editable helper scripts so each round can improve from the previous one.
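To make the first, easiest round concrete, here is a minimal sketch of two of the transforms listed above: renamed variables and encoded strings. The names `collect_names`, `Obfuscator`, and `obfuscate` are hypothetical. It assumes Python 3.9+ (for `ast.unparse`) and deliberately ignores classes, keyword arguments at call sites, and `global`/`nonlocal` declarations, which a real obfuscator would need to handle.

```python
import ast
import base64


def collect_names(tree):
    """Collect names the program defines itself, so builtins stay untouched."""
    names = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            names.add(node.name)
        elif isinstance(node, ast.arg):
            names.add(node.arg)
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            names.add(node.id)
    return names


class Obfuscator(ast.NodeTransformer):
    """Renames program-defined names and base64-encodes string literals."""

    def __init__(self, names):
        # Sorted so the same input always yields the same mapping.
        self.mapping = {name: f"_v{i}" for i, name in enumerate(sorted(names))}

    def visit_FunctionDef(self, node):
        node.name = self.mapping.get(node.name, node.name)
        self.generic_visit(node)
        return node

    def visit_arg(self, node):
        node.arg = self.mapping.get(node.arg, node.arg)
        return node

    def visit_Name(self, node):
        node.id = self.mapping.get(node.id, node.id)
        return node

    def visit_JoinedStr(self, node):
        # Keep literal f-string pieces as they are; only rewrite the
        # expressions embedded inside the braces.
        for value in node.values:
            if isinstance(value, ast.FormattedValue):
                self.visit(value)
        return node

    def visit_Constant(self, node):
        if not isinstance(node.value, str):
            return node
        encoded = base64.b64encode(node.value.encode("utf-8")).decode("ascii")
        # Replace the literal with an expression that decodes it at run time.
        decoder = f"__import__('base64').b64decode({encoded!r}).decode('utf-8')"
        return ast.copy_location(ast.parse(decoder, mode="eval").body, node)


def obfuscate(source):
    """Return (obfuscated_source, name_mapping) for one program."""
    tree = ast.parse(source)
    transformer = Obfuscator(collect_names(tree))
    tree = ast.fix_missing_locations(transformer.visit(tree))
    return ast.unparse(tree), transformer.mapping
```

Returning the name mapping alongside the code is a choice of mine: it lets the round report record what was renamed without giving that information to the recovery pass.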

Make it easy to run from the command line and inspect each round’s obfuscated files, recovered files, and objective report.
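For the command line and the per-round layout, a sketch along these lines would be enough. The flag names (`--corpus`, `--rounds`, `--out`), their defaults, and the `round_NN` directory scheme are my assumptions, not a fixed interface.

```python
import argparse
from pathlib import Path


def build_parser():
    """Build the command-line interface for the round runner (hypothetical flags)."""
    parser = argparse.ArgumentParser(description="Run obfuscation vs. recovery rounds.")
    parser.add_argument("--corpus", type=Path, default=Path("corpus"),
                        help="directory holding the easy/medium/hard samples")
    parser.add_argument("--rounds", type=int, default=3,
                        help="number of rounds to run")
    parser.add_argument("--out", type=Path, default=Path("rounds"),
                        help="directory where each round's artifacts are saved")
    return parser


def round_layout(out_dir, round_number):
    """Return where one round keeps its obfuscated files, recovered files, and report."""
    base = Path(out_dir) / f"round_{round_number:02d}"
    return {
        "obfuscated": base / "obfuscated",
        "recovered": base / "recovered",
        "report": base / "report.json",
    }
```

Keeping every round in its own numbered directory means each round's inputs, outputs, and report can be inspected or diffed against the previous round without rerunning anything.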
