SamurAIGPT/Clip-Anything — reverse-engineered prompt
Build me a simple Python app called Clip Anything that can take a video file and a plain English prompt, then find the best moments in the video and export them as a new MP4 clip.
I want to be able to say things like “find the funny moments”, “clip when someone scores”, “get the product reveal”, or “find the emotional reaction shots”. The app should analyze the video visually and listen to the audio, noticing speech, music, faces, objects, actions, emotions, and any on-screen text where possible. It should then pick the scenes that match my prompt, cut them out cleanly, and save the final edited video.
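To illustrate what I mean by picking matching scenes and cutting them cleanly, here is a minimal sketch. It assumes an earlier analysis step has already produced scenes with a 0.0 to 1.0 match score; the `Scene` class, `select_segments`, `ffmpeg_cut_command`, the threshold, and the gap value are all hypothetical names and defaults of my own, not part of any existing library. The only external tool it refers to is the `ffmpeg` command-line program, and the command is built but not run.

```python
from dataclasses import dataclass


@dataclass
class Scene:
    start: float  # scene start, in seconds
    end: float    # scene end, in seconds
    score: float  # assumed 0.0-1.0 match against the prompt (hypothetical scale)


def select_segments(scenes, threshold=0.5, max_gap=0.5):
    """Keep scenes scoring at or above `threshold`, merging near-adjacent ones."""
    kept = sorted((s for s in scenes if s.score >= threshold), key=lambda s: s.start)
    merged = []
    for s in kept:
        if merged and s.start - merged[-1].end <= max_gap:
            # Close enough to the previous kept scene: extend it instead of
            # starting a new cut, so the export has fewer jumpy edits.
            last = merged[-1]
            merged[-1] = Scene(last.start, max(last.end, s.end), max(last.score, s.score))
        else:
            merged.append(Scene(s.start, s.end, s.score))
    return merged


def ffmpeg_cut_command(video_path, segment, out_path):
    """Build (but do not run) an ffmpeg command that re-encodes one segment to MP4."""
    return [
        "ffmpeg", "-y",
        "-ss", f"{segment.start:.3f}",                 # seek to the segment start
        "-i", video_path,
        "-t", f"{segment.end - segment.start:.3f}",    # segment duration
        "-c:v", "libx264", "-c:a", "aac",              # re-encode for accurate cuts
        out_path,
    ]
```

Re-encoding rather than stream-copying is a choice I am assuming here for cleaner cut points; a different approach is fine if it cuts accurately.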
Please make it easy to run locally, with clear setup steps, a basic interface or command-line option, and helpful errors if something is missing. If it can show the matching scenes and a simple score explaining why each was chosen, that would be great. Look up current docs online if you need to.
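As a rough idea of the command-line option and the helpful errors, something along these lines would do. This is only a sketch: the program name `clip-anything`, the argument names, the default output path, and the `check_environment` helper are hypothetical, and it assumes the app relies on the `ffmpeg` program being on PATH. It uses only the Python standard library.

```python
import argparse
import shutil
import sys
from pathlib import Path


def build_parser():
    """Command-line interface: a video path, a plain-English prompt, an output path."""
    parser = argparse.ArgumentParser(
        prog="clip-anything",  # hypothetical program name
        description="Find moments in a video that match a prompt and export them.",
    )
    parser.add_argument("video", type=Path, help="path to the input video file")
    parser.add_argument("prompt", help='what to look for, e.g. "find the funny moments"')
    parser.add_argument("-o", "--output", type=Path, default=Path("clip.mp4"),
                        help="where to save the exported MP4")
    return parser


def check_environment(video, which=shutil.which):
    """Return a list of human-readable problems; an empty list means ready to run."""
    problems = []
    if which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH. Install it and try again.")
    if not video.is_file():
        problems.append(f"Video file not found: {video}")
    return problems


def main(argv=None):
    args = build_parser().parse_args(argv)
    problems = check_environment(args.video)
    for problem in problems:
        print(f"error: {problem}", file=sys.stderr)
    # Exit code 1 if anything is missing, 0 if analysis could start.
    return 1 if problems else 0
```

Calling `main(["talk.mp4", "get the product reveal"])` would report every missing prerequisite at once instead of failing on the first one.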