scotteh/php-goose — reverse-engineered prompt
Build me a PHP library that can take the URL of a news article or other article page and pull out the useful readable content. I want a simple client I can instantiate and call with a URL, then get back the article title, cleaned main text, meta description, keywords, canonical link, domain, tags, links, embedded YouTube or Vimeo videos, publish date, the best image, and optionally all images.
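To make a few of those fields concrete, here is a minimal sketch of pulling the title, meta description, and canonical link out of raw HTML with PHP's built-in DOM extension. The function name `extractBasicMeta`, the array keys, and the sample HTML are placeholders made up for illustration, not a required API, and the real library would need far more than this for body text, images, and videos.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (name and return shape are assumptions):
 * read a few metadata fields from an HTML string.
 */
function extractBasicMeta(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid, so collect libxml warnings quietly
    // instead of emitting them, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // string(...) makes evaluate() return a plain string,
    // which is empty when the node is missing.
    return [
        'title'       => trim($xpath->evaluate('string(//title)')),
        'description' => trim($xpath->evaluate('string(//meta[@name="description"]/@content)')),
        'canonical'   => trim($xpath->evaluate('string(//link[@rel="canonical"]/@href)')),
    ];
}

// Sample input, invented for this example.
$html = '<html><head><title>Example Story</title>'
      . '<meta name="description" content="A short summary.">'
      . '<link rel="canonical" href="https://example.com/story">'
      . '</head><body><p>Body text.</p></body></html>';

$meta = extractBasicMeta($html);
print_r($meta);
```

Returning empty strings for missing fields, rather than throwing, is one reasonable choice here since many pages omit a description or canonical link.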
Please make it feel like a clean Composer package with autoloading, sensible defaults, and a small config array for things like language, image size limits, whether to fetch the best image or all images, and browser timeouts. It should work on PHP 7.1 and up. The extraction should focus on article-style pages and try to return just the main body text rather than the surrounding page clutter.
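As a rough sketch of what "sensible defaults plus a small config array" could mean, the snippet below merges caller overrides onto a default array. The key names, the default values, and the helper names `defaultConfig` and `buildConfig` are all assumptions for illustration; pick whatever fits the final design.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical defaults: every key and value here is illustrative only.
 */
function defaultConfig(): array
{
    return [
        'language'         => 'en',
        'image_min_bytes'  => 4500,
        'image_fetch_best' => true,
        'image_fetch_all'  => false,
        'browser'          => [
            'timeout'         => 60,
            'connect_timeout' => 30,
        ],
    ];
}

/**
 * Merge caller overrides onto the defaults.
 * array_replace_recursive keeps any default the caller did not touch,
 * including keys inside the nested 'browser' array.
 */
function buildConfig(array $overrides = []): array
{
    return array_replace_recursive(defaultConfig(), $overrides);
}

// Override the language and one browser setting; everything else stays default.
$config = buildConfig([
    'language' => 'es',
    'browser'  => ['timeout' => 10],
]);
```

A recursive merge means a caller can change one nested browser setting without having to restate the others.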
Include a basic test suite and a short README with install and usage examples. If anything is unclear, look up current docs online and fill in the gaps with reasonable choices.