EmmaScharfmann/scientists-inventors — reverse-engineered prompt
I want this repo turned into a reproducible workflow that builds the scientist-inventor dataset from OpenAlex and PatentsView, then recreates the paper's main validation checks, novelty measures, descriptive charts, tables, and regression outputs. Please wire it up so someone can follow the steps in order, set the Postgres login details in one place, and run each stage without guessing what depends on what.
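As a rough illustration of what "in one place" could look like, here is a minimal sketch that reads the Postgres login details once and hands them to every stage. The `PG*` variable names are the standard libpq environment variables; the class name, the function name, and the default database name `scientists_inventors` are hypothetical and not taken from the repo.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class PostgresSettings:
    """Single source of truth for the database login details."""
    host: str
    port: int
    dbname: str
    user: str
    password: str

    def as_kwargs(self) -> dict:
        # Keyword arguments in the shape most Postgres drivers accept.
        return {
            "host": self.host,
            "port": self.port,
            "dbname": self.dbname,
            "user": self.user,
            "password": self.password,
        }


def load_postgres_settings(env=None) -> PostgresSettings:
    """Build settings from environment variables, with local defaults."""
    env = os.environ if env is None else env
    return PostgresSettings(
        host=env.get("PGHOST", "localhost"),
        port=int(env.get("PGPORT", "5432")),
        # Hypothetical default database name, chosen only for illustration.
        dbname=env.get("PGDATABASE", "scientists_inventors"),
        user=env.get("PGUSER", "postgres"),
        password=env.get("PGPASSWORD", ""),
    )
```

Each notebook or script would then call `load_postgres_settings()` instead of hard-coding credentials. If the repo connects through psycopg2 (an assumption), the result could be passed along as `psycopg2.connect(**settings.as_kwargs())`.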
Use the provided database schema and make sure the full flow covers downloading or unpacking the source data, encoding paper and patent text, training the matching model, linking scientists to inventors, validating the matches, and generating the final analysis outputs. If the OpenAlex download code is outdated, use the 2023 snapshot path the README mentions and make that the expected route. Please add simple setup and run instructions, note any heavy storage requirements, and make the notebooks or scripts easy to rerun from a clean machine. You can check current docs online if needed.
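To make "what depends on what" explicit, the stages listed above could be declared as a small dependency graph and ordered automatically. This is only a sketch under assumptions: the stage names are hypothetical labels for the steps named in this prompt, and the strictly linear dependency chain is a guess, not the repo's actual layout.

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of stages that must finish before it.
# Stage names are hypothetical labels, not identifiers from the repo.
STAGES = {
    "download_sources": set(),
    "encode_text": {"download_sources"},
    "train_matching_model": {"encode_text"},
    "link_scientists_to_inventors": {"train_matching_model"},
    "validate_matches": {"link_scientists_to_inventors"},
    "generate_analysis_outputs": {"validate_matches"},
}


def run_order(stages=None):
    """Return the stages in an order that respects every dependency."""
    stages = STAGES if stages is None else stages
    # static_order() yields predecessors first and raises
    # graphlib.CycleError if the graph contains a cycle.
    return list(TopologicalSorter(stages).static_order())


if __name__ == "__main__":
    for step, name in enumerate(run_order(), start=1):
        print(f"{step}. {name}")
```

A runner script or the README could print this order so that a reader on a clean machine sees the expected sequence before starting any heavy download.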