Python CNKI Scraper for Collecting Article Records

Reverse-engineered prompt

Build me a simple Python CNKI scraper like the one this old project describes. I want to set a search keyword and basic search options in one script, run it, and have it collect article records from 中国知网 (CNKI) into a data folder.

The output should be a timestamped text file whose name contains the keyword and the time of the run. The first line should be the field names, and each article should be saved on its own line with a clear separator so I can rename the file to CSV and open it in LibreOffice or Excel. Please make sure the text is encoded as UTF-8 so Chinese characters are not garbled.
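A minimal sketch of the output format described above. The field names, the tab separator, the `data` folder name, and the helper names (`make_output_path`, `clean`, `write_records`) are assumptions for illustration, not taken from the original project.

```python
import os
from datetime import datetime

SEPARATOR = "\t"  # assumed separator; change here if you prefer "," or "|"
FIELDS = ["title", "author", "source", "date"]  # hypothetical field names


def make_output_path(keyword, data_dir="data", now=None):
    """Build a path like data/<keyword>_<YYYYmmdd_HHMMSS>.txt."""
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d_%H%M%S")
    return os.path.join(data_dir, f"{keyword}_{stamp}.txt")


def clean(value, sep=SEPARATOR):
    """Remove the separator and line breaks so one record stays on one line."""
    text = str(value)
    for bad in (sep, "\r", "\n"):
        text = text.replace(bad, " ")
    return text.strip()


def write_records(path, records, fields=FIELDS, sep=SEPARATOR):
    """Write a header line, then one separator-joined line per article."""
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    # encoding="utf-8" keeps Chinese text intact on every platform
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        f.write(sep.join(fields) + "\n")
        for rec in records:
            f.write(sep.join(clean(rec.get(k, ""), sep) for k in fields) + "\n")
```

If Excel still shows garbled characters after the file is renamed to CSV, writing with `encoding="utf-8-sig"` (UTF-8 with a byte order mark) may help; LibreOffice lets you pick the encoding and separator on import.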

If the crawl stops halfway, I want to be able to set a start page and continue from there. Also include simple settings for resting between requests so it doesn’t fail too easily. Keep it easy to run from the command line, with comments in the code explaining where to change keyword, page range, separator, and delays.
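One possible shape for the resume and delay settings, as a sketch only. `fetch_page` is a hypothetical placeholder for whatever function requests and parses one CNKI result page; the setting names and default values are likewise assumptions.

```python
import random
import time

START_PAGE = 1   # set this to the page where the last run stopped
END_PAGE = 5     # last page to fetch (inclusive)
MIN_DELAY = 2.0  # shortest rest between requests, in seconds
MAX_DELAY = 5.0  # longest rest between requests, in seconds


def crawl_pages(fetch_page, start_page=START_PAGE, end_page=END_PAGE,
                min_delay=MIN_DELAY, max_delay=MAX_DELAY, sleep=time.sleep):
    """Yield (page, records) for each page; stop at the first failure.

    fetch_page is a hypothetical callable: page number in, list of records out.
    """
    for page in range(start_page, end_page + 1):
        try:
            records = fetch_page(page)
        except Exception as exc:
            # Report where to resume instead of losing track of progress.
            print(f"Stopped at page {page}: {exc}. "
                  f"Set START_PAGE = {page} to continue.")
            return
        yield page, records
        if page < end_page:
            # A random pause makes the request pattern less regular.
            sleep(random.uniform(min_delay, max_delay))
```

Because the generator hands back each page as soon as it is fetched, the caller can write records to disk page by page, so a failure on a later page does not lose the earlier ones.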


yanzhou/CnkiSpider — reverse-engineered prompt
