OJALAB/job-ads-classifier — reverse-engineered prompt
Reverse-engineered prompt
Build me a Python command-line tool for classifying job advertisements into hierarchical job categories. I want to be able to train a model from plain text job ad files, label files, and a hierarchy table, then save the trained model and use it later to predict categories for new job ads.
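To make the three inputs concrete, here is a minimal loading sketch. The file layouts are assumptions, not something the prompt specifies: one job ad per line in the text file, one category code per line in the label file, and a CSV hierarchy table whose columns run from the broadest level to the leaf level. The function names are hypothetical.

```python
import csv
from pathlib import Path


def read_lines(path):
    """Return the non-empty lines of a UTF-8 text file, stripped of whitespace."""
    with Path(path).open(encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def load_training_data(text_path, label_path):
    """Pair job ads with labels; assumes the two files align line by line."""
    texts = read_lines(text_path)
    labels = read_lines(label_path)
    if len(texts) != len(labels):
        raise ValueError(
            f"{len(texts)} texts but {len(labels)} labels; files must align"
        )
    return texts, labels


def load_hierarchy(csv_path):
    """Map each leaf code to its full path of codes, broadest level first.

    Assumed layout: a header row, then one row per leaf category, with the
    leaf code in the last column.
    """
    with Path(csv_path).open(encoding="utf-8", newline="") as handle:
        rows = list(csv.DictReader(handle))
    return {tuple(row.values())[-1]: tuple(row.values()) for row in rows}
```

Checking the line counts up front turns a misaligned pair of files into an immediate error rather than silently shifted labels.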
Please support two model choices: a faster classic text model using TF-IDF, and a transformer-based model using Hugging Face-style encoder models. It should run on ordinary CPU machines and use a GPU safely when the right setup is available. Keep the old main.py-style entry point, but also make it installable as a proper package with a console command.
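One way the command-line surface and the CPU/GPU fallback could look is sketched below. The program name, subcommands, argument names, and defaults are all hypothetical; only `argparse` and `torch.cuda.is_available()` are real library calls.

```python
import argparse


def build_parser():
    """Build a parser with hypothetical `fit` and `predict` subcommands."""
    parser = argparse.ArgumentParser(prog="job-ads-classifier")  # assumed name
    subparsers = parser.add_subparsers(dest="command", required=True)

    fit = subparsers.add_parser("fit", help="train a model and save it")
    fit.add_argument("texts", help="file with job ad texts")
    fit.add_argument("labels", help="file with category labels")
    fit.add_argument("hierarchy", help="hierarchy table")
    fit.add_argument("model_dir", help="where to save the trained model")
    fit.add_argument(
        "--model-type", choices=["tfidf", "transformer"], default="tfidf"
    )

    predict = subparsers.add_parser("predict", help="predict with a saved model")
    predict.add_argument("texts", help="file with job ad texts")
    predict.add_argument("model_dir", help="directory of a saved model")
    predict.add_argument("output", help="file to write predictions to")
    predict.add_argument("--batch-size", type=int, default=32)
    return parser


def pick_device(prefer_gpu=True):
    """Return "cuda" only when PyTorch is installed and reports a usable GPU."""
    if not prefer_gpu:
        return "cpu"
    try:
        import torch  # imported lazily so the TF-IDF path works without it
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def main(argv=None):
    """Entry point usable from both main.py and a packaged console command."""
    args = build_parser().parse_args(argv)
    print(f"command={args.command} device={pick_device()}")
```

A legacy `main.py` can simply call `main()`, while the packaged command would point at the same function through a `[project.scripts]` entry in `pyproject.toml`, so both entry points share one code path.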
Include clear setup instructions, small example data, simple fit and predict examples, tests, and a small benchmark script for comparing transformer tokenization modes. Make sure saved models can still be loaded, predictions are written to a file, and the transformer path can process texts efficiently in batches. Look up current docs online if you need to.
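The batching and file-output requirements can be sketched without committing to any particular model. In the sketch below, `predict_batch` is a hypothetical callable standing in for whichever model is loaded; it is assumed to take a list of texts and return one label per text.

```python
from pathlib import Path


def batched(items, batch_size):
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def predict_to_file(texts, predict_batch, output_path, batch_size=32):
    """Run `predict_batch` over the texts in batches, one label per output line.

    Returns the number of predictions written.
    """
    written = 0
    with Path(output_path).open("w", encoding="utf-8") as out:
        for batch in batched(texts, batch_size):
            for label in predict_batch(batch):
                out.write(f"{label}\n")
                written += 1
    return written
```

Writing each batch as soon as it is predicted keeps memory use flat for large inputs, and the returned count gives tests a cheap check that every input produced an output line.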