jerryji1993/DNABERT — reverse-engineered prompt

Reverse-engineered prompt

I want this repo turned into a working DNABERT toolkit I can actually use on my own DNA sequence data. Please set it up so I can take raw sequences, convert them into the k-mer format DNABERT expects, run a pretrained model for prediction, and fine-tune on a simple classification dataset such as the sample promoter-style example. If the old model download links are outdated, use the current Hugging Face model sources instead.
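
For reference, the k-mer conversion I have in mind looks roughly like the sketch below: a raw sequence becomes a space-separated string of overlapping k-mers. The function name `seq_to_kmers` is hypothetical, and the default `k=6` and the uppercasing step are my assumptions, so treat this as an illustration of the expected format, not as the repo's exact code.

```python
def seq_to_kmers(seq: str, k: int = 6) -> str:
    """Turn a raw DNA sequence into space-separated overlapping k-mers.

    Hypothetical helper for illustration; k=6 is an assumed default.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    seq = seq.strip().upper()  # assumption: input may have stray whitespace or lowercase bases
    if len(seq) < k:
        return ""  # too short to form a single k-mer
    # Slide a window of width k one base at a time across the sequence.
    return " ".join(seq[i:i + k] for i in range(len(seq) - k + 1))


if __name__ == "__main__":
    print(seq_to_kmers("ATGGCTAGC", k=6))
    # ATGGCT TGGCTA GGCTAG GCTAGC
```

A sequence of length n yields n - k + 1 tokens, so a 9-base input with k=6 produces 4 k-mers.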

Please make the basic flow easy to run in a Linux Python environment with an NVIDIA GPU, and keep the sample examples working for pretraining, fine-tuning, and any motif or visualization utilities that are already part of the project. I would also like a clear README refresh with copy-paste commands, the expected input format, where outputs go, and a small end-to-end example using the sample data so I can confirm everything works.

Look up current docs online if you need to.
