666DZY666/micronet — reverse-engineered prompt
Reverse-engineered prompt
Build me a Python library called micronet for compressing and deploying deep learning models, mainly PyTorch image models. I want it to take a trained model, make it smaller and faster with quantization and pruning, then run it through TensorRT for inference in FP32, FP16, and INT8, with INT8 calibration and dynamic shape support.
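To make the INT8 calibration idea concrete, here is a minimal sketch in plain Python of symmetric max calibration: pick a scale from the largest magnitude seen in some calibration samples, then round and clip values into the signed 8-bit range. The function names (`calibrate_scale`, `quantize`, `dequantize`) are hypothetical and are not TensorRT or micronet APIs; TensorRT's own calibrators are more involved, so treat this only as an illustration of the arithmetic.

```python
def calibrate_scale(samples, num_bits=8):
    """Max calibration: map the largest observed magnitude to the edge of the int range."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(x) for x in samples)
    # Fall back to a scale of 1.0 if every sample is zero (assumption for this sketch).
    return max_abs / qmax if max_abs > 0 else 1.0


def quantize(values, scale, num_bits=8):
    """Divide by the scale, round to the nearest integer, clip to [-qmax, qmax]."""
    qmax = 2 ** (num_bits - 1) - 1
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]


def dequantize(codes, scale):
    """Map integer codes back to approximate real values."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    calibration_samples = [-1.27, 0.5, 1.0]
    scale = calibrate_scale(calibration_samples)
    codes = quantize([1.27, -1.27, 0.0, 5.0], scale)
    print(scale, codes, dequantize(codes, scale))
```

Values outside the calibrated range (like 5.0 in the usage lines) saturate at the clip boundary, which is why the choice of calibration data matters.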
Please include the main quantization paths: quantization-aware training (QAT), post-training quantization (PTQ), QAFT-style fine-tuning, and very low-bit options like binary and ternary weights and activations. It should also support normal pruning, regular pruning, group convolution pruning, and batch normalization fusion for quantized models.
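As an illustration of the ternary weight option, here is a minimal sketch of one common rule, the threshold heuristic from Ternary Weight Networks: weights with magnitude above 0.7 times the mean absolute weight become +1 or -1, the rest become 0, and a single scale is taken as the mean magnitude of the kept weights. The function name `ternarize` and the `delta_factor` parameter are hypothetical, and this is an assumption about how the feature could work, not necessarily what the repo does.

```python
def ternarize(weights, delta_factor=0.7):
    """Return ternary codes in {-1, 0, +1} and a shared scale alpha.

    The threshold is delta_factor * mean(|w|); alpha is the mean magnitude
    of the weights that survive the threshold.
    """
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    delta = delta_factor * mean_abs
    kept = [abs(w) for w in weights if abs(w) > delta]
    # If nothing survives the threshold, use a zero scale (assumption for this sketch).
    alpha = sum(kept) / len(kept) if kept else 0.0
    codes = [1 if w > delta else -1 if w < -delta else 0 for w in weights]
    return codes, alpha


if __name__ == "__main__":
    codes, alpha = ternarize([0.9, -0.8, 0.05, -0.1, 0.7])
    # The approximate weights are alpha * code for each position.
    print(codes, alpha, [alpha * c for c in codes])
```

In a training flow, this kind of projection would normally be applied in the forward pass while full-precision weights are kept for the gradient update; that part is left out here.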
Make it feel like a usable package, with install support, clear example scripts, and simple demos on small models like NIN and ResNet so someone can test the flows end to end. If there are tricky details around TensorRT or ONNX, look up current docs online if you need to, but keep the repo practical and runnable.