QAT model drops accuracy after converting with torch.ao.quantization.convert #2138
Hello everyone.

I am implementing QAT for a yolov8 model with 4-bit weights and 8-bit activations by setting quant_min and quant_max in the config. The model gives quite good results during training and eval; however, after I convert it with torch.ao.quantization.convert, the evaluation results are very bad. Does anyone know how to solve this problem?

cc @andrewor14

Comments
Hi @tranngocduvnvp, can you share your prepare and convert flow? Are you using the APIs documented here: https://github.com/pytorch/ao/blob/main/torchao/quantization/qat/README.md#quantize_-api-recommended? torchao QAT is not expected to work with torch.ao.quantization.convert.
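For context, the quantize_ flow that README recommends looks roughly like the sketch below. This is paraphrased from the torchao QAT README, not from this thread; the exact class and function names have shifted across torchao releases, so treat the identifiers as illustrative:

```python
import torch
import torch.nn as nn
from torchao.quantization import quantize_
from torchao.quantization.qat import (
    FakeQuantizeConfig,
    from_intx_quantization_aware_training,
    intx_quantization_aware_training,
)

# Stand-in model; the poster's is yolov8.
model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 10))

# Prepare: insert fake quantization (8-bit per-token activations,
# 4-bit grouped weights).
act_cfg = FakeQuantizeConfig(torch.int8, "per_token", is_symmetric=False)
wt_cfg = FakeQuantizeConfig(torch.int4, group_size=32)
quantize_(model, intx_quantization_aware_training(act_cfg, wt_cfg))

# ... QAT fine-tuning loop runs here ...

# Convert: strip the fake-quantize wrappers via quantize_ rather than
# calling torch.ao.quantization.convert on the prepared model.
quantize_(model, from_intx_quantization_aware_training())
```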
Hi @andrewor14, thanks for your feedback! I found the cause of the accuracy drop during convert: in my training loop, at the 3rd epoch I turned off FakeQuantize for some layers while leaving their observers enabled, which caused the scale values to change when the weights were converted to int format.

But I have another question: my model still gives quite bad results during training. Another repo that does 4-bit QAT for yolo using the brevitas library (pip install brevitas) gives very good results. Can you show me what causes the decrease in accuracy when using the torch.ao library?
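As an aside on the failure mode described above: in eager-mode torch.ao.quantization, the usual way to stop quantization parameters from drifting is to freeze the observers together with (or instead of) disabling fake quantization. A minimal sketch, assuming a model prepared with prepare_qat:

```python
import torch.nn as nn
import torch.ao.quantization as tq

# Toy model standing in for the real network.
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU())
model.qconfig = tq.get_default_qat_qconfig("fbgemm")
tq.prepare_qat(model, inplace=True)

# ... train for a few epochs with observers and fake quant both active ...

# Freeze scale/zero_point estimates so later epochs cannot shift them.
# Disabling only fake quant leaves the observers updating underneath,
# so convert() later picks up different scales than training saw.
model.apply(tq.disable_observer)
# model.apply(tq.disable_fake_quant)  # only if float passthrough is also wanted
```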
The quantization configuration code for my layers is as follows:

`[configuration snippet missing from the original comment]`

Thank you so much!
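The snippet itself did not survive, but based on the description (4-bit weights and 8-bit activations expressed through quant_min/quant_max), an eager-mode torch.ao config along the following lines would match it. Everything here is a hypothetical reconstruction, not the poster's actual code:

```python
import torch
from torch.ao.quantization import (
    FakeQuantize,
    MovingAverageMinMaxObserver,
    MovingAveragePerChannelMinMaxObserver,
    QConfig,
)

# Hypothetical 4-bit weight fake-quant: int8 dtype with a narrowed [-8, 7] range.
weight_fq = FakeQuantize.with_args(
    observer=MovingAveragePerChannelMinMaxObserver,
    quant_min=-8,
    quant_max=7,
    dtype=torch.qint8,
    qscheme=torch.per_channel_symmetric,
    ch_axis=0,
)

# Hypothetical 8-bit activation fake-quant over the full [0, 255] range.
act_fq = FakeQuantize.with_args(
    observer=MovingAverageMinMaxObserver,
    quant_min=0,
    quant_max=255,
    dtype=torch.quint8,
    qscheme=torch.per_tensor_affine,
)

qconfig = QConfig(activation=act_fq, weight=weight_fq)
```

One thing worth checking against brevitas: eager-mode convert lowers to int8 kernels, so a 4-bit setting is only emulated through the narrowed quant_min/quant_max, whereas brevitas models low-bit quantization natively; mismatches such as per-channel vs. per-tensor scaling or symmetric vs. affine schemes are also common sources of a gap like this.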