QuantLayer to automatically expose underlying quant metadata from proxies #1052
Comments
Hi, I would like to work on this issue if it is possible :)
All help is welcome! I would recommend checking the old releases to see what the interface for quant metadata used to look like (but not the implementation).
The idea is that if you instantiate an MX Float weight quantizer, you should be able to do: q_linear.quant_weight_exponent_bit_width() Even though QuantLayer won't have any hardcoded Sorry for repeating myself, and please feel free to ask more questions if needed
Alright, I'll start by getting familiar with the codebase and the past releases, and then I'll dive into it. Thank you for your guidance and for providing a head start! I'll reach out with any questions if needed.
Hi @Giuseppe5, I apologize for the delay in completing this issue. As I am new to Brevitas and still learning how to contribute to open-source projects, I am taking some time to thoroughly understand the repo. I have been exploring the differences between the previous and current versions of Brevitas. In the past version, specifically in parameter.py, the quant_weight method returned a QuantTensor (the reconstructed weights after quantization is applied) along with metadata, and there were method definitions to access this metadata directly. In the current version, in parameter_quant.py, the metadata is accessed directly from the Proxy (as you mentioned). I wanted to confirm that I am on the right track with this understanding. Please correct me if I am mistaken in any way. Since there are new QuantTensor classes with different arguments compared to the earlier version, I am still exploring how to implement the solution. I plan to contact you soon with proper questions and a solution before submitting a pull request. I would greatly appreciate your guidance.
Hello! You are on the correct track. The idea in my mind is that the proxy exposes a method that tells the layer which quant metadata is available, and then the layer generates at runtime the methods necessary to directly access that metadata. The main issue could be caused by the bias, since bias quantization might require an external parameter for quantization (external scale).
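To make the idea above concrete, here is a minimal sketch in plain Python. It is not Brevitas code: `FakeWeightProxy`, `FakeQuantLayer`, `exposed_metadata` and `_expose` are hypothetical names invented for illustration, and the returned values are placeholders. It only shows a layer asking a proxy which metadata it offers and attaching matching accessors at construction time.

```python
class FakeWeightProxy:
    """Hypothetical stand-in for a weight quant proxy (not the Brevitas class)."""

    def exposed_metadata(self):
        # Hypothetical hook: names of the metadata getters this proxy offers.
        return ["scale", "bit_width", "exponent_bit_width"]

    # Placeholder values, for illustration only.
    def scale(self):
        return 0.5

    def bit_width(self):
        return 8

    def exponent_bit_width(self):
        return 4


class FakeQuantLayer:
    """Hypothetical layer with no hardcoded metadata accessors."""

    def __init__(self, weight_proxy):
        self.weight_quant = weight_proxy
        self._expose("quant_weight", self.weight_quant)

    def _expose(self, prefix, proxy):
        # Attach one accessor per advertised field, e.g. quant_weight_scale.
        # getattr returns a bound method, so each accessor keeps its own proxy.
        for name in proxy.exposed_metadata():
            setattr(self, f"{prefix}_{name}", getattr(proxy, name))


q_linear = FakeQuantLayer(FakeWeightProxy())
print(q_linear.quant_weight_exponent_bit_width())
```

The bias case mentioned above is not covered here: an accessor that needs an external scale could not be a bare bound method and would need some way to receive that extra input.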
I'd love to help out in the future with this if time permits, once I am well versed with the codebase.
Ah yes, metaprogramming. I will look into some Python docs to familiarize myself with it and best approaches.
Alright 👍
Is your feature request related to a problem? Please describe.
In previous releases (0.10 and before), quant layers would expose certain quantization metadata of the underlying proxies.
This was removed in 0.11 because of the need to implement new QuantTensors with varying quant metadata fields.
All the info is still available, but it is only exposed at the proxy level.
Describe the solution you'd like
Given a set of quant metadata exposed by the proxy, the layers should be able to automagically expose the methods associated with the various proxies.
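One possible way to get this behaviour is to resolve the accessors lazily through `__getattr__`, so the layer follows whatever fields a given proxy advertises. The sketch below is plain Python under stated assumptions, not the Brevitas implementation: `IntProxy`, `LazyQuantLayer`, `exposed_metadata` and the `_PROXIES` mapping are all hypothetical, and the returned values are placeholders.

```python
class IntProxy:
    """Hypothetical integer quant proxy; it advertises no exponent fields."""

    def exposed_metadata(self):
        # Hypothetical hook listing the metadata getters available here.
        return ("scale", "zero_point", "bit_width")

    # Placeholder values, for illustration only.
    def scale(self):
        return 0.1

    def zero_point(self):
        return 0

    def bit_width(self):
        return 8


class LazyQuantLayer:
    """Hypothetical layer that forwards quant_<target>_<field> to a proxy."""

    # Hypothetical mapping: method prefix -> attribute holding the proxy.
    _PROXIES = {"quant_weight": "weight_quant", "quant_bias": "bias_quant"}

    def __init__(self, weight_quant, bias_quant=None):
        self.weight_quant = weight_quant
        self.bias_quant = bias_quant

    def __getattr__(self, name):
        # Python calls __getattr__ only after normal attribute lookup fails.
        for prefix, proxy_attr in self._PROXIES.items():
            if name.startswith(prefix + "_"):
                # Read from __dict__ directly to avoid recursing into __getattr__.
                proxy = self.__dict__.get(proxy_attr)
                field = name[len(prefix) + 1:]
                if proxy is not None and field in proxy.exposed_metadata():
                    return getattr(proxy, field)
        raise AttributeError(f"{type(self).__name__!r} has no attribute {name!r}")


layer = LazyQuantLayer(IntProxy())
print(layer.quant_weight_bit_width())
```

With this approach, a field that the proxy does not advertise (for example an exponent bit width on an integer quantizer) raises `AttributeError` instead of returning a meaningless value. Since `torch.nn.Module` already defines its own `__getattr__`, a real layer would have to defer to the parent implementation as well.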