Getting the correct package version from an auditwheel generated binary #397

Closed
captn3m0 opened this issue Sep 27, 2022 · 6 comments

@captn3m0

I'm trying to replicate the issue posted here: vinayak-mehta/pdftopng#12 (along with many similar ones, for Python security research), and I'm currently stuck at figuring out how to go back from a "patched library inside a wheel" to the "source package and version used at the time of build".

Given #389, the hashes can't be used (and even if this is resolved, older wheels will still have this issue).

Any suggestions on what might work here?

@mayeut
Member

mayeut commented Nov 19, 2022

> the hashes can't be used

I'm not sure why you're saying the hashes in the sonames can't be used.
If someone tries to repair an already repaired wheel, sure, this hash is useless, but I think in the vast majority of cases you get the original hash of the system library that was grafted, so you can build a hash/library map that would work 99% of the time? (I'm not saying that building this map is easy, but it should be doable.)
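
As a rough illustration of the "library name + partial hash" key, here is a minimal sketch that splits a grafted filename back into the original library name and the partial hash. It assumes the grafted name has the shape `<base>-<8 hex chars>.<ext>`; the helper name `parse_grafted_name` and the sample filename are hypothetical, not part of auditwheel.

```python
import re
from typing import NamedTuple, Optional

# Assumption: grafted libraries are named "<base>-<8 hex chars>.<ext>",
# e.g. "libpng16-cfdb1654.so.16.21.0" (a made-up example).
_GRAFTED = re.compile(r"^(?P<base>.+)-(?P<hash>[0-9a-f]{8})\.(?P<ext>.+)$")


class GraftedName(NamedTuple):
    original_name: str  # library name before the hash was inserted
    partial_hash: str   # the 8 hex characters embedded in the filename


def parse_grafted_name(filename: str) -> Optional[GraftedName]:
    """Return (original_name, partial_hash), or None if the name does not match."""
    m = _GRAFTED.match(filename)
    if m is None:
        return None
    return GraftedName(f"{m['base']}.{m['ext']}", m["hash"])
```

The resulting pair can then serve as the lookup key into whatever hash/library map gets built.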

@captn3m0
Author

It's a partial hash in the filename, not very helpful.

@mayeut
Member

mayeut commented Nov 19, 2022

You have the library name and the partial hash, not only the partial hash. Do you have any numbers showing it's unhelpful (i.e. a single library name with different versions having an overlapping partial hash)?

@captn3m0
Author

There is no straightforward path from name+partial-hash that goes back to the package installed at the time of the build. I filed the issue, hoping for any suggestions for alternate pathways.

Even with a partial hash (assuming no conflicts), I first need to create a database of all relevant packages and their hashes before I can do a lookup.
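
For reference, such a database could be sketched roughly as below. This assumes the partial hash is the first 8 hex characters of the SHA-256 of the original, unpatched library file, and that the candidate libraries have already been extracted from distro packages somewhere on disk. The function names and the package label format are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def short_sha256(path, length=8):
    """First `length` hex characters of the file's SHA-256 digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()[:length]


def build_index(candidates):
    """Map (library file name, partial hash) -> set of package labels.

    `candidates` is an iterable of (package_label, library_path) pairs.
    """
    index = defaultdict(set)
    for label, path in candidates:
        path = Path(path)
        index[(path.name, short_sha256(path))].add(label)
    return index


def lookup(index, library_name, partial_hash):
    """Sorted package labels matching the key; more than one entry means a conflict."""
    return sorted(index.get((library_name, partial_hash), ()))
```

A key that maps to more than one label is exactly the conflict case mentioned above, so the same index also shows how often conflicts occur in practice.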

The closest reliable alternative I have so far is to use NT_GNU_BUILD_ID (via readelf) as the lookup key.
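
A minimal sketch of extracting that key, assuming `readelf` from binutils is on the PATH and that the library carries a `.note.gnu.build-id` note at all (not every build does). The helper names are hypothetical; the parser relies only on the `Build ID: <hex>` line that `readelf -n` prints.

```python
import re
import subprocess
from typing import Optional

_BUILD_ID = re.compile(r"Build ID:\s*([0-9a-fA-F]+)")


def parse_build_id(readelf_notes: str) -> Optional[str]:
    """Extract the build ID from the text output of `readelf -n`, if present."""
    m = _BUILD_ID.search(readelf_notes)
    return m.group(1).lower() if m else None


def read_build_id(path: str) -> Optional[str]:
    """Run `readelf -n` on a shared library and return its build ID, or None."""
    out = subprocess.run(
        ["readelf", "-n", path],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return parse_build_id(out)
```

Using the build ID as the key still requires a lookup source that maps build IDs back to packages.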

@mayeut
Member

mayeut commented Nov 21, 2022

Until #398 is addressed, I have no clue how to do this without creating a database (and this might be required anyway for older packages).
As mentioned earlier, I'm not saying it's easy to build this DB, but it should be doable, at least partially. That is especially true for packages built on EOL versions of CentOS (CentOS 5, the base OS for the manylinux1 image, and CentOS 6, the base OS for the manylinux2010 image), where I hope you'd get the latest update available there. It might get trickier on CentOS 7 (the base OS for the manylinux2014 image) or more recent distros, or for non-system libraries being grafted.

@mayeut
Member

mayeut commented Nov 27, 2022

I think the question has been answered & I'll close this.
As mentioned in the previous comment, I have no other ideas to share until #398 is addressed.

@mayeut mayeut closed this as completed Nov 27, 2022