-
-
Notifications
You must be signed in to change notification settings - Fork 31k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-128657: fix _hashopenssl ref/data race #128886
base: main
Are you sure you want to change the base?
Conversation
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
@@ -0,0 +1 @@ | |||
Fix a possible data and PY_EVP_MD refcount race in _hashopenssl.c py_digest_by_name() under free-threading. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, we try to avoid technical details in the changelog. How about something like this?
Fix a possible data and PY_EVP_MD refcount race in _hashopenssl.c py_digest_by_name() under free-threading. | |
Fix a crash when using objects returned by :func:`hashlib.sha256` under :term:`free threading`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done (though not sure how to state this less technically), but just to be clear there is no crash. All this patch does is prevent a possible extra erroneous refcount on PY_EVP_MD
under free threading (and shuts up TSAN).
Modules/_hashopenssl.c
Outdated
@@ -26,6 +26,7 @@ | |||
#include "Python.h" | |||
#include "pycore_hashtable.h" | |||
#include "pycore_strhex.h" // _Py_strhex() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tiny nitpick: let's align this comment with the new one below it.
The implementation LGTM. You can reuse the reproducer in the issue as a test. See the devguide for writing tests. |
Are TSAN tests possible? Because that is the only way the issue shows up, other than as an extra refcount at a specific byte offset of the opaque |
Yeah, tests are run with TSan as part of CI. For example, on this PR: https://github.com/python/cpython/actions/runs/12809851053/job/35715616858?pr=128886 |
Still not sure how you want me to implement this test. |
You don't need to do anything special--just write the test. Python will be built with TSan integration, and races on that test will show up in CI. |
Done. Had to add |
I don't have much time today and tomorrow for in-depth rewview and there is some other stuff to do before reviewing this one, but I'll do it on Sunday or Monday. If I forget, just ping me back (or you can DM me on Discord @ZeroIntensity) |
Lib/test/test_hashlib.py
Outdated
@@ -1196,6 +1196,29 @@ def test_file_digest(self): | |||
with open(os_helper.TESTFN, "wb") as f: | |||
hashlib.file_digest(f, "sha256") | |||
|
|||
@unittest.skipUnless(support.check_sanitizer(thread=True), "only meaningful on free-threading") | |||
def test_gh_128657(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A test that doesn't assert anything always looks suspicious. I think you're hoping for this to trigger TSAN issues and thus this isn't really a pass/fail on its own. Just summarize that kind of thing in a comment within the test for later people who look at this code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not really sure the test should be there. If the criteria is that any fixed tsan issue should get a test then pretty soon the tsan section will be full of tests that don't really do anything except run for longer and longer times?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do need to (technically) assert something--the thread shouldn't raise exceptions, and that's what the test is for if TSan is disabled.
Lib/test/test_hashlib.py
Outdated
@@ -1196,6 +1196,29 @@ def test_file_digest(self): | |||
with open(os_helper.TESTFN, "wb") as f: | |||
hashlib.file_digest(f, "sha256") | |||
|
|||
@unittest.skipUnless(support.check_sanitizer(thread=True), "only meaningful on free-threading") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- IIRC, TSan is run for the default build as well, because things like subinterpreters can have thread safety issues.
- I think it's worth always running this for FT, because we'll need to assert that the thread doesn't raise exceptions (more on that in another comment).
- Should only run if threading actually works.
@unittest.skipUnless(support.check_sanitizer(thread=True), "only meaningful on free-threading") | |
@unittest.skipUnless(support.Py_GIL_DISABLED, "only meaningful on free-threading") | |
@threading_helper.requires_working_threading() |
Lib/test/test_hashlib.py
Outdated
thrds = [] | ||
|
||
for i in range(num_workers): | ||
thrds.append(thrd := threading.Thread(target=closure, args=(barrier,))) | ||
thrd.start() | ||
|
||
for thrd in thrds: | ||
thrd.join() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a utility for starting threads that will simplify this, and we also need to make sure that the thread doesn't raise an exception (threads that fail will not propagate an error to the main thread).
thrds = [] | |
for i in range(num_workers): | |
thrds.append(thrd := threading.Thread(target=closure, args=(barrier,))) | |
thrd.start() | |
for thrd in thrds: | |
thrd.join() | |
with threading_helper.catch_threading_exception() as cm: | |
thrds = [threading.Thread(target=closure, args=(barrier,))] | |
with threading_helper.start_threads(thrds): | |
pass | |
if cm.exc_value: | |
raise cm.exc_value |
Lib/test/test_hashlib.py
Outdated
@@ -1196,6 +1196,29 @@ def test_file_digest(self): | |||
with open(os_helper.TESTFN, "wb") as f: | |||
hashlib.file_digest(f, "sha256") | |||
|
|||
@unittest.skipUnless(support.check_sanitizer(thread=True), "only meaningful on free-threading") | |||
def test_gh_128657(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do need to (technically) assert something--the thread shouldn't raise exceptions, and that's what the test is for if TSan is disabled.
Lib/test/test_hashlib.py
Outdated
test_hashlib_sha256() | ||
|
||
num_workers = 40 | ||
num_runs = 20 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need num_runs
. I haven't used or seen it in any other free-threading PRs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First round of review I forgot to send. I'll have more time starting on Monday to do the in-depth review.
Lib/test/test_hashlib.py
Outdated
@@ -1196,6 +1196,29 @@ def test_file_digest(self): | |||
with open(os_helper.TESTFN, "wb") as f: | |||
hashlib.file_digest(f, "sha256") | |||
|
|||
@unittest.skipUnless(support.check_sanitizer(thread=True), "only meaningful on free-threading") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@unittest.skipUnless(support.check_sanitizer(thread=True), "only meaningful on free-threading") | |
@unittest.skipUnless(support.check_sanitizer(thread=True), | |
"only meaningful on free-threading") |
Lib/test/test_hashlib.py
Outdated
@@ -1196,6 +1196,29 @@ def test_file_digest(self): | |||
with open(os_helper.TESTFN, "wb") as f: | |||
hashlib.file_digest(f, "sha256") | |||
|
|||
@unittest.skipUnless(support.check_sanitizer(thread=True), "only meaningful on free-threading") | |||
def test_gh_128657(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have a better test case name please as well as a small comment (not docstring) to explain what this test is meant to assert (and then the reference to the GH issue)? this does not help to know why this test is here unless we open GH.
Lib/test/test_hashlib.py
Outdated
thrds = [] | ||
|
||
for i in range(num_workers): | ||
thrds.append(thrd := threading.Thread(target=closure, args=(barrier,))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thrds.append(thrd := threading.Thread(target=closure, args=(barrier,))) | |
thread = threading.Thread(target=closure, args=(barrier,)) | |
threads.append(thread) |
Lib/test/test_hashlib.py
Outdated
|
||
for i in range(num_runs): | ||
barrier = threading.Barrier(num_workers) | ||
thrds = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thrds = [] | |
threads = [] |
Lib/test/test_hashlib.py
Outdated
thrd.start() | ||
|
||
for thrd in thrds: | ||
thrd.join() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thrd.start() | |
for thrd in thrds: | |
thrd.join() | |
thread.start() | |
for thread in threads: | |
thread.join() |
I would prefer full names. Or if you want short names, you can use t
for the thread variable (but I would prefer to keep threads
).
Sent up requested changes, but I want to be clear, with or without the tsan fix, this test doesn't actually do anything useful (I have no tsan output or test failure when fix is removed).
|
Fix a possible data and PY_EVP_MD refcount race in _hashopenssl.c in
py_digest_by_name()
under free-threading.