Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race in py_digest_by_name, _hashopenssl.c under free-threading #128657

Open
vfdev-5 opened this issue Jan 9, 2025 · 5 comments
Open

Race in py_digest_by_name, _hashopenssl.c under free-threading #128657

vfdev-5 opened this issue Jan 9, 2025 · 5 comments
Labels
extension-modules C modules in the Modules dir topic-free-threading type-bug An unexpected behavior, bug, or error

Comments

@vfdev-5
Copy link

vfdev-5 commented Jan 9, 2025

Bug report

Bug description:

I built cpython (3.13 branch) with free-threading and TSAN. The following python code from time to time reports TSAN warnings:

import hashlib
import concurrent.futures
import threading


def _hash_string(hash_obj, str_var):
  hash_obj.update(str_var.encode("utf-8").strip())


jaxlib_version_str = "abcdefabcdefabcdefabcdefabcdefabcdefabcdefabcdefabcdef"


compression_algorithm = "zstandard"


def test_hashlib_sha256():
    entries = [
        (
            "jax_lib version",
            lambda hash_obj: hash_obj.update(
                bytes(jaxlib_version_str.encode("utf-8"))
            ),
        ),
        (
            "compression",
            lambda hash_obj: _hash_string(hash_obj, compression_algorithm),
        ),
        ("custom_hook", lambda hash_obj: _hash_string(hash_obj, "")),
    ]
    hash_obj = hashlib.sha256()
    for _ in range(20):
        for name, hashfn in entries:
            hashfn(hash_obj)

    for _ in range(20):
        hash_obj.digest().hex()


if __name__ == "__main__":
    num_workers = 40
    num_runs = 100

    barrier = threading.Barrier(num_workers)

    def closure():
        barrier.wait()
        for _ in range(num_runs):
            test_hashlib_sha256()

    with concurrent.futures.ThreadPoolExecutor(max_workers=num_workers) as executor:
        futures = []
        for i in range(num_workers):
            futures.append(executor.submit(closure))
        assert len(list(f.result() for f in futures)) == num_workers

TSAN report extract:

==================
WARNING: ThreadSanitizer: data race (pid=46290)
  Write of size 8 at 0x7fffb40402c0 by thread T5:
    #0 py_digest_by_name /project/cpython/./Modules/_hashopenssl.c:384:28 (_hashlib.cpython-313t-x86_64-linux-gnu.so+0x7ce6) (BuildId: cc517b229181dc934d8851b36d8993a3b5cedfb1)
    #1 py_evp_fromname /project/cpython/./Modules/_hashopenssl.c:914:14 (_hashlib.cpython-313t-x86_64-linux-gnu.so+0x7872) (BuildId: cc517b229181dc934d8851b36d8993a3b5cedfb1)
    #2 _hashlib_openssl_sha256_impl /project/cpython/./Modules/_hashopenssl.c:1083:12 (_hashlib.cpython-313t-x86_64-linux-gnu.so+0x6f75) (BuildId: cc517b229181dc934d8851b36d8993a3b5cedfb1)
    #3 _hashlib_openssl_sha256 /project/cpython/./Modules/clinic/_hashopenssl.c.h:591:20 (_hashlib.cpython-313t-x86_64-linux-gnu.so+0x6f75)
    #4 cfunction_vectorcall_FASTCALL_KEYWORDS /project/cpython/Objects/methodobject.c:441:24 (python3.13t+0x289f20) (BuildId: 0d639d83db92b5fe0f1dbe7fdffbc7405ce29a98)

Full TSAN report

cc @colesbury

CPython versions tested on:

3.13

Operating systems tested on:

Linux

Linked PRs

@vfdev-5 vfdev-5 added the type-bug An unexpected behavior, bug, or error label Jan 9, 2025
@picnixz picnixz added extension-modules C modules in the Modules dir topic-free-threading labels Jan 10, 2025
@tom-pytel
Copy link
Contributor

It appears to be a (somewhat) harmless simultaneous initialization of digest routine by multiple threads. Worst thing that can happen as far as I can see is extra refcount on routine which might prevent it from being freed which is probably not a big issue since it is the same routine for all uses. In any case the fix is simple, it will shut up TSAN and keep refcounts same as in GIL py:

$ git diff main
diff --git a/Modules/_hashopenssl.c b/Modules/_hashopenssl.c
index 082929be3c..d05d7b109b 100644
--- a/Modules/_hashopenssl.c
+++ b/Modules/_hashopenssl.c
@@ -26,6 +26,7 @@
 #include "Python.h"
 #include "pycore_hashtable.h"
 #include "pycore_strhex.h"        // _Py_strhex()
+#include "pycore_pyatomic_ft_wrappers.h" // FT_ATOMIC_LOAD_PTR_RELAXED
 #include "hashlib.h"
 
 /* EVP is the preferred interface to hashing in OpenSSL */
@@ -379,16 +380,28 @@ py_digest_by_name(PyObject *module, const char *name, enum Py_hash_type py_ht)
         case Py_ht_evp:
         case Py_ht_mac:
         case Py_ht_pbkdf2:
-            if (entry->evp == NULL) {
-                entry->evp = PY_EVP_MD_fetch(entry->ossl_name, NULL);
+            digest = FT_ATOMIC_LOAD_PTR_RELAXED(entry->evp);
+            if (digest == NULL) {
+                digest = PY_EVP_MD_fetch(entry->ossl_name, NULL);
+                // exchange and free just in case another thread did same thing at same time
+                PY_EVP_MD *other_digest = _Py_atomic_exchange_ptr(&entry->evp, digest);
+                if (other_digest != NULL) {
+                    assert(other_digest == digest);
+                    PY_EVP_MD_free(other_digest);
+                }
             }
-            digest = entry->evp;
             break;
         case Py_ht_evp_nosecurity:
-            if (entry->evp_nosecurity == NULL) {
-                entry->evp_nosecurity = PY_EVP_MD_fetch(entry->ossl_name, "-fips");
+            digest = FT_ATOMIC_LOAD_PTR_RELAXED(entry->evp_nosecurity);
+            if (digest == NULL) {
+                digest = PY_EVP_MD_fetch(entry->ossl_name, "-fips");
+                // exchange and free just in case another thread did same thing at same time
+                PY_EVP_MD *other_digest = _Py_atomic_exchange_ptr(&entry->evp_nosecurity, digest);
+                if (other_digest != NULL) {
+                    assert(other_digest == digest);
+                    PY_EVP_MD_free(other_digest);
+                }
             }
-            digest = entry->evp_nosecurity;
             break;
         }
         if (digest != NULL) {

I used this as reference for digest refcount https://docs.openssl.org/3.1/man3/EVP_DigestInit/#synopsis, and here is a simpler reproducer:

import hashlib
import threading


if __name__ == "__main__":
    num_workers = 40

    barrier = threading.Barrier(num_workers)

    def closure():
        barrier.wait()
        hashlib.sha256()

    for i in range(num_workers):
        threading.Thread(target=closure).start()

@vstinner
Copy link
Member

cc @picnixz @ZeroIntensity

@ZeroIntensity
Copy link
Member

The approach looks reasonable to me. We might need to be careful with ossl_name too, because OpenSSL is not thread safe. Bénédikt will know more about what algorithms need fixing than I do.

@vstinner
Copy link
Member

+                PY_EVP_MD *other_digest = _Py_atomic_exchange_ptr(&entry->evp, digest);
+                if (other_digest != NULL) {
+                    assert(other_digest == digest);
+                    PY_EVP_MD_free(other_digest);
+                }

Is it possible that other_digest is being used by another thread?

@tom-pytel
Copy link
Contributor

tom-pytel commented Jan 13, 2025

Is it possible that other_digest is being used by another thread?

The documentation states that the individual algorithm implementations are refcounted which implies the same object is returned for the same name, which is what I saw in execution and is also implied by the original code setting once and reuse of entry->evp. The assert(other_digest == digest) is there as a safety but if all this holds up (which it appears to in testing) then the PY_EVP_MD_free(other_digest) simply decrements a refcount on the same object pointed to by digest which is redundantly added by multiple calls to PY_EVP_MD_fetch() in different threads. Should wind up in the end with just one refcount from a single call to PY_EVP_MD_fetch(), the same as if it were protected by the GIL.

EDIT: The above does need one little tweak to avoid a very unlikely refcount escape, but basically this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
extension-modules C modules in the Modules dir topic-free-threading type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

5 participants