[BUG] Decreasing recall when increasing threads in ANN benchmark #208
Comments
I could reproduce this using
With
On the same hardware as above, with cuVS 24.10 head, running the cpp benchmark directly, I get the following output.
Recall drops at 16 threads, but even recall change from
This one is subtle. After some experimentation I believe the problem is specifically with the
The problem also seems to be in the
Memory allocations and access patterns seem to be OK as well (although the hashmap seems to be allocated by
With all that in mind, I believe the only difference between the benchmark cases is the relative order of execution among the CTAs working on the same query. You see, the only way CTAs communicate with each other in
(that is, the same dataset and search settings as above, but with stable lower recall)
What can we do about this?
I also tried to modify the random generation logic in a few different ways and to change the
Describe the bug
The `throughput` mode in the ANN benchmark is supposed to increase the QPS for small-batch queries without affecting the recall level. However, I found that increasing the number of threads in `throughput` mode decreases the achieved recall. The decrease is not huge, but it is noticeable.
Steps/Code to reproduce bug
I also mounted volumes for the input data and the config file when doing `docker run`.
In `throughput` mode, ANN bench shmoos the number of search threads by default. The default is powers of two between min=1 and max=<num hyperthreads>.
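To make the default sweep concrete, here is a minimal sketch of the thread counts it implies. The function name `thread_sweep` is hypothetical and not part of ANN bench, and using `os.cpu_count()` as a stand-in for the number of hyperthreads is an assumption. The sketch lists only powers of two up to the maximum; the real harness may treat a non-power-of-two maximum differently.

```python
import os


def thread_sweep(min_threads=1, max_threads=None):
    """Return the powers of two in [min_threads, max_threads].

    Hypothetical helper: max_threads defaults to os.cpu_count(), used here
    as an assumed stand-in for the number of hyperthreads.
    """
    if max_threads is None:
        max_threads = os.cpu_count() or 1
    counts = []
    n = 1
    while n <= max_threads:
        if n >= min_threads:
            counts.append(n)
        n *= 2
    return counts


if __name__ == "__main__":
    # Thread counts the sweep would visit on the current machine.
    print(thread_sweep())
```

On a machine with 32 hyperthreads, this sketch yields 1, 2, 4, 8, 16, and 32 threads, so the 16-thread case mentioned in the comments is one of the default sweep points.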
I am using the `wiki-1M` dataset with 768 dimensions. Here is my configuration file for CAGRA:
I got the following results, which show the decreasing recall. I also tried adding `--benchmark_min_time=10000x` to ensure each thread runs 10k iterations (the total number of queries), but it didn't fix the issue.
Expected behavior
The recall level should not decrease.
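The expectation can be phrased as a simple check. The sketch below is illustrative only: `recall_at_k` and `recall_is_stable` are hypothetical helpers, not ANN bench functions, and the `tolerance` value is an assumption chosen for the example. It treats recall as the fraction of ground-truth neighbors that appear in the returned results, and flags any thread count whose recall falls below the lowest-thread-count run by more than the tolerance.

```python
def recall_at_k(found, truth):
    """Fraction of ground-truth neighbor ids present in the search results.

    found and truth are lists of per-query neighbor id lists.
    """
    hits = 0
    total = 0
    for found_ids, true_ids in zip(found, truth):
        hits += len(set(found_ids) & set(true_ids))
        total += len(true_ids)
    return hits / total


def recall_is_stable(recall_by_threads, tolerance=0.001):
    """True if no thread count loses more than `tolerance` recall.

    recall_by_threads maps thread count -> measured recall. The baseline is
    the run with the fewest threads. The tolerance is an assumed value.
    """
    baseline = recall_by_threads[min(recall_by_threads)]
    return all(baseline - r <= tolerance for r in recall_by_threads.values())
```

Feeding the per-thread-count recall values from a benchmark run into `recall_is_stable` would return `False` for the behavior reported here, where recall drops as the thread count grows.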
Environment details (please complete the following information):
`docker pull` & `docker run` commands used: provided above