Add Arrow bloom filter policy #625

Merged
merged 36 commits into from
Oct 30, 2024
Changes from 29 commits
Commits
90cc686
WIP. Add Arrow BF policy
mhaseeb123 Oct 25, 2024
8c614d0
Add test and benchmark
mhaseeb123 Oct 25, 2024
f4d67d8
Minor improvements to example
mhaseeb123 Oct 25, 2024
a5e625f
Update the example to insert and evaluate both policies in one
mhaseeb123 Oct 25, 2024
7bbb36a
[pre-commit.ci] auto code formatting
pre-commit-ci[bot] Oct 25, 2024
d31d147
Convert templated lambda function into normal one
mhaseeb123 Oct 25, 2024
2ff15a5
Merge branch 'fea/impl-arrow-bf-policy' of https://github.com/mhaseeb…
mhaseeb123 Oct 25, 2024
863a508
[pre-commit.ci] auto code formatting
pre-commit-ci[bot] Oct 25, 2024
8eb5b3c
Update bloom filter policies structure
mhaseeb123 Oct 26, 2024
708aac3
[pre-commit.ci] auto code formatting
pre-commit-ci[bot] Oct 26, 2024
0872cfb
doc updates
mhaseeb123 Oct 26, 2024
a0c7d32
Minor updates
mhaseeb123 Oct 26, 2024
a98225c
Merge branch 'fea/impl-arrow-bf-policy' of https://github.com/mhaseeb…
mhaseeb123 Oct 26, 2024
05b6967
Minor
mhaseeb123 Oct 26, 2024
2f42861
Merge branch 'dev' into fea/impl-arrow-bf-policy
mhaseeb123 Oct 26, 2024
e389a7b
Minor updates
mhaseeb123 Oct 26, 2024
8c536bd
Minor
mhaseeb123 Oct 26, 2024
f61e79c
[pre-commit.ci] auto code formatting
pre-commit-ci[bot] Oct 26, 2024
e587a21
Update example link
mhaseeb123 Oct 26, 2024
a4f7013
Merge branch 'fea/impl-arrow-bf-policy' of https://github.com/mhaseeb…
mhaseeb123 Oct 26, 2024
8d54b16
Apply suggestions from code review
mhaseeb123 Oct 28, 2024
44ffc9d
Apply suggestion from code review
mhaseeb123 Oct 28, 2024
db6b8f0
Doxygen fix. Benchmarks
mhaseeb123 Oct 28, 2024
59f9791
Separate the two examples.
mhaseeb123 Oct 28, 2024
26ef2a8
Update README with both example links
mhaseeb123 Oct 28, 2024
40ac01f
Apply suggestions from code review
mhaseeb123 Oct 28, 2024
efceee3
[pre-commit.ci] auto code formatting
pre-commit-ci[bot] Oct 28, 2024
701886c
Address review comments
mhaseeb123 Oct 28, 2024
d7f91a3
[pre-commit.ci] auto code formatting
pre-commit-ci[bot] Oct 28, 2024
7955cb6
Update README.md
PointKernel Oct 28, 2024
a489d2b
doxygen updates
mhaseeb123 Oct 28, 2024
ee48c4d
Merge branch 'fea/impl-arrow-bf-policy' of https://github.com/mhaseeb…
mhaseeb123 Oct 28, 2024
a927b1a
Remove Arrow policy example and add a @code docstring to demo the policy
mhaseeb123 Oct 29, 2024
6802d79
Merge branch 'dev' into fea/impl-arrow-bf-policy
mhaseeb123 Oct 29, 2024
fd9861f
docstring updates
mhaseeb123 Oct 30, 2024
6f7bf92
Merge branch 'fea/impl-arrow-bf-policy' of https://github.com/mhaseeb…
mhaseeb123 Oct 30, 2024
5 changes: 4 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
@@ -254,4 +254,7 @@ We plan to add many GPU-accelerated, concurrent data structures to `cuCollection
`cuco::bloom_filter` implements a Blocked Bloom Filter for approximate set membership queries.

#### Examples:
- [Host-bulk APIs](https://github.com/NVIDIA/cuCollections/blob/dev/examples/bloom_filter/host_bulk_example.cu) (see [live example in godbolt](https://godbolt.org/clientstate/eJydVm1v20YM_iuE9mF2I78FCwo4L4CXpJuxwsnstEUxD8HpRFuHynfavdgxgvz38U6SLSduMSwBEptHPnz4kEfpOTJojFDSRMO_niORRsNBHOVMLh1bYjSMuEtZFEdGOc399967uYR3cK2KrRbLzEKLt-G0f_pLDJPP45vxCK7vpvd309HD-G7S9b7B_6PgKA2m4GSKGmyGMCoYp3_VSQyfUXsicNrtQ8s7zKPqbB61zwPKVjlYsS1IZcEZJBhhYCFyBHziWFgQErhaFblgkiNshM1Cqgon0IGvFYhKLCN_RhEFfVs0PYHZHXX_k1lbDHu9zWbTZYF2V-llLy-dTe_j-Pp2MrvtEPVd2CeZk7Kg8R8nNBWebIEVxIyzhPjmbANKA1tqpDOrPPONFlbIZQxGLeyGaQw4qTBWi8TZA_FqnlR_04HkY5KEG81gPJtH8OtoNp7FAefL-OH3u08P8GU0nY4mD-PbGdxNqVmTm7FvFX37AKPJV_hjPLmJAUk6SoVPhfZVEFXhZcW01HCGeEBjoUpapkAuFoJDPUGwVGvUksqCAvVKlLNGJNOAk4uVsMwG25viQqreXM7lT0Ly3KUIF9xx1UtypVaP1HeLustddnXoYzPtjO1x5aTt-sM3RymuKcXjGrlV-rgLPiF3nthjoahp2-NehrqLNGrd1xyEoqYgWwWzkJYmTsjWWom0PZfPVBj0evAbStTMIgz6P_f7ffiG2yANDYNBbYMWC6GNhbNwTjiqMobSPYzH5iSf9a0C6VaPAeWywjw_7mMLGurLvfs76HfPvucrD307FcK5Ly3UMSvYxl-kklZ57U6Jr3FJp7SZwNU3bzhsdu-C0l1Vcc8U8lKBlvIOhweNKp09h1ZNJuyFnXfdjZY_6ia4JMnbcYjookz950G7ysCc17J4DF5UXjPkvOlAgaH-ne_Jvv7aS-5gyoCDoxpgT2Ov3Ph1ozOWL-pdFAK8Y9VulqatmkVcZWr_ULBEqfzKe9IldrltlcRjWLDc4KF2RwNlM_Cg-Q2MqpQ_HeptYzp3O4E2yFooZ_JtNde0p3alUeCDX-ImUy5PoUwHYS1b7bBTKENLcY0Q7gkJ83A_vRw0VaFh9XvcvJEm3hdeN_a_0A1l2oxZoA0cnjR-BaIMrF5dPj_7wj9pwkxXJyYEFlolLKfVTMswZZYB7QPHrSOsuAFToeBTJhJhjb9EXtdXdX-4n171IcWCyvKrVJVMqBUJMSdVQqyoa2tU5bfJoWvGTIbGPz5Tv5J9vcfllDs5ZSWn_I6ci1yRXF5tT_bSmypjqx6vsIxbh8N2pEFNU3Vj_Ri029CrAMvxK2e3zLv4H3lfV9I0_SCvrCs2Ng3YFi4u6HHrh5KetvS5FiGYfd8q-2JvD7GUJK-wNNJUSPCr-oXetPz7C73R6P0LWSTXnA9Oz9yAjlVhy7e1qENAl_zkZPAeOkzz7NKsHt_3odOhzW3pj6UcmHZytkrCK1wukgYm5zwn47p86SID1Su_RS9xfU67-uCctIte_g6__wINDYAL))
- [Host-bulk APIs (Default fingerprinting policy)](https://github.com/NVIDIA/cuCollections/blob/dev/examples/bloom_filter/host_bulk_example.cu) (see [live example in godbolt](https://godbolt.org/clientstate/eJydVmtvGjkU_StXsx8WmuEVbVUJQiSapLtoK5IF2qpaVsjj8TBWBnvqBwRF-e977ZmBgZBqtVRqwL6Pc889vvZzoJnWXAod9P9-Dngc9HthkBGxsmTFgn5AbUyCMNDSKup-d94tBLyDG5nvFF-lBhq0CZfdy99CmHwd345HcHM_fbifjubj-0nb2Xr7z5wyoVkMVsRMgUkZjHJC8U-5E8JXphwQuGx3oeEMFkG5twiaAx9lJy2syQ6ENGA1wzBcQ8IzBuyJstwAF0DlOs84EZTBlpvUpyrjeDjwvQwiI0PQnqBHjr-SuiUQs4fuPqkxeb_T2W63beJht6VadbLCWHc-j2_uJrO7FkLfu30RGTILiv2wXGHh0Q5IjsgoiRBvRrYgFZCVYrhnpEO-VdxwsQpBy8RsiWI-Tsy1UTyy5oi8CifWXzdA-ohA4kYzGM8WAXwczcaz0Mf5Np7_cf9lDt9G0-loMh_fzeB-is2a3I5dq_DXJxhNvsOf48ltCAypw1TsKVeuCoTKHa0sLjicMXYEI5EFLJ0zyhNOoVIQrOSGKYFlQc7UmhdaQ5Cxj5PxNTfE-LVXxflUnYVYiF-4oJmNGVxRS2UnyqRcL7Hvhqk2ten1sY1JldWmQ6UVpu02X23FbIMplhtGjVTnTdgTo9YBW-YSm7Y7b6Wxuwyl1j7FwCU2hZG1X-bCoOK4aGwkj5sL8YyFgVukWLdxHIOw6-Uj22kntiH0ur92u90B7D-dTucKfmeCKWJYuQ3O_nwkkxduw0Pcd9Btvx-UkcbIrjKe64QrbSAlWeLjuWCy3PD0vpFAvErQKjMPXntoG5Xd0uhx6SvzOPCr22xVmygR8M0tszvynKmDBmaXF0I7tgCcBE5eaLL0JkOXfbB3neVk60553auYDDFLiM0MFA12mjwF5Kt3kuv365q7qnJdl_GeaxW-lKm1ift91KCBqys8kR9t9ojAPO-e47fxJFgOU7lyJK6Knru5WMDsY6yFgNoHw_tcTMRZmbtQZ79_pPMaatewRtU5P1v3LpWiG26rHbEVyrYZeo82ZnDfe80yDbFOK_nSWyHvdZdB3QAdvVj2thd1sRRWYh-mcDjaqgIcYBwafF7M5Tz3Ds6wlDOJ40aFIiwzNX_KWiRldu0scRBidxoF8BASkml2zN1ZR1F3PDoptRhlKX9Zpna107efqziFN1xane1KDeGs35eGjnN3EepU2iyGIh34q80oy1q51HixbBj44YHEzB-mw16dFTyq7i7Ur6gJD4VXjf0vcH2ZJiUG8Bbzt7W7RpjwqE6Gizui3N3W9QOhvWOuZEQyvN7wQomJIahzZamxGCushSmjsKeUR9yga8HrSd2fHqbXXTxjOZbl5oUskGArIkSOrHhfXtVWq8oNhmPTlOiUafcEif3MwnrP0yn2dIqSTvEGnUkmkS7HtgM7LI64X2xU8vIXWuNYbGcaVF8qT6yTQbMJnTJgIb9Cu0Xe5H_kPa2kvvSTvKJ5fkA6UeKLBb9XJPhl17dyPTmsnw48xVAVArr48wVfq-4NiK9CdXjUBmJDae_yve3htsxN8eINWhhoSC8ueh-gRRRNh3q9_NCFVgvvLYP_GczB4lZG1pF_Bmc8qsWklGa4uCkerriA9YrH4CWs9vHmONpH7oKXf_y_fwHeCexw))

#### Examples:
- [Host-bulk APIs (Arrow fingerprinting policy)](https://github.com/NVIDIA/cuCollections/blob/dev/examples/bloom_filter/arrow_policy_example.cu) (see [live example in godbolt](https://godbolt.org/clientstate/eJydV2tzGjcU_Ssa8qGQAAtOnIzxY0IebZlm7NR2k8mUDtFqtaDJIm21WmOayX_vuVotLJhkOsUfDKv7PPfce7VfW4UsCmV00Rr9-bWlktZo2G1lXM9LPpetUUuUCW91W4UpraDf0eOpZo_Za5OvrZovHGuLDjsaHD3rsssPkzeTMXt9df3-6np8O7m67JOsl3-nhNSFTFipE2mZW0g2zrnAv3DSZR-kpUDYUX_A2iQwbYWzaatz6q2sTcmWfM20cawsJMyogqUqk0zeC5k7pjQTZplnimsh2Uq5hXcV7Phw2KdgxMSOQ55DI8evtCnJuNuETp-Fc_koilarVZ_7sPvGzqOsEi6id5PXby9v3vYQ-kbtD50BWWbl36WySDxeM54jMsFjxJvxFTOW8bmVOHOGIl9Z5ZSed1lhUrfiVno7iSqcVXHpdsCr40T-TQHAxzWAG9-wyc20xV6NbyY3XW_n4-T216s_btnH8fX1-PJ28vaGXV2jWJdvJlQq_PqZjS8_sd8ml2-6TAI6uJL3uaUsEKoiWGVSYXgj5U4YqanCKnIpVKoEqxnE5uZOWo20WC7tUlVcQ5CJt5OppXLc-WcPkvOuoqme6kdKi6xMJDsTpTBRnBmznKHuTtq-KBcXuzJuYcvCRcKU2vXp8MFRIu_gYnYnhTP2sIi8l6KkwGa5QdHWh6UKVFeCav39GJRBUSRf-sfR44oSLz1VubVmFazO5D0HrhJZVBKxVTJlb-QSiDjLnQQuBeEIfqKyXrdHFAdoxCMPBaug2FBvAsEkURQ9UYsgTWTKy8yxyi3jagm2cMfAZQUwUJ5QPiJRirqgtb0tVA3FXfp-InPeFqHXZZ9F-dpkGUCk-n2GXooWpjA_VzlWYYVUP3tzbkFOs8ysKDESR4CSFypbMwHAHPDZSarSWIInovDeY7ngd8r4bOuu3VFAqkojDXRUGDFjimbkFaiRC3TyHPwu4z6QjKqGjnzIRK04iuUwEUl88jwePB08f3oiBvHx85PjY_FMPD9KTxJ6nj578TQSeR4VVkQ5t-CB2yOmePTuaHjSe3f0dLAdhC8xuyS7pcEVak89TP2TbIpOdctLmxvMFoCerfts4kiM5p7STqJT_NAoFmblAViCbrvlcmyFGQeZpCJA7Yxnc4M5s1huuwsWMRCVbt8ZlXSm-itOyA0GI-KhEcB0uZx9keuCZuE5Gw5-GgwGp2zziaLojP0itSTGhmNG8octubxSO9_afcwG_ePTYGmC5rfOh50qWzi24Fnq7ZExEw48yN9xoB846AXPpw81ijIONSugceQz83HgKx326kOAu9dwZCyKKDTm1nk1B_daEjSn9oLIzIuck_fTjepHWlNZFtZZ4GpN5dCt-1bxT8-lzS3lMa9gB2e2zsJwCf5oZo5GB1ryrA7q4jRE82qn80jd79ADQW2dBZOVs6lHvvLY7IaNq244lPegsTsr1D9y5i7oacJHI4xVyZNZgZUsZ_WcaWRzsQXuJucrvT8sfLQ-082k08l-HT1pmlFX3782ePAt-ClcMhphkTh2doa1-qrMvqB8np2eiQ-d11AdKlCIaQRLU80aHxj3ntDYWfBcLZjRaGdVbQvmu6Fds9tfjzYq9VJq01E_lnO0dqfrNfrwQN-HneCGl9RP-cxLgStNldOmABR9Q21knzQbqpLSGzOVws5RbWAbxraWhxs-DHevsC1ZH6utXUfRDZ46P0QtNia7IEncZbAE21XgXZbyrJC72B1U1E3FnWnSsBFS-b2Udt2YUJurES5SWFtlgUVXMQgj
fJMaFP1KwEAvs4RV7pi_nTpbyh5WAdb5nWR-wAKY2_fX58MmKhhndJ0tHkDT3SZeF_a_hOvTrLa1lX7x0E0QLUtR7Q1g6kZFF4RmOxReMbcm5hluqLgTJtxx8NyWwpWw1W2YCVbk_ULFykG1wnUv75_fX18MsCdzpEWTJ9xHUIoYkQMVr6vq3BpZ0RTYFV3wYuEvVygDzXXkexhOvYFTBzj1d-BMMwO4CG0KNoxC_7Bd08vfSdu7ZDtQoOaj0LFEg06HRcFgRb-Ku5Xf9H_43c-k-egHfnXn8HgkUuKlA99rEPxjqlt4nm6f7w88K8EKzQb4-W2q8cpJt1zcl-32zbSl74QYHh2XQxyb3FWvra0eTJ2LJ0-GL1iPW7E4L5azFwPW62G7u55fNLgv9TK-jP27bKbihk0hRIaHd9XbJx4gY_2l9a1bn2Nd7ZwDvda3v_zfv3UOROY=))
60 changes: 57 additions & 3 deletions benchmarks/bloom_filter/add_bench.cu
@@ -41,9 +41,9 @@ template <typename Key, typename Hash, typename Word, nvbench::int32_t WordsPerB
void bloom_filter_add(nvbench::state& state,
nvbench::type_list<Key, Hash, Word, nvbench::enum_type<WordsPerBlock>, Dist>)
{
using policy_type = cuco::bloom_filter_policy<rebind_hasher_t<Hash, Key>,
Word,
static_cast<std::uint32_t>(WordsPerBlock)>;
using policy_type = cuco::default_filter_policy<rebind_hasher_t<Hash, Key>,
Word,
static_cast<std::uint32_t>(WordsPerBlock)>;
using filter_type =
cuco::bloom_filter<Key, cuco::extent<size_t>, cuda::thread_scope_device, policy_type>;

@@ -83,6 +83,51 @@ void bloom_filter_add(nvbench::state& state,
});
}

/**
* @brief A benchmark evaluating `cuco::bloom_filter::add_async` performance with
* `arrow_filter_policy`
*/
template <typename Key, typename Dist>
void arrow_bloom_filter_add(nvbench::state& state, nvbench::type_list<Key, Dist>)
{
using policy_type = cuco::arrow_filter_policy<Key>;
using filter_type =
cuco::bloom_filter<Key, cuco::extent<size_t>, cuda::thread_scope_device, policy_type>;

auto const num_keys = state.get_int64("NumInputs");
auto const filter_size_mb = state.get_int64("FilterSizeMB");

std::size_t const num_sub_filters =
(filter_size_mb * 1024 * 1024) /
(sizeof(typename filter_type::word_type) * filter_type::words_per_block);

if (num_sub_filters > policy_type::max_filter_blocks) {
// Skip invalid configurations; return so the benchmark body does not run
state.skip("bloom filter with arrow policy should have <= 4194304 blocks");
return;
}

thrust::device_vector<Key> keys(num_keys);

key_generator gen;
gen.generate(dist_from_state<Dist>(state), keys.begin(), keys.end());

state.add_element_count(num_keys);

filter_type filter{num_sub_filters};

state.collect_dram_throughput();
state.collect_l1_hit_rates();
state.collect_l2_hit_rates();
state.collect_loads_efficiency();
state.collect_stores_efficiency();

add_fpr_summary(state, filter);

state.exec([&](nvbench::launch& launch) {
filter.add_async(keys.begin(), keys.end(), {launch.get_stream()});
});
}

NVBENCH_BENCH_TYPES(bloom_filter_add,
NVBENCH_TYPE_AXES(nvbench::type_list<defaults::BF_KEY>,
nvbench::type_list<defaults::BF_HASH>,
@@ -118,3 +163,12 @@ NVBENCH_BENCH_TYPES(bloom_filter_add,
.set_max_noise(defaults::MAX_NOISE)
.add_int64_axis("NumInputs", {defaults::BF_N})
.add_int64_axis("FilterSizeMB", {defaults::BF_SIZE_MB});

NVBENCH_BENCH_TYPES(arrow_bloom_filter_add,
NVBENCH_TYPE_AXES(nvbench::type_list<defaults::BF_KEY>,
nvbench::type_list<distribution::unique>))
.set_name("arrow_bloom_filter_add_unique_size")
.set_type_axes_names({"Key", "Distribution"})
.set_max_noise(defaults::MAX_NOISE)
.add_int64_axis("NumInputs", {defaults::BF_N})
.add_int64_axis("FilterSizeMB", defaults::BF_SIZE_MB_RANGE_CACHE);
67 changes: 63 additions & 4 deletions benchmarks/bloom_filter/contains_bench.cu
@@ -43,9 +43,9 @@ void bloom_filter_contains(
{
// cudaDeviceSetLimit(cudaLimitMaxL2FetchGranularity, 32); // slightly improves performance if
// filter block fits into a 32B sector
using policy_type = cuco::bloom_filter_policy<rebind_hasher_t<Hash, Key>,
Word,
static_cast<std::uint32_t>(WordsPerBlock)>;
using policy_type = cuco::default_filter_policy<rebind_hasher_t<Hash, Key>,
Word,
static_cast<std::uint32_t>(WordsPerBlock)>;
using filter_type =
cuco::bloom_filter<Key, cuco::extent<size_t>, cuda::thread_scope_device, policy_type>;

@@ -88,6 +88,56 @@ void bloom_filter_contains(
});
}

/**
* @brief A benchmark evaluating `cuco::bloom_filter::contains_async` performance with
* `arrow_filter_policy`
*/
template <typename Key, typename Dist>
void arrow_bloom_filter_contains(nvbench::state& state, nvbench::type_list<Key, Dist>)
{
// cudaDeviceSetLimit(cudaLimitMaxL2FetchGranularity, 32); // slightly improves performance if
// filter block fits into a 32B sector
using policy_type = cuco::arrow_filter_policy<Key>;
using filter_type =
cuco::bloom_filter<Key, cuco::extent<size_t>, cuda::thread_scope_device, policy_type>;

auto const num_keys = state.get_int64("NumInputs");
auto const filter_size_mb = state.get_int64("FilterSizeMB");

std::size_t const num_sub_filters =
(filter_size_mb * 1024 * 1024) /
(sizeof(typename filter_type::word_type) * filter_type::words_per_block);

if (num_sub_filters > policy_type::max_filter_blocks) {
// Skip invalid configurations; return so the benchmark body does not run
state.skip("bloom filter with arrow policy should have <= 4194304 blocks");
return;
}

thrust::device_vector<Key> keys(num_keys);
thrust::device_vector<bool> result(num_keys, false);

key_generator gen;
gen.generate(dist_from_state<Dist>(state), keys.begin(), keys.end());

state.add_element_count(num_keys);

filter_type filter{num_sub_filters};

state.collect_dram_throughput();
state.collect_l1_hit_rates();
state.collect_l2_hit_rates();
state.collect_loads_efficiency();
state.collect_stores_efficiency();

add_fpr_summary(state, filter);

filter.add(keys.begin(), keys.end());

state.exec([&](nvbench::launch& launch) {
filter.contains_async(keys.begin(), keys.end(), result.begin(), {launch.get_stream()});
});
}

NVBENCH_BENCH_TYPES(bloom_filter_contains,
NVBENCH_TYPE_AXES(nvbench::type_list<defaults::BF_KEY>,
nvbench::type_list<defaults::BF_HASH>,
@@ -122,4 +172,13 @@ NVBENCH_BENCH_TYPES(bloom_filter_contains,
.set_type_axes_names({"Key", "Hash", "Word", "WordsPerBlock", "Distribution"})
.set_max_noise(defaults::MAX_NOISE)
.add_int64_axis("NumInputs", {defaults::BF_N})
.add_int64_axis("FilterSizeMB", {defaults::BF_SIZE_MB});
.add_int64_axis("FilterSizeMB", {defaults::BF_SIZE_MB});

NVBENCH_BENCH_TYPES(arrow_bloom_filter_contains,
NVBENCH_TYPE_AXES(nvbench::type_list<defaults::BF_KEY>,
nvbench::type_list<distribution::unique>))
.set_name("arrow_bloom_filter_contains_unique_size")
.set_type_axes_names({"Key", "Distribution"})
.set_max_noise(defaults::MAX_NOISE)
.add_int64_axis("NumInputs", {defaults::BF_N})
.add_int64_axis("FilterSizeMB", defaults::BF_SIZE_MB_RANGE_CACHE);
1 change: 1 addition & 0 deletions examples/CMakeLists.txt
@@ -47,3 +47,4 @@ ConfigureExample(STATIC_MULTIMAP_HOST_BULK_EXAMPLE "${CMAKE_CURRENT_SOURCE_DIR}/
ConfigureExample(HYPERLOGLOG_HOST_BULK_EXAMPLE "${CMAKE_CURRENT_SOURCE_DIR}/hyperloglog/host_bulk_example.cu")
ConfigureExample(HYPERLOGLOG_DEVICE_REF_EXAMPLE "${CMAKE_CURRENT_SOURCE_DIR}/hyperloglog/device_ref_example.cu")
ConfigureExample(BLOOM_FILTER_HOST_BULK_EXAMPLE "${CMAKE_CURRENT_SOURCE_DIR}/bloom_filter/host_bulk_example.cu")
ConfigureExample(BLOOM_FILTER_ARROW_POLICY_EXAMPLE "${CMAKE_CURRENT_SOURCE_DIR}/bloom_filter/arrow_policy_example.cu")
94 changes: 94 additions & 0 deletions examples/bloom_filter/arrow_policy_example.cu
@@ -0,0 +1,94 @@
/*
* Copyright (c) 2024, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

#include <cuco/bloom_filter.cuh>

#include <thrust/count.h>
#include <thrust/device_vector.h>
#include <thrust/execution_policy.h>
#include <thrust/sequence.h>

#include <iostream>

/**
* @file arrow_policy_example.cu
* @brief Demonstrates usage of an arrow-compatible bloom filter
*
* In addition to the default policy aimed at achieving speed-of-light
* performance on the device, `cuCollections` offers an `arrow_filter_policy`
* that allows users to easily create a bloom filter that mimics the behavior
* of the bloom filter defined in Apache Arrow:
* https://github.com/apache/arrow/blob/be1dcdb96b030639c0b56955c4c62f9d6b03f473/cpp/src/parquet/bloom_filter.cc#L219-L230.
*
* @note This example is for demonstration purposes only. It is not intended to show the most
* performant way to do the example algorithm.
*/

int main(void)
{
int constexpr num_keys = 10'000; ///< Generate 10'000 keys
int constexpr num_tp = num_keys * 0.5; ///< Insert the first half keys into the filter.
int constexpr num_tn = num_keys - num_tp;
int constexpr sub_filters = 200; ///< 200 sub-filters per bloom filter

// key type for bloom filter
using key_type = int;

// We will use the Arrow filter policy for bloom filter fingerprint generation
using policy_type = cuco::arrow_filter_policy<key_type>;
// Bloom filter type with Arrow filter policy
using filter_type =
cuco::bloom_filter<key_type, cuco::extent<size_t>, cuda::thread_scope_device, policy_type>;

// Spawn a bloom filter with arrow policy and 200 sub-filters.
filter_type filter{sub_filters};

std::cout << "Bulk insert into bloom filter with Arrow fingerprint generation policy: "
<< std::endl;

thrust::device_vector<key_type> keys(num_keys);
thrust::sequence(keys.begin(), keys.end(), 1);

auto tp_begin = keys.begin();
auto tp_end = tp_begin + num_tp;
auto tn_begin = tp_end;
auto tn_end = keys.end();

// Insert the first half of the keys.
filter.add(tp_begin, tp_end);

thrust::device_vector<bool> tp_result(num_tp, false);
thrust::device_vector<bool> tn_result(num_keys - num_tp, false);

// Query the filter for the previously inserted keys.
// This should result in a true-positive rate of TPR=1.
filter.contains(tp_begin, tp_end, tp_result.begin());

// Query the filter for the keys that are not present in the filter.
// Since bloom filters are probabilistic data structures, the filter
// exhibits a false-positive rate FPR>0 depending on the number of bits in
// the filter and the number of hashes used per key.
filter.contains(tn_begin, tn_end, tn_result.begin());

float tp_rate =
float(thrust::count(thrust::device, tp_result.begin(), tp_result.end(), true)) / float(num_tp);
float fp_rate =
float(thrust::count(thrust::device, tn_result.begin(), tn_result.end(), true)) / float(num_tn);

std::cout << "TPR=" << tp_rate << " FPR=" << fp_rate << std::endl;

return 0;
}
20 changes: 13 additions & 7 deletions examples/bloom_filter/host_bulk_example.cu
@@ -25,15 +25,21 @@

int main(void)
{
// Generate 10'000 keys and insert the first 5'000 into the filter.
int constexpr num_keys = 10'000;
int constexpr num_tp = num_keys * 0.5;
int constexpr num_tn = num_keys - num_tp;
int constexpr num_keys = 10'000; ///< Generate 10'000 keys
int constexpr num_tp = num_keys * 0.5; ///< Insert the first half keys into the filter.
int constexpr num_tn = num_keys - num_tp;
int constexpr sub_filters = 200; ///< 200 sub-filters per bloom filter

// Spawn a filter with 200 sub-filters.
cuco::bloom_filter<int> filter{200};
// key type for bloom filter
using key_type = int;

thrust::device_vector<int> keys(num_keys);
// Spawn a bloom filter with default policy and 200 sub-filters.
cuco::bloom_filter<key_type> filter{sub_filters};

std::cout << "Bulk insert into bloom filter with default fingerprint generation policy: "
<< std::endl;

thrust::device_vector<key_type> keys(num_keys);
thrust::sequence(keys.begin(), keys.end(), 1);

auto tp_begin = keys.begin();
8 changes: 4 additions & 4 deletions include/cuco/bloom_filter.cuh
@@ -16,7 +16,7 @@

#pragma once

#include <cuco/bloom_filter_policy.cuh>
#include <cuco/bloom_filter_policies.cuh>
#include <cuco/bloom_filter_ref.cuh>
#include <cuco/detail/storage/storage_base.cuh>
#include <cuco/extent.cuh>
@@ -55,13 +55,13 @@ namespace cuco {
* @tparam Extent Size type that is used to determine the number of blocks in the filter
* @tparam Scope The scope in which operations will be performed by individual threads
* @tparam Policy Type that defines how to generate and store key fingerprints (see
* `cuco/bloom_filter_policy.cuh`)
* `cuco/bloom_filter_policies.cuh`)
* @tparam Allocator Type of allocator used for device-accessible storage
*/
template <class Key,
class Extent = cuco::extent<std::size_t>,
cuda::thread_scope Scope = cuda::thread_scope_device,
class Policy = cuco::bloom_filter_policy<cuco::xxhash_64<Key>, std::uint32_t, 8>,
class Policy = cuco::default_filter_policy<cuco::xxhash_64<Key>, std::uint32_t, 8>,
class Allocator = cuco::cuda_allocator<cuda::std::byte>>
class bloom_filter {
public:
@@ -109,7 +109,7 @@ class bloom_filter {
*
* @param num_blocks Number of sub-filters or blocks
* @param scope The scope in which operations will be performed
* @param policy Fingerprint generation policy (see `cuco/bloom_filter_policy.cuh`)
* @param policy Fingerprint generation policy (see `cuco/bloom_filter_policies.cuh`)
* @param alloc Allocator used for allocating device-accessible storage
* @param stream CUDA stream used to initialize the filter
*/