Use permalinks for github refs
hesampakdaman committed May 16, 2024
1 parent 83d053b commit db5b38e
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion README.org
@@ -2,7 +2,7 @@
** Introduction
*rust_1brc* is a Rust implementation for the [[https://1brc.dev/][One Billion Row Challenge (1BRC)]]. The challenge involves processing one billion temperature measurements to calculate the minimum, mean, and maximum temperatures per weather station. This project aims to explore Rust's capabilities for efficient data handling and processing. The main motivation for undertaking this challenge was to leverage Rust's ~std::sync::mpsc~ and ~std::thread~ libraries. The challenge is well-suited for parallelization, making it an ideal choice for exploring these libraries.

- The input file can be obtained by using one of the official scripts, such as this [[https://github.com/gunnarmorling/1brc/blob/main/src/main/python/create_measurements.py][Python version]]. This script generates a text file containing one billion temperature measurements, approximately 13GB in size, referred to as =measurements.txt=. The input file uses names from the [[https://github.com/gunnarmorling/1brc/blob/main/data/weather_stations.csv][weather_stations.csv]] allowing us to optimize our hash function specifically for the official dataset. By tailoring the hash function to this specific set, we achieve higher performance compared to Rust's standard library implementation, which is generally more collision-resilient and secure.
+ The input file can be obtained by using one of the official scripts, such as this [[https://github.com/gunnarmorling/1brc/blob/db064194be375edc02d6dbcd21268ad40f7e2869/src/main/python/create_measurements.py][Python version]]. This script generates a text file containing one billion temperature measurements, approximately 13GB in size, referred to as =measurements.txt=. The input file uses names from the [[https://github.com/gunnarmorling/1brc/blob/db064194be375edc02d6dbcd21268ad40f7e2869/data/weather_stations.csv][weather_stations.csv]] allowing us to optimize our hash function specifically for the official dataset. By tailoring the hash function to this specific set, we achieve higher performance compared to Rust's standard library implementation, which is generally more collision-resilient and secure.
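
To illustrate how a custom hash function can be swapped into a =HashMap=, here is a minimal sketch. It does not reproduce the project's tailored hash, which is not shown in this diff; it uses FNV-1a as a stand-in, and the names ~Fnv1a~ and ~FastMap~ are hypothetical. Like any simple non-keyed hash, it trades the collision resistance of the standard library's default hasher for speed, which is only reasonable when the key set is known in advance.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Stand-in hasher (FNV-1a, 64-bit). Assumption: NOT the project's actual hash.
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        // FNV-1a 64-bit offset basis.
        Fnv1a(0xcbf29ce484222325)
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            // FNV 64-bit prime.
            self.0 = self.0.wrapping_mul(0x100000001b3);
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// A HashMap that uses the stand-in hasher instead of the default one.
type FastMap<K, V> = HashMap<K, V, BuildHasherDefault<Fnv1a>>;

fn main() {
    let mut temps: FastMap<&str, f64> = FastMap::default();
    temps.insert("Hamburg", 12.0);
    temps.insert("Bulawayo", 8.9);
    println!("{:?}", temps.get("Hamburg"));
}
```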

*This project will load the _entire_ file into memory and process it using all available CPU cores to maximize performance.*
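
The approach described above (data held in memory, split across cores, results combined) can be sketched roughly as follows. This is not the project's code: the ~Stats~ type and the functions ~split_on_lines~, ~process_chunk~, and ~aggregate~ are hypothetical names, a short string literal stands in for =measurements.txt=, and the sketch assumes Rust 1.63 or later for ~std::thread::scope~. Each worker sends a partial map over a ~std::sync::mpsc~ channel and the main thread merges them.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

/// Running statistics for one station (hypothetical type).
#[derive(Debug, Clone, Copy)]
struct Stats {
    min: f64,
    max: f64,
    sum: f64,
    count: u64,
}

impl Stats {
    fn new(v: f64) -> Self {
        Stats { min: v, max: v, sum: v, count: 1 }
    }
    fn add(&mut self, v: f64) {
        self.min = self.min.min(v);
        self.max = self.max.max(v);
        self.sum += v;
        self.count += 1;
    }
    fn merge(&mut self, other: &Stats) {
        self.min = self.min.min(other.min);
        self.max = self.max.max(other.max);
        self.sum += other.sum;
        self.count += other.count;
    }
    fn mean(&self) -> f64 {
        self.sum / self.count as f64
    }
}

/// Split `data` into roughly `parts` slices, each ending on a line boundary.
fn split_on_lines(data: &str, parts: usize) -> Vec<&str> {
    let target = (data.len() / parts.max(1)).max(1);
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < data.len() {
        let tentative = (start + target).min(data.len());
        // Extend the chunk to just past the next newline so no line is cut in half.
        let end = match data.as_bytes()[tentative..].iter().position(|&b| b == b'\n') {
            Some(offset) => tentative + offset + 1,
            None => data.len(),
        };
        chunks.push(&data[start..end]);
        start = end;
    }
    chunks
}

/// Parse `name;value` lines and accumulate per-station statistics.
fn process_chunk(chunk: &str) -> HashMap<String, Stats> {
    let mut map: HashMap<String, Stats> = HashMap::new();
    for line in chunk.lines() {
        if let Some((name, value)) = line.split_once(';') {
            if let Ok(v) = value.trim().parse::<f64>() {
                map.entry(name.to_string())
                    .and_modify(|s| s.add(v))
                    .or_insert_with(|| Stats::new(v));
            }
        }
    }
    map
}

/// Process chunks on worker threads and merge the partial results.
fn aggregate(data: &str, workers: usize) -> HashMap<String, Stats> {
    let (tx, rx) = mpsc::channel();
    thread::scope(|scope| {
        for chunk in split_on_lines(data, workers) {
            let tx = tx.clone();
            scope.spawn(move || {
                tx.send(process_chunk(chunk)).expect("receiver is alive");
            });
        }
    });
    // Drop the last sender so the receiving loop below terminates.
    drop(tx);

    let mut total: HashMap<String, Stats> = HashMap::new();
    for partial in rx {
        for (name, stats) in partial {
            total
                .entry(name)
                .and_modify(|s| s.merge(&stats))
                .or_insert(stats);
        }
    }
    total
}

fn main() {
    let data = "Hamburg;12.0\nBulawayo;8.9\nHamburg;-3.4\nPalembang;38.8\nBulawayo;10.1\n";
    let workers = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    let result = aggregate(data, workers);

    let mut names: Vec<_> = result.keys().collect();
    names.sort();
    for name in names {
        let s = &result[name];
        println!("{name}={:.1}/{:.1}/{:.1}", s.min, s.mean(), s.max);
    }
}
```

Splitting on line boundaries keeps each worker independent, so the only shared step is the final merge on the receiving side of the channel.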
