The following data sets contain the raw performance data and metadata for the associated blog post "NoSQL Benchmark: MongoDB vs ScyllaDB", which compares the performance of MongoDB Atlas and ScyllaDB Cloud. For more details on the method and the benchmark objectives, please refer to the blog post.
The performance comparison is based on the following three Yahoo Cloud Serving Benchmark (YCSB) workloads:
- caching: read-update workload
- social: read-heavy workload
- sensor: write-heavy workload
Each folder contains the data of a single benchmark run of one configuration. Each configuration is measured three times.
In order to ensure full transparency and reproducibility, each data folder contains benchmark configuration data, performance data, monitoring data, cloud provider metadata, VM metadata and DBMS configuration data.
In addition, the aggregation.xlsx provides an abstracted view over all data points.
All configurable benchmark parameters are defined in the evaluationScenario.json.
The benchANT_versions contains the versions of the benchANT software components used to execute the benchmarks.
The execution logs of the individual benchmark steps are contained in airflowTaskInstanceDetails.json.
The raw performance data output of the YCSB is contained in the 0_load.txt for the LOAD phase and in the 0_run.txt for the RUN phase. The database ranking data only considers the RUN phase.
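As a starting point for working with the raw files, the sketch below parses the summary rows of a YCSB output such as 0_run.txt. It assumes the standard YCSB text layout of "[SECTION], Metric, Value" rows; the function name parse_ycsb_summary and the sample values are illustrative and are not taken from the data sets.

```python
import re
from collections import defaultdict

# Matches rows such as "[READ], AverageLatency(us), 410.5".
SUMMARY_ROW = re.compile(r"^\[([^\]]+)\],\s*([^,]+),\s*(\S+)$")

# Made-up sample in the assumed YCSB layout (not real benchmark results).
SAMPLE = """\
2023-01-01 12:00:10:000 10 sec: 25000 operations; 2500 current ops/sec
[OVERALL], RunTime(ms), 10000
[OVERALL], Throughput(ops/sec), 2500.0
[READ], Operations, 12500
[READ], AverageLatency(us), 410.5
[READ], 0, 398.2
"""


def parse_ycsb_summary(text):
    """Return {section: {metric: value}} for the summary rows in YCSB output."""
    summary = defaultdict(dict)
    for line in text.splitlines():
        match = SUMMARY_ROW.match(line.strip())
        if match is None:
            continue  # status lines and log output are ignored
        section, metric, raw_value = match.groups()
        metric = metric.strip()
        if metric.isdigit():
            continue  # purely numeric "metrics" are time-series or histogram buckets
        try:
            summary[section][metric] = float(raw_value)
        except ValueError:
            continue  # value is not numeric, so it is not a summary row
    return dict(summary)


summary = parse_ycsb_summary(SAMPLE)
print(summary["OVERALL"]["Throughput(ops/sec)"])
```

To apply it to a run folder, read the file first, for example with `Path("0_run.txt").read_text()`, and pass the text to the function.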
In addition, the runtimeDataframe.xlsx represents a cleaned time-series of the LOAD and RUN phase performance data.
The validate.json provides a validation overview of the raw benchmark results.
The DBMS cluster and the benchmark instances are monitored with Telegraf and the data is stored in InfluxDB.
A full snapshot of the monitoring data of each run is contained in the influx_data.zip file.
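The internal layout of the snapshot is not described here, so a reasonable first step is to list what the archive contains before extracting it. The sketch below uses only the Python standard library; the helper name list_snapshot_members is an illustrative choice.

```python
import zipfile


def list_snapshot_members(zip_path):
    """Return (name, uncompressed size in bytes) for every file in the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return [
            (info.filename, info.file_size)
            for info in archive.infolist()
            if not info.is_dir()  # skip directory entries
        ]


# Example usage, assuming the current directory is a run folder:
# for name, size in list_snapshot_members("influx_data.zip"):
#     print(f"{size:>12}  {name}")
```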
The relevant metrics for the time frame of the RUN phase are extracted in the dbmsMetrics.xlsx.
The cloud provider metadata for the DBMS deployment is contained in the dbms_data_resources.json / dbms_management_resources.json and for the benchmark deployment in the benchmark_resources.json.
The VM metadata for the DBMS deployment is contained in the dbms_data_hardware_facts.json / dbms_management_hardware_facts.json and for the benchmark deployment in the benchmark_hardware_facts.json.
For each DBMS, relevant configuration files and cluster states are stored before executing the workload.
DBMS-specific files are contained in each folder, e.g. postgresql.conf for PostgreSQL DBMS deployments.
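The per-run files named above can be checked for completeness with a short script. This is a minimal sketch under several assumptions: the files may sit anywhere below the run folder (the search is therefore recursive), the pairs written with "/" are treated as "at least one of the two is present", and aggregation.xlsx and the DBMS-specific files are left out because their location and names are not fixed by this description. The function name missing_artifacts is illustrative.

```python
from pathlib import Path

# File names as listed in this description.
EXPECTED_FILES = [
    "evaluationScenario.json",
    "benchANT_versions",
    "airflowTaskInstanceDetails.json",
    "0_load.txt",
    "0_run.txt",
    "runtimeDataframe.xlsx",
    "validate.json",
    "influx_data.zip",
    "dbmsMetrics.xlsx",
    "benchmark_resources.json",
    "benchmark_hardware_facts.json",
]

# Assumption: for each pair, at least one of the two files should exist.
EITHER_OR = [
    ("dbms_data_resources.json", "dbms_management_resources.json"),
    ("dbms_data_hardware_facts.json", "dbms_management_hardware_facts.json"),
]


def missing_artifacts(run_dir):
    """Return the expected artifacts that cannot be found below run_dir."""
    present = {path.name for path in Path(run_dir).rglob("*")}
    missing = [name for name in EXPECTED_FILES if name not in present]
    for group in EITHER_OR:
        if not any(name in present for name in group):
            missing.append(" / ".join(group))
    return missing


# Example usage with a hypothetical folder name:
# print(missing_artifacts("path/to/run-folder"))
```

An empty list means every expected artifact was found; otherwise the list names what is missing.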
In case of questions or feedback on the data feel free to reach out to [email protected]