A partitioned GPU-backed dataframe, using Dask.
Setup from source repo:
-
Create a new conda environment with the dependencies installed:
conda create -n dask-cudf \
    -c rapidsai -c numba -c conda-forge -c defaults \
    pygdf dask distributed cudatoolkit
-
Activate conda environment:
source activate dask-cudf
-
Clone the dask-cudf repo:
git clone https://github.com/rapidsai/dask-cudf
-
Install from source:
cd dask-cudf
pip install .
-
Install pytest:
conda install pytest
-
Run all tests:
pytest dask_cudf
-
Or, run individual tests:
pytest dask_cudf/tests/test_file.py
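If any of the steps above fails, it can help to first confirm that the tools they rely on are actually on PATH inside the activated environment. The following is a minimal sketch of such a check, not part of dask-cudf; the helper names `have_cmd` and `check_tools` are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Hypothetical helpers for sanity-checking the setup; not part of dask-cudf.

# Succeeds (exit status 0) if the named command can be found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Prints one status line per tool and returns the number of missing tools.
check_tools() {
    missing=0
    for tool in "$@"; do
        if have_cmd "$tool"; then
            echo "ok:      $tool"
        else
            echo "missing: $tool"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}
```

For the steps in this README, one plausible call is `check_tools conda git pip pytest`. To confirm that the package itself was installed, `python -c "import dask_cudf"` should exit without error; the module name `dask_cudf` is assumed here from the `dask_cudf/tests` path used above.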