Installation

dask-sql can be installed with conda (preferred) or pip, or from source in a development environment.

After the installation, you can continue with the Quickstart.

With conda

Create a new conda environment or use an existing one:

conda create -n dask-sql
conda activate dask-sql

Install the package from the conda-forge channel:

conda install dask-sql -c conda-forge

GPU support

  • GPU support is currently tied to the RAPIDS libraries.

  • It generally requires the latest cuDF/Dask-cuDF nightlies.

Create a new conda environment, or use an existing one, and install RAPIDS with the method and packages of your choice. More details can be found on the RAPIDS Getting Started page, but as an example:

conda create --name rapids-env -c rapidsai-nightly -c nvidia -c conda-forge \
    cudf=22.10 dask-cudf=22.10 ucx-py ucx-proc=*=gpu python=3.9 cudatoolkit=11.8
conda activate rapids-env

Note that UCX is mainly needed on systems with InfiniBand or NVLink. Refer to the UCX-Py docs for more information.

Install the stable package from the conda-forge channel:

conda install -c conda-forge dask-sql

Or the latest nightly from the dask channel (currently only available for Linux-based operating systems):

conda install -c dask/label/dev dask-sql
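Because GPU support depends on the RAPIDS libraries being present in the same environment, it can be useful to confirm that they are visible to Python before moving on. The following is a minimal sketch using only the standard library; the import names `cudf` and `dask_cudf` are assumed to correspond to the cuDF and Dask-cuDF packages installed above, and the helper `missing_gpu_modules` is a hypothetical name, not part of dask-sql.

```python
from importlib.util import find_spec

# Import names assumed for the RAPIDS packages installed above.
GPU_MODULES = ("cudf", "dask_cudf")


def missing_gpu_modules(modules=GPU_MODULES):
    """Return the names in `modules` that cannot be located in this environment.

    find_spec only locates a top-level module; it does not import it,
    so this check does not touch the GPU.
    """
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_gpu_modules()
    if missing:
        print("Not found in this environment:", ", ".join(missing))
    else:
        print("All assumed GPU modules were found.")
```

This only checks that the modules can be located. It does not verify that a compatible GPU or driver is available.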

With pip

pip install dask-sql
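Whichever of the conda or pip routes above was used, a quick way to confirm that the package landed in the active environment is to ask Python for the installed distribution's version. This is a small sketch using the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical, and the distribution name `dask-sql` is taken from the install commands above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution="dask-sql"):
    """Return the installed version string of `distribution`, or None if absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("dask-sql is not installed in this environment")
    else:
        print("dask-sql version:", found)
```

If the result is `None`, the most likely cause is that the install ran in a different environment from the one the interpreter belongs to.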

For development

If you want the newest (unreleased) dask-sql version, or if you plan to do development on dask-sql, you can also install the package from source:

git clone https://github.com/dask-contrib/dask-sql.git
cd dask-sql

Create a new conda environment and install the development environment:

conda env create -f continuous_integration/environment-3.9.yaml

Using pip instead of conda for this step is not recommended.

After that, you can install the package in development mode:

pip install -e ".[dev]"

After changing the Rust code, rerun the above command to recompile it. Once the package is installed, you can run the tests with:

pytest tests

GPU-specific tests require additional dependencies specified in continuous_integration/gpuci/environment.yaml:

conda env create -n dask-sql-gpuci -f continuous_integration/gpuci/environment.yaml

GPU-specific tests can be run with:

pytest tests -m gpu --rungpu

This repository uses pre-commit hooks. To install them, run:

pre-commit install