Metadata Information

With these operations, it is possible to get information on the currently registered tables and their columns. The output format is mostly compatible with the presto format.

SHOW SCHEMAS
SHOW TABLES FROM <schema-name>
SHOW COLUMNS FROM <table-name>
DESCRIBE <table-name>
ANALYZE TABLE <table-name> COMPUTE STATISTICS
    [ FOR ALL COLUMNS | FOR COLUMNS <column>, [ ,... ] ]

See SQL Syntax for information on how to reference schemas and tables correctly.

SHOW SCHEMAS

Show the schemas registered in dask-sql. Only included for compatibility reasons. There is always just a one called “schema”, where all the data is located and an additional schema, called “information_schema”, which is needed by some BI tools (which is empty).

Example:

SHOW SCHEMAS

Result:

Schema

schema

information_schema

SHOW TABLES

Show the registered tables in a given schema.

Example:

SHOW TABLES FROM "schema"

Result:

Table

timeseries

SHOW COLUMNS and DESCRIBE

Show column information on a specific table.

Example:

SHOW COLUMNS FROM "timeseries"

Result:

Column

Type

Extra Comment

id

bigint

name

varchar

x

double

y

double

The column “Extra Comment” is shown for compatibility with presto.

ANALYZE TABLE

Calculate statistics on a given table (and the given columns or all columns) and return it as a query result. Please note, that this process can be time consuming on large tables. Even though this statement is very similar to the ANALYZE TABLE statement in e.g. Apache Spark, it does not optimize subsequent queries (as the pendent in Spark will do).

Example:

ANALYZE TABLE "timeseries" COMPUTE STATISTICS FOR COLUMNS x, y

Result:

x

y

count

30

30

mean

0.140374

-0.107481

std

0.568248

0.573106

min

-0.795112

-0.966043

25%

-0.379635

-0.561234

50%

0.0104101

-0.237795

75%

0.70208

0.263459

max

0.990747

0.947069

data_type

double

double

col_name

x

y