Metadata Information
Contents
Metadata Information¶
With these operations, it is possible to get information on the currently registered tables and their columns. The output format is mostly compatible with the presto format.
SHOW SCHEMAS
SHOW TABLES FROM <schema-name>
SHOW COLUMNS FROM <table-name>
DESCRIBE <table-name>
ANALYZE TABLE <table-name> COMPUTE STATISTICS
[ FOR ALL COLUMNS | FOR COLUMNS <column>, [ ,... ] ]
See SQL Syntax for information on how to reference schemas and tables correctly.
SHOW SCHEMAS
¶
Show the schemas registered in dask-sql
.
Only included for compatibility reasons.
There is always just a one called “schema”, where all the data is located and an additional schema, called “information_schema”,
which is needed by some BI tools (which is empty).
Example:
SHOW SCHEMAS
Result:
Schema |
---|
schema |
information_schema |
SHOW TABLES
¶
Show the registered tables in a given schema.
Example:
SHOW TABLES FROM "schema"
Result:
Table |
---|
timeseries |
SHOW COLUMNS
and DESCRIBE
¶
Show column information on a specific table.
Example:
SHOW COLUMNS FROM "timeseries"
Result:
Column |
Type |
Extra Comment |
---|---|---|
id |
bigint |
|
name |
varchar |
|
x |
double |
|
y |
double |
The column “Extra Comment” is shown for compatibility with presto.
ANALYZE TABLE
¶
Calculate statistics on a given table (and the given columns or all columns)
and return it as a query result.
Please note, that this process can be time consuming on large tables.
Even though this statement is very similar to the ANALYZE TABLE
statement in e.g. Apache Spark, it does not optimize subsequent queries (as the pendent in Spark will do).
Example:
ANALYZE TABLE "timeseries" COMPUTE STATISTICS FOR COLUMNS x, y
Result:
x |
y |
|
---|---|---|
count |
30 |
30 |
mean |
0.140374 |
-0.107481 |
std |
0.568248 |
0.573106 |
min |
-0.795112 |
-0.966043 |
25% |
-0.379635 |
-0.561234 |
50% |
0.0104101 |
-0.237795 |
75% |
0.70208 |
0.263459 |
max |
0.990747 |
0.947069 |
data_type |
double |
double |
col_name |
x |
y |