Metadata Information
====================
With these operations, it is possible to get information on the currently registered tables
and their columns.
The output format is mostly compatible with the presto format.
.. raw:: html
SHOW SCHEMAS
SHOW TABLES FROM <schema-name>
SHOW COLUMNS FROM <table-name>
DESCRIBE <table-name>
ANALYZE TABLE <table-name> COMPUTE STATISTICS
[ FOR ALL COLUMNS | FOR COLUMNS <column>, [ ,... ] ]
See :ref:`sql` for information on how to reference schemas and tables correctly.
``SHOW SCHEMAS``
----------------
Show the schemas registered in ``dask-sql``.
Only included for compatibility reasons.
There is always just a one called "schema", where all the data is located and an additional schema, called "information_schema",
which is needed by some BI tools (which is empty).
Example:
.. raw:: html
Result:
+------------------------+
| Schema |
+========================+
| schema |
+------------------------+
| information_schema |
+------------------------+
``SHOW TABLES``
---------------
Show the registered tables in a given schema.
Example:
.. raw:: html
SHOW TABLES FROM "schema"
Result:
+------------+
| Table |
+============+
| timeseries |
+------------+
``SHOW COLUMNS`` and ``DESCRIBE``
---------------------------------
Show column information on a specific table.
Example:
.. raw:: html
SHOW COLUMNS FROM "timeseries"
Result:
+--------+---------+---------------+
| Column | Type | Extra Comment |
+========+=========+===============+
| id | bigint | |
+--------+---------+---------------+
| name | varchar | |
+--------+---------+---------------+
| x | double | |
+--------+---------+---------------+
| y | double | |
+--------+---------+---------------+
The column "Extra Comment" is shown for compatibility with presto.
``ANALYZE TABLE``
-----------------
Calculate statistics on a given table (and the given columns or all columns)
and return it as a query result.
Please note, that this process can be time consuming on large tables.
Even though this statement is very similar to the ``ANALYZE TABLE`` statement in e.g. `Apache Spark `_, it does not optimize subsequent queries (as the pendent in Spark will do).
Example:
.. raw:: html
ANALYZE TABLE "timeseries" COMPUTE STATISTICS FOR COLUMNS x, y
Result:
+-----------+-----------+-----------+
| | x | y |
+===========+===========+===========+
| count | 30 | 30 |
+-----------+-----------+-----------+
| mean | 0.140374 | -0.107481 |
+-----------+-----------+-----------+
| std | 0.568248 | 0.573106 |
+-----------+-----------+-----------+
| min | -0.795112 | -0.966043 |
+-----------+-----------+-----------+
| 25% | -0.379635 | -0.561234 |
+-----------+-----------+-----------+
| 50% | 0.0104101 | -0.237795 |
+-----------+-----------+-----------+
| 75% | 0.70208 | 0.263459 |
+-----------+-----------+-----------+
| max | 0.990747 | 0.947069 |
+-----------+-----------+-----------+
| data_type | double | double |
+-----------+-----------+-----------+
| col_name | x | y |
+-----------+-----------+-----------+