Summarize

Search Shortcut cmd + k | ctrl + k

Documentation / Guides / Meta Queries

Summarize

The SUMMARIZE command can be used to easily compute a number of aggregates over a table or a query. The SUMMARIZE command launches a query that computes a number of aggregates over all columns (min, max, approx_unique, avg, std, q25, q50, q75, count), and return these along the column name, column type, and the percentage of NULL values in the column.

Usage

In order to summarize the contents of a table, use SUMMARIZE followed by the table name.

SUMMARIZE tbl;

In order to summarize a query, prepend SUMMARIZE to a query.

SUMMARIZE SELECT * FROM tbl;

Example

Below is an example of SUMMARIZE on the lineitem table of TPC-H SF1 table, generated using the tpch extension.

INSTALL tpch;
LOAD tpch;
CALL dbgen(sf = 1);

SUMMARIZE lineitem;

column_name	column_type	min	max	approx_unique	avg	std	q25	q50	q75	count	null_percentage
l_orderkey	INTEGER	1	6000000	1508227	3000279.604204982	1732187.8734803519	1509447	2989869	4485232	6001215	0.0%
l_partkey	INTEGER	1	200000	202598	100017.98932999402	57735.69082650496	49913	99992	150039	6001215	0.0%
l_suppkey	INTEGER	1	10000	10061	5000.602606138924	2886.9619987306114	2501	4999	7500	6001215	0.0%
l_linenumber	INTEGER	1	7	7	3.0005757167506912	1.7324314036519328	2	3	4	6001215	0.0%
l_quantity	DECIMAL(15,2)	1.00	50.00	50	25.507967136654827	14.426262537016918	13	26	38	6001215	0.0%
l_extendedprice	DECIMAL(15,2)	901.00	104949.50	923139	38255.138484656854	23300.43871096221	18756	36724	55159	6001215	0.0%
l_discount	DECIMAL(15,2)	0.00	0.10	11	0.04999943011540163	0.03161985510812596	0	0	0	6001215	0.0%
l_tax	DECIMAL(15,2)	0.00	0.08	9	0.04001350893110812	0.025816551798842728	0	0	0	6001215	0.0%
l_returnflag	VARCHAR	A	R	3	NULL	NULL	NULL	NULL	NULL	6001215	0.0%
l_linestatus	VARCHAR	F	O	2	NULL	NULL	NULL	NULL	NULL	6001215	0.0%
l_shipdate	DATE	1992-01-02	1998-12-01	2516	NULL	NULL	NULL	NULL	NULL	6001215	0.0%
l_commitdate	DATE	1992-01-31	1998-10-31	2460	NULL	NULL	NULL	NULL	NULL	6001215	0.0%
l_receiptdate	DATE	1992-01-04	1998-12-31	2549	NULL	NULL	NULL	NULL	NULL	6001215	0.0%
l_shipinstruct	VARCHAR	COLLECT COD	TAKE BACK RETURN	4	NULL	NULL	NULL	NULL	NULL	6001215	0.0%
l_shipmode	VARCHAR	AIR	TRUCK	7	NULL	NULL	NULL	NULL	NULL	6001215	0.0%
l_comment	VARCHAR	Tiresias	zzle? furiously iro	3558599	NULL	NULL	NULL	NULL	NULL	6001215	0.0%

Using `SUMMARIZE` in a Subquery

SUMMARIZE can be used a subquery. This allows creating a table from the summary, for example:

CREATE TABLE tbl_summary AS SELECT * FROM (SUMMARIZE tbl);

Summarizing Remote Tables

It is possible to summarize remote tables via the httpfs extension using the SUMMARIZE TABLE statement. For example:

SUMMARIZE TABLE 'https://blobs.duckdb.org/data/Star_Trek-Season_1.csv';

Usage

Example

Using `SUMMARIZE` in a Subquery

Summarizing Remote Tables

About this page

In this article

Usage

Example

Using SUMMARIZE in a Subquery

Summarizing Remote Tables

About this page

In this article

Using `SUMMARIZE` in a Subquery