Search Shortcut cmd + k | ctrl + k
Search cmd+k ctrl+k
0.10 (stable)
Loops

Loops can be used in sqllogictests when it is required to execute the same query many times but with slight modifications in constant values. For example, suppose we want to fire off 100 queries that check for the presence of the values 0..100 in a table:

# create the table 'integers' with values 0..100
statement ok
CREATE TABLE integers AS SELECT * FROM range(0, 100, 1) t1(i);

# verify individually that all 100 values are there
loop i 0 100

# execute the query, replacing the value
query I
SELECT count(*) FROM integers WHERE i = ${i};
----
1

# end the loop (note that multiple statements can be part of a loop)
endloop

Similarly, foreach can be used to iterate over a set of values.

foreach partcode millennium century decade year quarter month day hour minute second millisecond microsecond epoch

query III
SELECT i, date_part('${partcode}', i) AS p, date_part(['${partcode}'], i) AS st
FROM intervals
WHERE p <> st['${partcode}'];
----

endloop

foreach also has a number of preset combinations that should be used when required. In this manner, when new combinations are added to the preset, old tests will automatically pick up these new combinations.

Preset Expansion
<compression> none uncompressed rle bitpacking dictionary fsst chimp patas
<signed> tinyint smallint integer bigint hugeint
<unsigned> utinyint usmallint uinteger ubigint uhugeint
<integral> <signed> <unsigned>
<numeric> <integral> float double
<alltypes> <numeric> bool interval varchar json

Use large loops sparingly. Executing hundreds of thousands of SQL statements will slow down tests unnecessarily. Do not use loops for inserting data.

Data Generation without Loops

Loops should be used sparingly. While it might be tempting to use loops for inserting data using insert statements, this will considerably slow down the test cases. Instead, it is better to generate data using the built-in range and repeat functions.

-- create the table integers with the values [0, 1, .., 98,  99]
CREATE TABLE integers AS SELECT * FROM range(0, 100, 1) t1(i);

-- create the table strings with 100X the value "hello"
CREATE TABLE strings AS SELECT 'hello' AS s FROM range(0, 100, 1);

Using these two functions, together with clever use of cross products and other expressions, many different types of datasets can be efficiently generated. The RANDOM() function can also be used to generate random data.

An alternative option is to read data from an existing CSV or Parquet file. There are several large CSV files that can be loaded from the directory test/sql/copy/csv/data/real using a COPY INTO statement or the read_csv_auto function.

The TPC-H and TPC-DS extensions can also be used to generate synthetic data, using e.g. CALL dbgen(sf = 1) or CALL dsdgen(sf = 1).

About this page

Last modified: 2024-04-25