When testing DuckDB, we aim to route all the tests through SQL. We try to avoid testing components individually because that makes those components more difficult to change later on. As such, almost all of our tests can (and should) be expressed in pure SQL. There are certain exceptions to this, which we will discuss in the section “Catch Tests”. However, in most cases you should write your tests in plain SQL.
For testing plain SQL we use an extended version of the SQL logic test suite, adopted from SQLite. Every test is a single self-contained file located in the
test/sql directory. The test describes a series of SQL statements, together with either the expected result, a
statement ok indicator, or a
statement error indicator. An example of a test file is shown below:
# name: test/sql/projection/test_simple_projection.test # group [projection] # enable query verification statement ok PRAGMA enable_verification # create table statement ok CREATE TABLE a (i integer, j integer); # insertion: 1 affected row statement ok INSERT INTO a VALUES (42, 84); query II SELECT * FROM a; ---- 42 84
In this example, three statements are executed. The first statements are expected to succeed (prefixed by
statement ok). The third statement is expected to return a single row with two columns (indicated by
query II). The values of the row are expected to be
84 (separated by a tab character). For more information on query result verification, see the result verification section.
The top of every file should contain a comment describing the name and group of the test. The name of the test is always the relative file path of the file. The group is the folder that the file is in. The name and group of the test are relevant because they can be used to execute only that test in the unittest group. For example, if we wanted to execute only the above test, we would run the command
unittest test/sql/projection/test_simple_projection.test. If we wanted to run all tests in a specific directory, we would run the command
Any tests that are placed in the
test directory are automatically added to the test suite. Note that the extension of the test is significant. SQLLogicTests should either use the
.test extension, or the
.test_slow extension. The
.test_slow extension indicates that the test takes a while to run, and will only be run when all tests are explicitly run using
unittest *. Tests with the extension
.test will be included in the fast set of tests.
Many simple tests start by enabling query verification. This can be done through the following
statement ok PRAGMA enable_verification
Query verification performs extra validation to ensure that the underlying code runs correctly. The most important part of that is that it verifies that optimizers do not cause bugs in the query. It does this by running both an unoptimized and optimized version of the query, and verifying that the results of these queries are identical.
Query verification is very useful because it not only discovers bugs in optimizers, but also finds bugs in e.g. join implementations. This is because the unoptimized version will typically run using cross products instead. Because of this, query verification can be very slow to do when working with larger data sets. It is therefore recommended to turn on query verification for all unit tests, except those involving larger data sets (more than 10-100~ rows).
The SQLLogicTests are not exactly an industry standard, but several other systems have adopted them as well. Parsing sqllogictests is intentionally simple. All statements have to be separated by empty lines. For that reason, writing a syntax highlighter is not extremely difficult.
A syntax highlighter exists for Visual Studio Code. We have also made a fork that supports the DuckDB dialect of the sqllogictests. You can use the fork by installing the original, then copying the
syntaxes/sqllogictest.tmLanguage.json into the installed extension (on MacOS this is located in
A syntax highlighter is also available for CLion. It can be installed directly on the IDE by searching SQLTest on the marketplace. A github repository is also available, with extensions and bug reports being welcome.
For some tests (e.g. CSV/Parquet file format tests) it is necessary to create temporary files. Any temporary files should be created in the temporary testing directory. This directory can be used by placing the string
__TEST_DIR__ in a query. This string will be replaced by the path of the temporary testing directory.
statement ok COPY csv_data TO '__TEST_DIR__/output_file.csv.gz' (COMPRESSION GZIP);
To avoid bloating the core system, certain functionality of DuckDB is available only as an extension. Tests can be build for those extensions by adding a
require field in the test. If the extension is not loaded, any statements that occurs after the require field will be skipped. Examples of this are
require parquet or
Another usage is to limit a test to a specific vector size. For example, adding
require vector_size 512 to a test will prevent the test from being run unless the vector size greater than or equal to 512. This is useful because certain functionality is not supported for low vector sizes, but we run tests using a vector size of 2 in our CI.