Testing

Testing is vital to make sure that DuckDB works properly and keeps working properly. For that reason, we put a large emphasis on thorough and frequent testing. We run a batch of small tests on every commit using Travis CI, and run a more exhaustive batch of tests on pull requests and commits in the master branch.

It is crucial that any new features that get added have correct tests that not only test the “happy path”, but also test edge cases and incorrect usage of the feature. In this section, we describe how DuckDB tests are structured and how to make new tests for DuckDB.

The tests can be run by running the unittest program located in the test folder. For the default compilations this is located in either build/release/test/unittest (release) or build/debug/test/unittest (debug).

SQLLogicTests

When testing DuckDB, we aim to route all the tests through SQL. We try to avoid testing components individually because that makes those components more difficult to change later on. As such, almost all of our tests can (and should) be expressed in pure SQL. There are certain exceptions to this, which we will discuss in the section “Catch Tests”. However, in most cases you should write your tests in plain SQL.

For testing plain SQL we use an extended version of the SQL logic test suite, adopted from SQLite. Every test is a single self-contained file located in the test/sql directory. The test describes a series of SQL statements, together with either the expected result, a statement ok indicator, or a statement error indicator. An example of a test file is shown below:

# name: test/sql/projection/test_simple_projection.test
# group [projection]

# create table
statement ok
CREATE TABLE a (i integer, j integer);

# insertion: 1 affected row
statement ok
INSERT INTO a VALUES (42, 84);

query II
SELECT * FROM a;
----
42	84

In this example, three statements are executed. The first statements are expected to succeed (prefixed by statement ok). The third statement is expected to return a single row with two columns (indicated by query II). The values of the row are expected to be 42 and 84 (separated by a tab character).

The top of every file should contain a comment describing the name and group of the test. The name of the test is always the relative file path of the file. The group is the folder that the file is in. The name and group of the test are relevant because they can be used to execute only that test in the unittest group. For example, if we wanted to execute only the above test, we would run the command unittest test/sql/projection/test_simple_projection.test. If we wanted to run all tests in a specific directory, we would run the command unittest [projection].

Any tests that are placed in the test directory are automatically added to the test suite. Note that the extension of the test is significant. SQLLogicTests should either use the .test extension, or the .test_slow extension. The .test_slow extension indicates that the test takes a while to run, and will only be run when all tests are explicitly run using unittest *. Tests with the extension .test will be included in the fast set of tests.

Row-Wise vs Value-Wise Result Ordering

The result values of a query can be either supplied in row-wise order, with the individual values separated by tabs, or in value-wise order. In value wise order the individual values of the query must appear in row, column order each on an individual line. Consider the following example in both row-wise and value-wise order:

# row-wise
query II
SELECT 42, 84 UNION ALL SELECT 10, 20;
----
42	84
10	20

# value-wise
query II
SELECT 42, 84 UNION ALL SELECT 10, 20;
----
42
84
10
20
NULL Values and empty strings

Empty lines have special significance for the SQLLogic test runner: they signify an end of the current statement or query. For that reason, empty strings and NULL values have special syntax that must be used in result verification. NULL values should use the string NULL, and empty strings should use the string (empty), e.g.:

query II
SELECT NULL, ''
----
NULL
(empty)
Hashes & Outputting Values

Besides direct result verification, the sqllogic test suite also has the option of using MD5 hashes for value comparisons. A test using hashes for result verification looks like this:

query I
SELECT g, STRING_AGG(x,',') FROM strings GROUP BY g
----
200 values hashing to b8126ea73f21372cdb3f2dc483106a12

This approach is useful for reducing the size of tests when results have many output rows. However, it should be used sparingly, as hash values make the tests more difficult to debug if they do break.

After it is ensured that the system outputs the correct result, hashes of the queries in a test file can be computed by adding mode output_hash to the test file. For example:

mode output_hash

query II
SELECT 42, 84 UNION ALL SELECT 10, 20;
----
42	84
10	20

The expected output hashes for every query in the test file will then be printed to the terminal, as follows:

================================================================================
SQL Query
SELECT 42, 84 UNION ALL SELECT 10, 20;
================================================================================
4 values hashing to 498c69da8f30c24da3bd5b322a2fd455
================================================================================

In a similar manner, mode output_result can be used in order to force the program to print the result to the terminal for every query run in the test file.

Multiple Connections

For tests whose purpose is to verify that the transactional management or versioning of data works correctly, it is generally necessary to use multiple connections. For example, if we want to verify that the creation of tables is correctly transactional, we might want to start a transaction and create a table in con1, then fire a query in con2 that checks that the table is not accessible yet until committed.

We can use multiple connections in the sqllogictests using connection labels. The connection label can be optionally appended to any statement or query. All queries with the same connection label will be executed in the same connection. A test that would verify the above property would look as follows:

statement ok con1
BEGIN TRANSACTION

statement ok con1
CREATE TABLE integers(i INTEGER);

statement error con2
SELECT * FROM integers;
Result Sorting

Queries can have an optional field that indicates that the result should be sorted in a specific manner. This field goes in the same location as the connection label. Because of that, connection labels and result sorting cannot be mixed.

The possible values of this field are nosort, rowsort and valuesort. An example of how this might be used is given below:

query I rowsort
SELECT 'world' UNION ALL SELECT 'hello'
----
hello
world

In general, we prefer not to use this field and rely on ORDER BY in the query to generate deterministic query answers. However, existing sqllogictests use this field extensively, hence it is important to know of its existance.

Query Labels

Another feature that can be used for result verification are query labels. These can be used to verify that different queries provide the same result. This is useful for comparing queries that are logically equivalent, but formulated differently. Query labels are provided after the connection label or sorting specifier.

Queries that have a query label do not need to have a result provided. Instead, the results of each of the queries with the same label are compared to each other. For example, the following script verifies that the queries SELECT 42+1 and SELECT 44-1 provide the same result:

query I nosort r43
SELECT 42+1;
----

query I nosort r43
SELECT 44-1;
----
Query Verification

Many simple tests start by enabling query verification. This can be done through the following PRAGMA statement:

statement ok
PRAGMA enable_verification

Query verification performs extra validation to ensure that the underlying code runs correctly. The most important part of that is that it verifies that optimizers do not cause bugs in the query. It does this by running both an unoptimized and optimized version of the query, and verifying that the results of these queries are identical.

Query verification is very useful because it not only discovers bugs in optimizers, but also finds bugs in e.g. join implementations. This is because the unoptimized version will typically run using cross products instead. Because of this, query verification can be very slow to do when working with larger data sets. It is therefore recommended to turn on query verification for all unit tests, except those involving larger data sets (more than 10-100~ rows).

Temporary Files

For some tests (e.g. CSV/Parquet file format tests) it is necessary to create temporary files. Any temporary files should be created in the temporary testing directory. This directory can be used by placing the string __TEST_DIR__ in a query. This string will be replaced by the path of the temporary testing directory.

Require & Extensions

To avoid bloating the core system, certain functionality of DuckDB is available only as an extension. Tests can be build for those extensions by adding a require field in the test. If the extension is not loaded, any statements that occurs after the require field will be skipped. Examples of this are require parquet or require icu.

Another usage is to limit a test to a specific vector size. For example, adding require vector_size 512 to a test will prevent the test from being run unless the vector size greater than or equal to 512. This is useful because certain functionality is not supported for low vector sizes, but we run tests using a vector size of 2 in our CI.

Loops

Loops can be used in sqllogictests when it is required to execute the same query many times but with slight modifications in constant values. Only simple, non-nested, loops are supported. For example, suppose we want to fire off 100 queries that check for the presence of the values 0..100 in a table:

# create the table integers with the values 0..100
statement ok
CREATE TABLE integers AS SELECT * FROM range(0, 100, 1) t1(i);

# verify individually that all 100 values are there
loop i 0 100

# execute the query, replacing the value
query I
SELECT COUNT(*) FROM integers WHERE i=${i};
----
1

# end the loop (note that multiple statements can be part of a loop)
endloop
Data Generation

Loops should be used sparingly. While it might be tempting to use loops for inserting data using INSERT INTO statements, this will considerably slow down the test cases. Instead, it is better to generate data using the built-in range and repeat functions.

-- create the table integers with the values [0, 1, .., 98,  99]
CREATE TABLE integers AS SELECT * FROM range(0, 100, 1) t1(i);

-- create the table strings with 100X the value "hello"
CREATE TABLE strings AS SELECT * FROM repeat('hello', 100) t2(s);

Using these two functions, together with clever use of cross products and other expressions, many different types of datasets can be efficiently generated. The RANDOM() function can also be used to generate random data.

An alternative option is to read data from an existing CSV file. There are several large CSV files that can be loaded from the directory test/sql/copy/csv/data/real using a COPY INTO statement or the read_csv_auto function.

Debugging

The purpose of the tests is to figure out when things break. Inevitably changes made to the system will cause one of the tests to fail, and when that happens the test needs to be debugged.

First, it is always recommended to run in debug mode. This can be done by compiling the system using the command make debug. Second, it is recommended to only run the test that breaks. This can be done by passing the filename of the breaking test to the test suite as a command line parameter (e.g. build/debug/test/unittest test/sql/projection/test_simple_projection.test).

After that, a debugger can be attached to the program and the test can be debugged. In the sqllogictests it is normally difficult to break on a specific query, however, we have expanded the test suite so that a function called query_break is called with the line number line as parameter for every query that is run. This allows you to put a conditional breakpoint on a specific query. For example, if we want to break on line number 43 of the test file we can create the following break point:

gdb: break query_break if line==43
lldb: break -n query_break -c line==43

You can also skip certain queries from executing by placing mode skip in the file, followed by an optional mode unskip. Any queries between the two statements will not be executed.

Editors & Syntax Highlighting

The SQLLogicTests are not exactly an industry standard, but several other systems have adopted them as well. Parsing sqllogictests is intentionally simple. All statements have to be separated by empty lines. For that reason, writing a syntax highlighter is not extremely difficult.

A syntax highlighter exists for Visual Studio Code. We have also made a fork that supports the DuckDB dialect of the sqllogictests. You can use the fork by installing the original, then copying the syntaxes/sqllogictest.tmLanguage.json into the installed extension (on MacOS this is located in ~/.vscode/extensions/benesch.sqllogictest-0.1.1).

Catch Tests

While we prefer the sqllogic tests for testing most functionality, for certain tests only SQL is not sufficient. This typically happens when you want to test the C++ API, when you want to stress test the system (e.g. using multiple concurrent threads) or when you want to test persistent storage involving database restarts. When using pure SQL is really not an option it might be necessary to make a C++ test using Catch.

Catch tests reside in the test directory as well. Here is an example of a catch test that tests the storage of the system:

#include "catch.hpp"
#include "test_helpers.hpp"

TEST_CASE("Test simple storage", "[storage]") {
	auto config = GetTestConfig();
	unique_ptr<QueryResult> result;
	auto storage_database = TestCreatePath("storage_test");

	// make sure the database does not exist
	DeleteDatabase(storage_database);
	{
		// create a database and insert values
		DuckDB db(storage_database, config.get());
		Connection con(db);
		REQUIRE_NO_FAIL(con.Query("CREATE TABLE test (a INTEGER, b INTEGER);"));
		REQUIRE_NO_FAIL(con.Query("INSERT INTO test VALUES (11, 22), (13, 22), (12, 21), (NULL, NULL)"));
		REQUIRE_NO_FAIL(con.Query("CREATE TABLE test2 (a INTEGER);"));
		REQUIRE_NO_FAIL(con.Query("INSERT INTO test2 VALUES (13), (12), (11)"));
	}
	// reload the database from disk a few times
	for (idx_t i = 0; i < 2; i++) {
		DuckDB db(storage_database, config.get());
		Connection con(db);
		result = con.Query("SELECT * FROM test ORDER BY a");
		REQUIRE(CHECK_COLUMN(result, 0, {Value(), 11, 12, 13}));
		REQUIRE(CHECK_COLUMN(result, 1, {Value(), 22, 21, 22}));
		result = con.Query("SELECT * FROM test2 ORDER BY a");
		REQUIRE(CHECK_COLUMN(result, 0, {11, 12, 13}));
	}
	DeleteDatabase(storage_database);
}

The test uses the TEST_CASE wrapper to create each test. The database is created and queried using the C++ API. Results are checked using either REQUIRE_FAIL/REQUIRE_NO_FAIL (corresponding to statement ok and statement error) or REQUIRE(CHECK_COLUMN(...)) (corresponding to query with a result check). Every test that is created in this way needs to be added to the corresponding CMakeLists.txt