WHATWG URLPattern API for matching and extracting components from URLs using pattern syntax
Maintainer(s):
teaguesterling
Installing and Loading
INSTALL urlpattern FROM community;
LOAD urlpattern;
Example
-- Test if a URL matches a pattern
SELECT urlpattern_test('https://example.com/users/:id', 'https://example.com/users/123');
-- true
-- Extract a named group from a URL
SELECT urlpattern_extract('https://example.com/users/:id', 'https://example.com/users/123', 'id');
-- '123'
-- Get full match results with all components
SELECT urlpattern_exec('/posts/:slug', '/posts/hello-world');
-- {matched: true, pathname: '/posts/hello-world', groups: {slug: 'hello-world'}, ...}
-- Use the URLPATTERN type for validated patterns
SELECT urlpattern_test(urlpattern('/api/:version/*'), '/api/v2/users/list');
-- true
About urlpattern
The URLPattern extension implements the WHATWG URLPattern API for DuckDB, enabling powerful URL matching, extraction, parsing, and construction.
Features
- Custom URLPATTERN type with validation and implicit casting from VARCHAR
- Pattern matching with named groups (
:name), wildcards (*), and regex groups - URL parsing into components (protocol, host, path, query, hash)
- URL building and modification from components
- Query parameter extraction as MAP or individual values
- Pattern caching for improved performance (~320k matches/sec)
Key Functions
| Function | Description |
|---|---|
urlpattern_test(pattern, url) |
Test if URL matches pattern |
urlpattern_extract(pattern, url, group) |
Extract a named group |
urlpattern_exec(pattern, url) |
Get full match results as STRUCT |
url_parse(url) |
Parse URL into struct with all components |
url_build(...) |
Build URL from named components |
url_modify(url, ...) |
Modify existing URL components |
url_search_params(url) |
Get query parameters as MAP |
Pattern Syntax
URLPattern uses a syntax similar to Express.js routes:
:name- Named parameter (matches any segment)*- Wildcard (matches everything)/path/:id- Path-only patterns match any protocol/host
For full documentation, see duckdb-urlpattern.readthedocs.io.
Added Functions
| function_name | function_type | description | comment | examples |
|---|---|---|---|---|
| url_build | scalar | NULL | NULL | |
| url_hash | scalar | NULL | NULL | |
| url_host | scalar | NULL | NULL | |
| url_hostname | scalar | NULL | NULL | |
| url_href | scalar | NULL | NULL | |
| url_modify | scalar | NULL | NULL | |
| url_origin | scalar | NULL | NULL | |
| url_parse | scalar | NULL | NULL | |
| url_password | scalar | NULL | NULL | |
| url_pathname | scalar | NULL | NULL | |
| url_port | scalar | NULL | NULL | |
| url_protocol | scalar | NULL | NULL | |
| url_resolve | scalar | NULL | NULL | |
| url_search | scalar | NULL | NULL | |
| url_search_param | scalar | NULL | NULL | |
| url_search_params | scalar | NULL | NULL | |
| url_username | scalar | NULL | NULL | |
| url_valid | scalar | NULL | NULL | |
| urlpattern | scalar | NULL | NULL | |
| urlpattern_exec | scalar | NULL | NULL | |
| urlpattern_extract | scalar | NULL | NULL | |
| urlpattern_hash | scalar | NULL | NULL | |
| urlpattern_hostname | scalar | NULL | NULL | |
| urlpattern_init | scalar | NULL | NULL | |
| urlpattern_pathname | scalar | NULL | NULL | |
| urlpattern_port | scalar | NULL | NULL | |
| urlpattern_protocol | scalar | NULL | NULL | |
| urlpattern_search | scalar | NULL | NULL | |
| urlpattern_test | scalar | NULL | NULL |
Added Types
| type_name | type_size | logical_type | type_category | internal |
|---|---|---|---|---|
| URLPATTERN | 16 | VARCHAR | STRING | true |