WHATWG URLPattern API for matching and extracting components from URLs using pattern syntax
Maintainer(s):
teaguesterling
Installing and Loading
INSTALL urlpattern FROM community;
LOAD urlpattern;
Example
-- Test if a URL matches a pattern
SELECT urlpattern_test('https://example.com/users/:id', 'https://example.com/users/123');
-- true
-- Extract a named group from a URL
SELECT urlpattern_extract('https://example.com/users/:id', 'https://example.com/users/123', 'id');
-- '123'
-- Get full match results with all components
SELECT urlpattern_exec('/posts/:slug', '/posts/hello-world');
-- {matched: true, pathname: '/posts/hello-world', groups: {slug: 'hello-world'}, ...}
-- Use the URLPATTERN type for validated patterns
SELECT urlpattern_test(urlpattern('/api/:version/*'), '/api/v2/users/list');
-- true
About urlpattern
The URLPattern extension implements the WHATWG URLPattern API for DuckDB, enabling powerful URL matching and component extraction using a standard pattern syntax.
Features
- Custom URLPATTERN type with validation and implicit casting from VARCHAR
- Pattern matching with named groups (
:name), wildcards (*), and regex groups - Component extraction for protocol, hostname, port, pathname, search, and hash
- Pattern caching for improved performance on repeated patterns
Functions
| Function | Description |
|---|---|
urlpattern(pattern) |
Create a validated URLPATTERN |
urlpattern_test(pattern, url) |
Test if URL matches pattern |
urlpattern_extract(pattern, url, group) |
Extract a named group |
urlpattern_exec(pattern, url) |
Get full match results as STRUCT |
urlpattern_pathname(pattern) |
Get pathname component of pattern |
urlpattern_protocol(pattern) |
Get protocol component of pattern |
urlpattern_hostname(pattern) |
Get hostname component of pattern |
Pattern Syntax
URLPattern uses a syntax similar to Express.js routes:
:name- Named parameter (matches any segment)*- Wildcard (matches everything):name(regex)- Named parameter with regex constraint{optionalGroup}- Optional group
For full pattern syntax, see the URLPattern specification.
Added Functions
| function_name | function_type | description | comment | examples |
|---|---|---|---|---|
| urlpattern | scalar | NULL | NULL | |
| urlpattern_exec | scalar | NULL | NULL | |
| urlpattern_extract | scalar | NULL | NULL | |
| urlpattern_hash | scalar | NULL | NULL | |
| urlpattern_hostname | scalar | NULL | NULL | |
| urlpattern_pathname | scalar | NULL | NULL | |
| urlpattern_port | scalar | NULL | NULL | |
| urlpattern_protocol | scalar | NULL | NULL | |
| urlpattern_search | scalar | NULL | NULL | |
| urlpattern_test | scalar | NULL | NULL |
Added Types
| type_name | type_size | logical_type | type_category | internal |
|---|---|---|---|---|
| URLPATTERN | 16 | VARCHAR | STRING | true |