Mark Raasveldt
2025-07-04 · 5 min
TL;DR: We released the DuckLake v0.2 standard, which is supported by DuckDB v1.3.1's ducklake extension.
We released DuckLake a little over a month ago, and we were overwhelmed by the positive response from the community. Naturally, we also received a bunch of feature requests. With DuckLake version 0.2, we are happy to present the improvements we have made since then!
We would like to start with a quick reminder: the term DuckLake refers to both the DuckLake open standard and the ducklake DuckDB extension. In this blog post, we discuss the updates affecting both of these.
Updates in the DuckDB ducklake Extension
Secrets for Managing DuckLake Credentials
Secrets can now be used to manage DuckLake credentials. Secrets are created using the standard secret infrastructure and contain all the information needed to connect to a given DuckLake instance. When connecting to DuckLake, the secret name can be passed in as a parameter, and the connection information is then loaded from the secret.
Named Secrets
You can pass a named secret to DuckLake as follows:
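```sql
-- create a named secret holding the connection information;
-- 'metadata.ducklake' and 'my_files/' are placeholder paths
CREATE SECRET my_secret (
    TYPE ducklake,
    METADATA_PATH 'metadata.ducklake',
    DATA_PATH 'my_files/'
);
-- connect by passing the secret name in the connection string
ATTACH 'ducklake:my_secret' AS my_ducklake;
```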
Unnamed Secrets
In addition to named secrets, there can also be one unnamed secret. When connecting using an empty connection string, this unnamed secret is used to connect:
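```sql
-- an unnamed secret; only one of these can exist at a time
-- (the paths are again placeholders)
CREATE SECRET (
    TYPE ducklake,
    METADATA_PATH 'metadata.ducklake',
    DATA_PATH 'my_files/'
);
-- an empty connection string falls back to the unnamed secret
ATTACH 'ducklake:' AS my_ducklake;
```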
Settings
This release adds support for many more DuckLake settings, in particular around the writing of Parquet files. The following settings are now supported. Settings can be scoped globally for the entire DuckLake, or set at the per-schema or per-table level. The settings are persisted in the DuckLake's ducklake_metadata table.
- parquet_compression: Compression algorithm for Parquet files (uncompressed, snappy, gzip, zstd, brotli, lz4).
- parquet_version: Parquet format version (1 or 2).
- parquet_compression_level: Compression level for Parquet files.
- parquet_row_group_size: Number of rows per row group in Parquet files.
- parquet_row_group_size_bytes: Number of bytes per row group in Parquet files.
- target_file_size: The target data file size for insertion and compaction operations.
Settings can be set using the set_option method:
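```sql
-- set the Parquet compression codec for the entire DuckLake
-- (my_ducklake is a previously attached DuckLake)
CALL my_ducklake.set_option('parquet_compression', 'zstd');
```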
The settings for a given DuckLake can be read using the new options function:
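```sql
-- list the options that are set for this DuckLake
FROM my_ducklake.options();
```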
List Files
We added the ducklake_list_files function. This function returns the set of data files, together with their corresponding delete files, that must be read in order to scan a table at a given snapshot. This can be helpful for integrating DuckLake into other systems.
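For example, using illustrative catalog and table names:

```sql
-- list the data files (and any delete files that apply to them)
-- required to scan my_table at the current snapshot
FROM ducklake_list_files('my_ducklake', 'my_table');
```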
Directly Attaching DuckLake
This release also adds support for directly attaching DuckLake using the duckdb CLI and other clients:
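```bash
# open the DuckDB CLI directly on a DuckLake;
# 'metadata.ducklake' is a placeholder for your metadata file
duckdb ducklake:metadata.ducklake
```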
Updates in the DuckLake 0.2 Standard
Relative Schema/Table Paths
In the old DuckLake standard, paths were always relative to the global data path. In the DuckLake v0.2 standard, the locations to which data and delete files are written have three layers:
- Data paths are relative to the table path
- Table paths are relative to the schema path
- Schema paths are relative to the global data path
This allows data files to be written in a more structured manner. By default, the schema and table names are used as the paths to which the files are written. For example:
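```sql
-- illustrative names; the exact data file name pattern may differ
ATTACH 'ducklake:metadata.ducklake' AS my_ducklake (DATA_PATH 'my_files/');
CREATE SCHEMA my_ducklake.my_schema;
CREATE TABLE my_ducklake.my_schema.my_table (col INTEGER);
INSERT INTO my_ducklake.my_schema.my_table VALUES (42);
-- the new data file lands under <data path>/<schema>/<table>/, e.g.
-- my_files/my_schema/my_table/ducklake-<uuid>.parquet
```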
The paths are stored in the ducklake_table and ducklake_schema tables.
By writing all files for a given table in a given subdirectory, it is now possible to use prefix-based access control at the object store level to grant users access to only specific schemas or tables in the database.
Name Mapping and Adding Existing Parquet Files
DuckLake by default uses field ids to perform column mapping. When writing Parquet files through DuckLake, each column contains a field id that indicates to which top-level column it belongs. This allows DuckLake to perform metadata-only alter operations such as renaming and dropping fields.
When registering existing Parquet files written by other systems, these files generally do not have field identifiers written to them. To facilitate using such Parquet files in DuckLake, v0.2 adds a new way of mapping columns: name mapping. This allows registering Parquet files as follows:
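```sql
-- a table plus a Parquet file written outside of DuckLake
-- (file and table names are illustrative)
ATTACH 'ducklake:metadata.ducklake' AS my_ducklake;
CREATE TABLE my_ducklake.my_table (col1 INTEGER, col2 VARCHAR);
COPY (SELECT 42 AS col1, 'hello' AS col2) TO 'my_file.parquet';
-- register the existing file with the table; as the file has no
-- field ids, a name mapping is created for it
CALL ducklake_add_data_files('my_ducklake', 'my_table', 'my_file.parquet');
```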
Every file that is added to DuckLake has an optional mapping_id that tells the system how column names map to field ids. As this is done on a per-file basis, Parquet files can always be added to DuckLake without restriction. All DuckLake operations are supported on added files, including schema evolution and data change feeds.
Settings
DuckLake v0.2 adds support for scoped settings. The settings stored in the ducklake_metadata table now have an optional scope field, together with a scope_id. This allows settings to be scoped at either the schema or table level, instead of requiring all settings to be global.
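As an illustration of what this looks like in the metadata, a table-scoped setting might be stored along these lines (the row below is a sketch, with scope_id referring to a table id):

```sql
SELECT key, value, scope, scope_id
FROM ducklake_metadata;
-- key                  value  scope  scope_id
-- parquet_compression  zstd   table  42
-- (a NULL scope means the setting applies globally)
```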
Partition Transforms
This release adds support for the year/month/day/hour partition transforms, allowing these functions to be used to partition data directly instead of requiring separate columns that contain the extracted values.
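For example, assuming a table with a TIMESTAMP column ts (a sketch of the syntax):

```sql
-- partition directly on the year and month of the timestamp,
-- without materializing separate year/month columns
ALTER TABLE my_ducklake.my_table SET PARTITIONED BY (year(ts), month(ts));
```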
Migration Guide
When connecting DuckDB to a DuckLake that uses v0.1, a migration is automatically triggered that updates the DuckLake to v0.2. The migration issues SQL statements along the following lines (a sketch reconstructed from the v0.2 schema changes described above; the exact statements may differ):
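```sql
-- add schema- and table-level path columns for relative paths
ALTER TABLE ducklake_schema ADD COLUMN path VARCHAR;
ALTER TABLE ducklake_schema ADD COLUMN path_is_relative BOOLEAN;
ALTER TABLE ducklake_table ADD COLUMN path VARCHAR;
ALTER TABLE ducklake_table ADD COLUMN path_is_relative BOOLEAN;
-- add the mapping_id used when registering existing Parquet files
ALTER TABLE ducklake_data_file ADD COLUMN mapping_id BIGINT;
-- add scoping for settings
ALTER TABLE ducklake_metadata ADD COLUMN scope VARCHAR;
ALTER TABLE ducklake_metadata ADD COLUMN scope_id BIGINT;
-- new tables backing name mapping
CREATE TABLE ducklake_column_mapping (
    mapping_id BIGINT,
    table_id BIGINT,
    type VARCHAR
);
CREATE TABLE ducklake_name_mapping (
    mapping_id BIGINT,
    column_id BIGINT,
    source_name VARCHAR,
    target_field_id BIGINT,
    parent_column BIGINT
);
-- bump the stored format version
UPDATE ducklake_metadata SET value = '0.2' WHERE key = 'version';
```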