Install the DuckDB Iceberg extension: INSTALL iceberg; LOAD iceberg; then query an Iceberg table by pointing to its metadata.json path: SELECT * FROM iceberg_scan('path/to/table/metadata/v1.metadata.json').
For Delta Lake tables, install the Delta extension: INSTALL delta; LOAD delta; then use delta_scan('path/to/delta/table') to read the latest snapshot based on the _delta_log transaction log.
For S3-hosted tables, load DuckDB's httpfs extension and set s3_region, s3_access_key_id, and s3_secret_access_key before calling iceberg_scan or delta_scan with an s3:// URI; keep only short placeholder values in config files rather than real credentials.
Use DuckDB's COPY ... TO syntax to export query results to Parquet so they can be shared without running a cluster; you can also join scanned warehouse tables with local Parquet files in the same query.
For Iceberg, use the iceberg_metadata('path') and iceberg_snapshots('path') helper functions to inspect file-level statistics and snapshot history, respectively, without scanning the data itself.
Pin DuckDB and extension versions in your development environment; extension APIs for Iceberg and Delta change across minor versions and may break existing scan paths.
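The steps above can be strung together as a short script. What follows is a minimal sketch in Python, not a definitive implementation: it only builds the SQL statements described above as strings, so the builder itself runs without DuckDB installed. The helper names (sql_literal, scan_statements, run), the env parameter, and the choice to read credentials from the standard AWS environment variables are assumptions made for illustration; they are not part of DuckDB.

```python
import os


def sql_literal(value):
    """Quote a Python string as a SQL string literal by doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def scan_statements(kind, table_path, out_parquet, env=None):
    """Return the SQL statements that export one table snapshot to Parquet.

    kind is "iceberg" (table_path is a metadata.json path) or
    "delta" (table_path is the table root that contains _delta_log).
    """
    scans = {"iceberg": "iceberg_scan", "delta": "delta_scan"}
    if kind not in scans:
        raise ValueError(f"unsupported table format: {kind!r}")
    env = os.environ if env is None else env

    stmts = [f"INSTALL {kind}", f"LOAD {kind}"]
    if table_path.startswith("s3://"):
        stmts += ["INSTALL httpfs", "LOAD httpfs"]
        # Assumption: credentials live in the usual AWS environment variables,
        # so no real secret has to be written into a config file.
        settings = {
            "s3_region": "AWS_DEFAULT_REGION",
            "s3_access_key_id": "AWS_ACCESS_KEY_ID",
            "s3_secret_access_key": "AWS_SECRET_ACCESS_KEY",
        }
        for setting, var in settings.items():
            if var in env:
                stmts.append(f"SET {setting} = {sql_literal(env[var])}")
    stmts.append(
        f"COPY (SELECT * FROM {scans[kind]}({sql_literal(table_path)})) "
        f"TO {sql_literal(out_parquet)} (FORMAT PARQUET)"
    )
    return stmts


def run(stmts):
    """Execute the statements in order on one in-memory connection."""
    import duckdb  # third-party; imported lazily so the builders work without it

    con = duckdb.connect()
    for stmt in stmts:
        con.execute(stmt)
    return con
```

For example, run(scan_statements("iceberg", "path/to/table/metadata/v1.metadata.json", "result.parquet")) would write the scanned snapshot to result.parquet, assuming the duckdb package and the extension versions you pinned are available.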
Known gotchas
DuckDB reads Iceberg and Delta via direct file access and does not use a catalog service; changes (including schema changes) committed after the metadata file you supplied was written may not be reflected until you re-resolve the latest metadata path.
DuckDB's Delta extension uses the delta-kernel-rs library; complex Delta features such as deletion vectors or column mapping mode may not be fully supported in older extension versions, so check the release notes.
DuckDB is single-process and in-memory by default; scanning very large Iceberg/Delta tables without a spill-to-disk configuration (SET temp_directory) can exhaust RAM.
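Two of the gotchas above lend themselves to small helpers. The sketch below is hedged: latest_metadata_path and spill_statements are hypothetical names, the resolver only handles a local metadata directory (an s3:// location would need a bucket listing instead), and it assumes the two common Iceberg file naming schemes, v<N>.metadata.json and <sequence>-<uuid>.metadata.json. The '4GB' memory limit is an illustrative value, not a recommendation.

```python
import re
from pathlib import Path

# File-system tables write v<N>.metadata.json; catalog-managed tables
# typically write <sequence>-<uuid>.metadata.json.
_VERSIONED = re.compile(r"^v(\d+)\.metadata\.json$")
_SEQUENCED = re.compile(r"^(\d+)-.+\.metadata\.json$")


def latest_metadata_path(metadata_dir):
    """Return the metadata file with the highest version number, or None."""
    best = None  # (version, path) of the newest file seen so far
    for path in Path(metadata_dir).iterdir():
        match = _VERSIONED.match(path.name) or _SEQUENCED.match(path.name)
        if match is None:
            continue  # skip manifests, snapshots, and other files
        version = int(match.group(1))  # numeric compare, so v10 beats v2
        if best is None or version > best[0]:
            best = (version, path)
    return best[1] if best is not None else None


def spill_statements(temp_dir, memory_limit="4GB"):
    """Build SET statements that let DuckDB spill to disk under a memory cap."""

    def quote(value):
        return "'" + value.replace("'", "''") + "'"

    return [
        f"SET temp_directory = {quote(temp_dir)}",
        f"SET memory_limit = {quote(memory_limit)}",
    ]
```

Calling latest_metadata_path immediately before each iceberg_scan keeps the scan on the newest local metadata file, and running the statements from spill_statements first gives large scans a place to spill.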