Steps

Install and load the httpfs extension: INSTALL httpfs; LOAD httpfs;
Configure S3 credentials: SET s3_region='us-east-1'; SET s3_access_key_id='<key>'; SET s3_secret_access_key='<secret>'; for MinIO or other S3-compatible stores, additionally SET s3_endpoint to the store's host
Read a Parquet file directly from S3: SELECT * FROM read_parquet('s3://my-bucket/data/events_2025.parquet') LIMIT 100
Use glob patterns to read multiple partitioned files: SELECT * FROM read_parquet('s3://my-bucket/data/year=2025/month=*/events.parquet')
Read a Parquet file over HTTPS without credentials: SELECT * FROM read_parquet('https://example.com/public/dataset.parquet')
Leverage projection pushdown by selecting only the columns you need and predicate pushdown by adding WHERE clauses; DuckDB then fetches only the required columns and row groups from the remote file instead of downloading it whole
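The steps above can be sketched from Python as a pair of small helpers that build the same SQL. This is a minimal illustration, not an official client: the helper names (sql_literal, setup_statements, pushdown_query, run) and the column names user_id, event_type and event_time are assumptions made for the example, and run() needs the third-party duckdb package plus real credentials.

```python
from typing import List, Optional, Sequence


def sql_literal(value: str) -> str:
    """Quote a Python string as a SQL string literal (doubles single quotes)."""
    return "'" + value.replace("'", "''") + "'"


def setup_statements(region: str, key_id: str, secret: str,
                     endpoint: Optional[str] = None) -> List[str]:
    """Statements for the install/load and credential steps, in order."""
    statements = [
        "INSTALL httpfs",
        "LOAD httpfs",
        f"SET s3_region={sql_literal(region)}",
        f"SET s3_access_key_id={sql_literal(key_id)}",
        f"SET s3_secret_access_key={sql_literal(secret)}",
    ]
    if endpoint is not None:
        # Only needed for MinIO or other S3-compatible stores.
        statements.append(f"SET s3_endpoint={sql_literal(endpoint)}")
    return statements


def pushdown_query(url: str, columns: Sequence[str],
                   where: Optional[str] = None,
                   limit: Optional[int] = None) -> str:
    """Build a read_parquet query that names its columns (projection pushdown)
    and optionally filters rows (predicate pushdown).

    `columns` and `where` are inserted as-is, so pass trusted SQL only.
    """
    sql = f"SELECT {', '.join(columns)} FROM read_parquet({sql_literal(url)})"
    if where:
        sql += f" WHERE {where}"
    if limit is not None:
        sql += f" LIMIT {int(limit)}"
    return sql


def run(statements: List[str], query: str):
    """Execute the setup statements, then the query, on an in-memory database."""
    import duckdb  # third-party package, imported lazily: pip install duckdb

    con = duckdb.connect()
    for statement in statements:
        con.execute(statement)
    return con.execute(query).fetchall()


if __name__ == "__main__":
    # Prints the generated SQL only; nothing is sent over the network here.
    print(pushdown_query(
        "s3://my-bucket/data/events_2025.parquet",
        ["user_id", "event_time"],        # hypothetical column names
        where="event_type = 'click'",     # hypothetical filter
        limit=100,
    ))
```

To actually query the bucket, pass setup_statements("us-east-1", "<key>", "<secret>") and the generated query to run(). Naming columns explicitly, rather than using SELECT *, is what lets the projection pushdown described above take effect.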

Known gotchas

The httpfs extension is not autoloaded in all DuckDB versions; unless autoload_known_extensions is enabled, explicitly run LOAD httpfs; at the start of each session
Large remote Parquet scans are limited by network bandwidth and row group size; if the remote file lacks statistics in the Parquet footer, predicate pushdown cannot prune row groups
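S3 credentials set via SET commands are session-scoped and not persisted; for persistent configuration, put the SET statements in a .duckdbrc file (read by the DuckDB CLI at startup) or use the DuckDB secrets manager with CREATE PERSISTENT SECRET, since a plain CREATE SECRET is temporary and is also lost when the session ends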
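As a rough sketch of the secrets-manager route, the helper below builds a CREATE PERSISTENT SECRET statement for S3. The function name persistent_s3_secret_sql and the secret name my_s3 are made up for illustration; the statement would still need to be executed once in a DuckDB session, and persistent secrets are stored on local disk, so treat that location as sensitive.

```python
from typing import Optional


def sql_literal(value: str) -> str:
    """Quote a Python string as a SQL string literal (doubles single quotes)."""
    return "'" + value.replace("'", "''") + "'"


def persistent_s3_secret_sql(name: str, key_id: str, secret: str, region: str,
                             endpoint: Optional[str] = None) -> str:
    """Build a CREATE PERSISTENT SECRET statement for S3 credentials."""
    if not name.isidentifier():
        # The secret name is used unquoted, so restrict it to a plain identifier.
        raise ValueError(f"invalid secret name: {name!r}")
    params = [
        "TYPE s3",
        f"KEY_ID {sql_literal(key_id)}",
        f"SECRET {sql_literal(secret)}",
        f"REGION {sql_literal(region)}",
    ]
    if endpoint is not None:
        # Only needed for MinIO or other S3-compatible stores.
        params.append(f"ENDPOINT {sql_literal(endpoint)}")
    body = ",\n    ".join(params)
    return f"CREATE PERSISTENT SECRET {name} (\n    {body}\n)"


if __name__ == "__main__":
    # Prints the statement; run it once in DuckDB to store the secret.
    print(persistent_s3_secret_sql("my_s3", "<key>", "<secret>", "us-east-1"))
```

Once such a secret exists, later sessions should be able to read s3:// paths without repeating the SET commands, though httpfs still has to be loaded or autoloaded.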



Read remote Parquet files from S3 and HTTP sources in DuckDB using the httpfs extension
