Steps

Install and load the httpfs extension: INSTALL httpfs; LOAD httpfs;
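Configure AWS credentials within DuckDB using SET s3_region='REGION'; SET s3_access_key_id='YOUR_KEY_ID'; SET s3_secret_access_key='YOUR_SECRET'; or, to read credentials from environment variables or instance metadata, create a secret backed by the AWS credential chain: CREATE SECRET (TYPE s3, PROVIDER credential_chain); (this provider relies on the aws extension).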
Query the Parquet file directly using standard SQL: SELECT * FROM read_parquet('s3://{bucket}/{path}/file.parquet'); or use a glob pattern for a prefix: read_parquet('s3://{bucket}/{prefix}/*.parquet').
For partitioned datasets, use the hive_partitioning option: read_parquet('s3://.../*.parquet', hive_partitioning=true) to expose partition columns.
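Optionally create a view over the remote data: CREATE VIEW remote_data AS SELECT * FROM read_parquet('s3://...'); or persist the results locally: CREATE TABLE local_copy AS SELECT * FROM read_parquet('s3://...');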
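
The steps above can be sketched as one session. This is a minimal sketch, not a verified recipe: the bucket my-bucket, the prefixes data/ and events/, the region, the partition key event_date, and the table and view names are all hypothetical placeholders, and the credentials shown are dummies.

```sql
-- Step 1: install once, load per session.
INSTALL httpfs;
LOAD httpfs;

-- Step 2: session-level credentials (placeholder values).
SET s3_region = 'us-east-1';
SET s3_access_key_id = 'YOUR_KEY_ID';
SET s3_secret_access_key = 'YOUR_SECRET';

-- Step 3: query a single file, then every Parquet file under a prefix.
SELECT * FROM read_parquet('s3://my-bucket/data/file.parquet') LIMIT 10;
SELECT count(*) FROM read_parquet('s3://my-bucket/data/*.parquet');

-- Step 4: assumed layout s3://my-bucket/events/event_date=2024-01-01/part-0.parquet
-- hive_partitioning exposes event_date as a regular column.
SELECT event_date, count(*) AS row_count
FROM read_parquet('s3://my-bucket/events/*/*.parquet', hive_partitioning = true)
GROUP BY event_date
ORDER BY event_date;

-- Step 5: a view re-reads S3 on every query; a table copies the data locally once.
CREATE VIEW remote_events AS
    SELECT * FROM read_parquet('s3://my-bucket/events/*/*.parquet', hive_partitioning = true);
CREATE TABLE local_copy AS
    SELECT * FROM read_parquet('s3://my-bucket/data/*.parquet');
```

A view suits data that changes upstream; a local table is usually the better choice when the same remote files are queried repeatedly.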

Known gotchas

The httpfs extension must be loaded in each session unless autoloading is enabled; without it, queries against s3:// paths fail with a file-not-found or unsupported-protocol error.
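DuckDB's S3 support addresses buckets with virtual-hosted-style URLs by default (s3_url_style='vhost'); S3-compatible services that expect path-style URLs typically need SET s3_url_style='path' together with a matching s3_endpoint. For AWS itself, make sure s3_region matches the bucket's region.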
Large remote Parquet reads benefit from column projection and filter pushdown: select only the columns you need and apply WHERE clauses so that DuckDB can skip unneeded columns and row groups instead of reading entire files over the network.
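
The last two gotchas might look like this in practice. Treat it as a sketch under assumptions: the endpoint localhost:9000 stands in for some S3-compatible service, and the bucket, the columns user_id and amount, and the partition key event_date are hypothetical.

```sql
-- Assumed S3-compatible endpoint that expects path-style URLs over plain HTTP.
SET s3_endpoint = 'localhost:9000';
SET s3_url_style = 'path';
SET s3_use_ssl = false;

-- Projection and filter pushdown: name only the needed columns and filter early.
-- The event_date predicate can prune whole partitions; the amount predicate
-- can skip row groups using Parquet statistics.
SELECT user_id, amount
FROM read_parquet('s3://my-bucket/events/*/*.parquet', hive_partitioning = true)
WHERE event_date = DATE '2024-01-01'
  AND amount > 100;

-- EXPLAIN shows which columns and filters reach the Parquet scan.
EXPLAIN
SELECT user_id, amount
FROM read_parquet('s3://my-bucket/events/*/*.parquet', hive_partitioning = true)
WHERE event_date = DATE '2024-01-01'
  AND amount > 100;
```

If the plan lists the filters and a short projection inside the Parquet scan node, the query is not pulling full files over the network.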

duckdb.org · 5 steps · unrated



DuckDB query Parquet directly on S3
