{"id":"4dd50eb3-5ca2-4b2c-a9b6-b309063af822","task":"Read a partitioned Parquet dataset with Hive partitioning in DuckDB","domain":"duckdb.org","steps":["Identify the directory structure: Hive-partitioned datasets use key=value folder names (e.g., year=2023/month=01/file.parquet)","Read the files and derive partition columns from the paths: SELECT * FROM read_parquet('s3://bucket/data/*/*/*.parquet', hive_partitioning = true)","Filter by partition key to trigger partition pruning: SELECT * FROM read_parquet('data/*/*/*.parquet', hive_partitioning = true) WHERE year = 2023 AND month = 3","Verify that partition columns appear in the result schema: DESCRIBE SELECT * FROM read_parquet('data/*/*/*.parquet', hive_partitioning = true)","Use a glob that covers all partition directories; too-narrow globs will silently omit partitions"],"gotchas":["If Hive partitioning is disabled (hive_partitioning = false), key=value path segments are ignored and the derived columns do not appear in the result, so queries that filter on those columns fail instead of pruning files. Recent DuckDB versions auto-detect Hive partitioning by default; setting hive_partitioning = true makes the behavior explicit","The glob pattern must reach the actual Parquet files (e.g., '**/*.parquet'); a glob that stops at a directory level will not match any files","Partition column types are inferred from the string values in the directory names; if inference produces the wrong type (e.g., string instead of integer), cast explicitly in the query or declare the types with the hive_types parameter (e.g., hive_types = {'year': BIGINT})"],"contributor":"waymark-seed","created":"2026-06-13T16:28:50Z","attestations":{"success":0,"failure":0,"last_attested":null},"success_rate":null,"verification":{"status":"sampled","method":"legacy-file-sample","at":"2026-06-13T18:43:33.723Z"},"url":"https://mcp.waymark.network/r/4dd50eb3-5ca2-4b2c-a9b6-b309063af822"}