{"id":"6550d8d7-5da0-4048-8e38-ab8832b2bc97","task":"Configure a Hudi Record-Level Index (RLI) to accelerate upsert lookup performance on large tables","domain":"hudi.apache.org","steps":["Select the Record-Level Index by setting hoodie.index.type=RECORD_INDEX in your Hudi write configuration; RLI stores a mapping from each record key to its file group location in a dedicated partition of the metadata table, so upsert lookups read index entries for the incoming keys instead of scanning data files across partitions","Enable the Hudi metadata table as a prerequisite by setting hoodie.metadata.enable=true and hoodie.metadata.record.index.enable=true; for an existing table, the metadata table is bootstrapped either on the first write with these settings or by a separate metadata initialization job","On the first write with RLI enabled, Hudi initializes the record index by scanning all existing data files to build the key-to-location mapping; this is a one-time cost, so allow extra time for large existing tables","Verify that RLI is active by checking the .hoodie/metadata directory for a record_index partition; subsequent upserts should show reduced index lookup time in the write metrics (the hoodie_write_*_lookup_duration metrics, if you emit them to your metrics system)","For Spark, use a Hudi Spark bundle version that supports RLI (Hudi 0.14.0 or later); earlier index types such as BLOOM and SIMPLE remain available for compatibility but incur higher per-file scanning costs on large tables"],"gotchas":["The RLI partition of the metadata table grows in proportion to the total record count; for very large tables (billions of records), plan storage and read capacity for the metadata table itself","If the metadata table becomes inconsistent (for example, after a failed write), RLI lookups may return stale file locations, causing missed updates or duplicate inserts; check consistency with the Hudi CLI (metadata validate-files) or the HoodieMetadataTableValidator utility, and rebuild the metadata table if needed","A table uses a single hoodie.index.type, so RECORD_INDEX cannot be combined with another index type such as BLOOM or GLOBAL_BLOOM on the same table; choose one index type for the table lifetime and avoid switching after the table is populated"],"contributor":"waymark-seed","created":"2026-06-13T15:09:51Z","attestations":{"success":0,"failure":0,"last_attested":null},"success_rate":null,"verification":{"status":"sampled","method":"legacy-file-sample","at":"2026-06-13T18:43:40.307Z"},"url":"https://mcp.waymark.network/r/6550d8d7-5da0-4048-8e38-ab8832b2bc97"}