Set up an email ingestion pipeline using IMAP, a dedicated forwarding address, or a Gmail/Outlook API integration to receive forwarded confirmation emails from travelers.
Apply an LLM or trained extraction model to each email body to extract structured fields: confirmation number, carrier/supplier name, origin, destination, departure datetime, arrival datetime, passenger name, and booking status.
Normalize extracted datetimes to UTC and validate them against each other (e.g., departure must precede arrival); flag low-confidence extractions for human review.
Deduplicate by matching confirmation numbers against existing itinerary records; update existing segments rather than creating duplicates when re-confirmations arrive.
Enrich the parsed segments with IATA codes and timezone data from a reference dataset to ensure consistency across bookings parsed from different email formats.
Store the raw email alongside the parsed data for audit, and link each extracted itinerary item back to its source email for dispute resolution.
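The validation, review-flagging, and deduplication steps above can be sketched roughly as follows. This is a minimal illustration rather than a definitive implementation: the `Segment` fields, the `validation_issues` and `upsert` helpers, the `REVIEW_THRESHOLD` value, and the in-memory dict standing in for the itinerary store are all assumptions. Because one confirmation number can cover several flight legs, the sketch keys records on confirmation number plus origin and destination, which is also an assumption about how segments are identified.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

REVIEW_THRESHOLD = 0.8  # assumed cutoff; tune against your own extraction results


@dataclass
class Segment:
    confirmation_number: str
    carrier: str
    origin: str
    destination: str
    departure_utc: datetime
    arrival_utc: datetime
    passenger_name: str
    status: str
    confidence: float      # hypothetical extractor score in [0, 1]
    source_email_id: str   # link back to the stored raw email


def validation_issues(seg):
    """Return a list of reasons this segment should go to human review."""
    issues = []
    if seg.departure_utc.tzinfo is None or seg.arrival_utc.tzinfo is None:
        issues.append("naive datetime")
    elif seg.departure_utc >= seg.arrival_utc:
        issues.append("departure not before arrival")
    if seg.confidence < REVIEW_THRESHOLD:
        issues.append("low confidence")
    return issues


def upsert(store, seg):
    """Insert or overwrite a segment; returns 'created' or 'updated'."""
    key = (seg.confirmation_number.strip().upper(), seg.origin, seg.destination)
    action = "updated" if key in store else "created"
    store[key] = seg  # a re-confirmation replaces the earlier record
    return action
```

A caller would typically run `validation_issues` first and only pass segments with an empty result to `upsert`, routing the rest to a review queue.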
Known gotchas
Confirmation emails have no standard format. Airline email templates change without notice, which breaks regex-based parsers; LLM-based extraction degrades more gracefully but still requires confidence scoring.
HTML emails with multi-part MIME require careful text extraction. Table-based layouts often produce garbled text when the HTML is stripped, so use a proper HTML-to-text converter rather than naive regex.
Some confirmations arrive with local times and no timezone offset; always cross-reference with airport IANA timezones rather than assuming UTC or the sender's timezone.
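Two of the gotchas above lend themselves to a short sketch using only the Python standard library. The first part keeps table cells and rows apart when converting HTML to text; it is an illustration of the idea, not a substitute for a maintained HTML-to-text converter. The second part resolves a local time with no offset by looking up the airport's IANA timezone. The `TableAwareText` class, the `html_to_text` and `local_to_utc` helpers, the three-entry `AIRPORT_TZ` table, and the `YYYY-MM-DD HH:MM` input format are all assumptions made for this example; a real pipeline would load airport timezones from a reference dataset.

```python
from datetime import datetime, timezone
from html.parser import HTMLParser
from zoneinfo import ZoneInfo

# Illustrative subset only; load the full mapping from a reference dataset.
AIRPORT_TZ = {"JFK": "America/New_York", "LHR": "Europe/London", "NRT": "Asia/Tokyo"}


class TableAwareText(HTMLParser):
    """Collect text, separating table cells with ' | ' and rows with newlines."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0  # depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag in ("td", "th"):
            self.parts.append(" | ")
        elif tag in ("tr", "br", "p", "div"):
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def html_to_text(html):
    """Convert an HTML body to plain text, one table row per line."""
    parser = TableAwareText()
    parser.feed(html)
    parser.close()
    lines = "".join(parser.parts).splitlines()
    cleaned = [" ".join(line.split()).strip("| ") for line in lines]
    return "\n".join(line for line in cleaned if line)


def local_to_utc(local_text, iata):
    """Interpret 'YYYY-MM-DD HH:MM' as local time at the airport; None if unknown."""
    tz_name = AIRPORT_TZ.get(iata.upper())
    if tz_name is None:
        return None  # unknown airport: flag for review instead of guessing
    naive = datetime.strptime(local_text, "%Y-%m-%d %H:%M")
    return naive.replace(tzinfo=ZoneInfo(tz_name)).astimezone(timezone.utc)
```

Returning `None` for an unknown airport, rather than falling back to UTC, keeps the failure visible so the segment can be routed to human review.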