Define retention schedules by document category (e.g., contracts: 7 years post-expiry, HR records: varies by jurisdiction, financial records: 7 years) in a retention policy registry; have the policy reviewed and approved by legal counsel.
Tag every document at ingestion with a document_category, creation_date, and computed retain_until date derived from the retention schedule; store these as metadata alongside the document.
Run a daily scheduled job to identify documents where retain_until < today and legal_hold_active = false; generate a deletion candidate list for review.
Require a second approval step before any permanent deletion, including automated deletion: either human review or a rule-based check, depending on the document's sensitivity tier. Log each deletion with the document ID, a hash of the content at deletion time, the deletion timestamp, and the authorizing actor.
Check each deletion candidate against all active legal holds before deletion; documents under hold must never be deleted regardless of retention schedule expiry.
Retain deletion logs permanently (they are evidence of compliant destruction); store them separately from the deleted documents themselves.
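The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the `Document` class, the `RETENTION_YEARS` values, and the function names are hypothetical, the actual removal from storage is left to the caller, and real schedules must come from the legally approved policy registry.

```python
import hashlib
from dataclasses import dataclass
from datetime import date, datetime, timezone
from typing import Optional

# Hypothetical schedule (years). Real values come from the approved registry.
RETENTION_YEARS = {"contract": 7, "financial": 7}


@dataclass
class Document:
    doc_id: str
    document_category: str
    creation_date: date
    content: bytes
    retain_until: Optional[date] = None
    legal_hold_active: bool = False


def add_years(start: date, years: int) -> date:
    """Add whole years; Feb 29 rolls forward to Mar 1 so we never under-retain."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        return date(start.year + years, 3, 1)


def tag_document(doc: Document, schedule_start: Optional[date] = None) -> Document:
    """Compute retain_until at ingestion.

    schedule_start lets the clock start at an event other than creation,
    e.g. a contract's expiry date; it defaults to creation_date.
    """
    start = schedule_start or doc.creation_date
    doc.retain_until = add_years(start, RETENTION_YEARS[doc.document_category])
    return doc


def deletion_candidates(docs, today: date):
    """Daily job: expired documents that are not flagged as under hold."""
    return [
        d for d in docs
        if d.retain_until is not None
        and d.retain_until < today
        and not d.legal_hold_active
    ]


def record_deletion(doc: Document, approver: str, active_hold_ids, deletion_log):
    """Hard-gate on holds and approval, then append a deletion log entry."""
    # Re-check holds at deletion time; the candidate list may be stale.
    if doc.legal_hold_active or doc.doc_id in active_hold_ids:
        raise PermissionError(f"{doc.doc_id} is under legal hold")
    if not approver:
        raise PermissionError("second approval is required before deletion")
    entry = {
        "doc_id": doc.doc_id,
        "content_sha256": hashlib.sha256(doc.content).hexdigest(),
        "deleted_at": datetime.now(timezone.utc).isoformat(),
        "authorized_by": approver,
    }
    deletion_log.append(entry)  # assumed to live in a separate, permanent store
    return entry
```

Raising an exception on a hold, rather than returning a warning, means a caller cannot proceed to deletion by ignoring the result. The hold set is checked again at deletion time because a hold may have been placed after the candidate list was generated.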
Known gotchas
Retention requirements vary significantly by jurisdiction and change over time. Regimes such as SOX and HIPAA set minimum retention periods for certain records, while GDPR and CCPA limit how long personal data may be kept; a single global retention schedule will either over-retain or under-retain. Model retention by jurisdiction and data category.
Automated deletion without legal hold cross-checks is a major litigation risk; always implement the hold-check as a hard gate, not a soft warning.
Backup and disaster recovery copies of deleted documents must also be purged within the policy window; many retention programs fail because backups are excluded from deletion scope.
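One way to model retention by jurisdiction and data category is a registry keyed on both, as in the sketch below. The jurisdiction codes `JUR_A` and `JUR_B`, the year values, and the function name `required_years` are placeholders for illustration only, not real legal requirements. The sketch covers minimum retention periods only; maximum-retention limits would need a separate check.

```python
# Hypothetical registry: (jurisdiction, category) -> minimum retention in years.
# Placeholder values only; real entries must be approved by legal counsel.
MIN_RETENTION_YEARS = {
    ("JUR_A", "hr_record"): 3,
    ("JUR_B", "hr_record"): 6,
    ("JUR_A", "financial"): 7,
}


def required_years(jurisdictions, category: str) -> int:
    """Return the longest minimum across all jurisdictions that apply.

    Fails closed: a missing rule raises instead of falling back to a default,
    so an unmapped document is never scheduled for deletion by accident.
    """
    if not jurisdictions:
        raise ValueError("document must have at least one jurisdiction")
    years = []
    for jurisdiction in jurisdictions:
        key = (jurisdiction, category)
        if key not in MIN_RETENTION_YEARS:
            raise KeyError(f"no retention rule for {key}; needs legal review")
        years.append(MIN_RETENTION_YEARS[key])
    return max(years)
```

For a document subject to both placeholder jurisdictions, `required_years(["JUR_A", "JUR_B"], "hr_record")` returns the longer of the two periods, which satisfies both minimums.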