Load reference data (training distribution) and current data (production batch) as pandas DataFrames
Create a Report with drift metrics: from evidently.report import Report; from evidently.metric_preset import DataDriftPreset; report = Report(metrics=[DataDriftPreset()])
Run the report: report.run(reference_data=reference_df, current_data=current_df)
Save the report as HTML for visual inspection with report.save_html('drift_report.html'), or extract structured results as a dict with report.as_dict()
For pipeline integration, use an Evidently TestSuite instead of a Report: from evidently.test_suite import TestSuite; from evidently.test_preset import DataDriftTestPreset; suite = TestSuite(tests=[DataDriftTestPreset()]); suite.run(reference_data=reference_df, current_data=current_df); then raise an exception if suite.as_dict()['summary']['all_passed'] is False
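The steps above can be sketched as three small functions. This is a minimal sketch, not a definitive implementation: it assumes an Evidently release that still ships the evidently.report and evidently.test_suite modules quoted in the steps (newer releases may have reorganized these import paths), and the names write_drift_report, run_drift_suite, enforce_no_drift and DriftDetectedError are hypothetical helpers introduced here for illustration.

```python
from typing import Any, Dict


class DriftDetectedError(RuntimeError):
    """Hypothetical exception type: raised when a drift test did not pass."""


def write_drift_report(reference_df, current_df,
                       html_path: str = "drift_report.html") -> Dict[str, Any]:
    """Run DataDriftPreset, save the HTML report, return the results dict."""
    # Imports are deferred so this module loads even where Evidently is
    # not installed; the paths are the ones quoted in the steps above.
    from evidently.report import Report
    from evidently.metric_preset import DataDriftPreset

    report = Report(metrics=[DataDriftPreset()])
    report.run(reference_data=reference_df, current_data=current_df)
    report.save_html(html_path)   # for visual inspection
    return report.as_dict()       # for programmatic use


def run_drift_suite(reference_df, current_df) -> Dict[str, Any]:
    """Run DataDriftTestPreset and return the suite results as a dict."""
    from evidently.test_suite import TestSuite
    from evidently.test_preset import DataDriftTestPreset

    suite = TestSuite(tests=[DataDriftTestPreset()])
    suite.run(reference_data=reference_df, current_data=current_df)
    return suite.as_dict()


def enforce_no_drift(suite_result: Dict[str, Any]) -> None:
    """Fail the pipeline step unless every test in the suite passed."""
    summary = suite_result["summary"]
    if not summary["all_passed"]:
        raise DriftDetectedError(f"Data drift tests failed: {summary}")


# Typical pipeline step, given two pandas DataFrames:
#   enforce_no_drift(run_drift_suite(reference_df, current_df))
```

Keeping the pass/fail decision in enforce_no_drift, separate from the Evidently call, means the gate can be unit-tested with a hand-written results dict and reused if the suite is built differently.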
Known gotchas
Evidently expects the reference and current DataFrames to have the same column names; columns present in one but not the other are silently excluded from drift calculations rather than raising an error
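The default drift detection method depends on column type and sample size: with up to roughly 1,000 reference rows, Evidently's documented defaults are statistical tests (Kolmogorov-Smirnov for numerical columns, chi-square for categorical ones), while larger datasets default to distance measures such as Wasserstein distance and Jensen-Shannon divergence; on small datasets the statistical tests lack power, so set stattest_threshold explicitly or choose a method appropriate for your sample size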
DataDriftPreset detects shifts in feature distributions but does not detect concept drift (a change in the relationship between features and labels); label drift or prediction drift needs a separate preset such as TargetDriftPreset, and measuring model quality needs ClassificationPreset or RegressionPreset together with ground-truth labels
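The first two gotchas can be guarded against before the report runs. The sketch below is illustrative only: column_mismatch, require_matching_columns and explicit_drift_preset are hypothetical helper names, the default values "ks", "chisquare" and 0.05 are placeholders rather than recommendations, and the preset arguments assume the same legacy Evidently API as the steps above.

```python
from typing import Dict, Iterable, List


def column_mismatch(reference_columns: Iterable[str],
                    current_columns: Iterable[str]) -> Dict[str, List[str]]:
    """List the column names that appear on only one side."""
    ref, cur = set(reference_columns), set(current_columns)
    return {
        "missing_in_current": sorted(ref - cur, key=str),
        "missing_in_reference": sorted(cur - ref, key=str),
    }


def require_matching_columns(reference_columns: Iterable[str],
                             current_columns: Iterable[str]) -> None:
    """Raise instead of letting mismatched columns drop out silently."""
    mismatch = column_mismatch(reference_columns, current_columns)
    if mismatch["missing_in_current"] or mismatch["missing_in_reference"]:
        raise ValueError(f"Column mismatch between datasets: {mismatch}")


def explicit_drift_preset(num_stattest: str = "ks",
                          cat_stattest: str = "chisquare",
                          threshold: float = 0.05):
    """Build a DataDriftPreset with the method and threshold spelled out."""
    # Deferred import: only needed when the preset is actually built.
    from evidently.metric_preset import DataDriftPreset

    return DataDriftPreset(
        num_stattest=num_stattest,
        cat_stattest=cat_stattest,
        stattest_threshold=threshold,
    )


# Typical use, given two pandas DataFrames:
#   require_matching_columns(reference_df.columns, current_df.columns)
#   preset = explicit_drift_preset()
```

Passing column names rather than whole DataFrames keeps the check independent of pandas, so it also works on a schema file or a list of expected features.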