Install Great Expectations: pip install great_expectations; initialize a project with great_expectations init
Connect a data source using the Python API or CLI, pointing to a pandas DataFrame, SQL table, or file source to get a Validator object
Build an Expectation Suite by calling expectation methods on the Validator: validator.expect_column_values_to_not_be_null('user_id'); validator.expect_column_values_to_be_between('age', min_value=0, max_value=120); save the suite with validator.save_expectation_suite()
Create a Checkpoint that bundles a Batch Request with an Expectation Suite: define the checkpoint in Python or YAML referencing the data asset and suite name
Run the Checkpoint: checkpoint_result = checkpoint.run(). This validates the data against the suite and produces Validation Results
Inspect results programmatically: checkpoint_result['success'] is True only if every expectation passed; to integrate into a pipeline, raise an exception when it is False so downstream steps do not run on bad data
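The steps above can be sketched end to end as follows. This is a minimal sketch, not a definitive implementation: it assumes Great Expectations 0.18.x with pandas installed, and the quickstart-style calls it uses (gx.get_context, context.sources.pandas_default.read_dataframe, context.add_or_update_checkpoint) belong to that release line and may differ in other versions. The names validate_users, raise_on_failure, DataValidationError, and users_checkpoint are hypothetical and chosen only for illustration.

```python
class DataValidationError(RuntimeError):
    """Hypothetical exception used to stop a pipeline when validation fails."""


def raise_on_failure(result):
    """Raise if the 'success' entry of a checkpoint result is not truthy.

    Accepts any object that supports result["success"] lookup.
    """
    if not result["success"]:
        raise DataValidationError("Great Expectations validation failed")


def validate_users(df):
    """Build a suite on a pandas DataFrame, run a checkpoint, return its result.

    Assumes Great Expectations 0.18.x. The import is local so that this
    module can be loaded even where the library is not installed.
    """
    import great_expectations as gx

    context = gx.get_context()

    # Reading the DataFrame through the default pandas data source
    # returns a Validator bound to that batch of data.
    validator = context.sources.pandas_default.read_dataframe(dataframe=df)

    # Each expect_* call adds an expectation to the Validator's suite.
    validator.expect_column_values_to_not_be_null("user_id")
    validator.expect_column_values_to_be_between("age", min_value=0, max_value=120)

    # Keep expectations in the suite even if they failed on this batch.
    validator.save_expectation_suite(discard_failed_expectations=False)

    checkpoint = context.add_or_update_checkpoint(
        name="users_checkpoint",  # hypothetical checkpoint name
        validator=validator,
    )
    return checkpoint.run()
```

In a pipeline step, this might be used as raise_on_failure(validate_users(df)), so that a failed validation stops the run before bad data reaches downstream tasks.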
Known gotchas
Expectations are defined against a batch of data at suite-build time; if the schema of the data source changes (column renamed, type changed) between suite creation and validation, expectations that reference the old schema fail at validation time, and the suite is not updated automatically
The fluent data source API used in Great Expectations 0.18 (context.sources.add_pandas_filesystem, datasource.add_csv_asset, etc.) differs from the legacy DataContext.get_batch API in earlier versions; mixing API styles in the same project can cause runtime errors, so pin the library version and use one style throughout
Checkpoints update Data Docs by default after each validation, which writes HTML files; in high-frequency pipeline runs this I/O overhead can be significant. To disable automatic Data Docs updates, remove the update_data_docs action from the checkpoint's action list
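Two of the gotchas above lend themselves to small guard helpers, sketched here in plain Python. Both function names are hypothetical and neither is part of Great Expectations. The first reports columns a suite expects but the incoming data no longer has, so schema drift can be caught before validation runs. The second drops the Data Docs action from a checkpoint action list, assuming the list uses the 0.x layout of dicts with a name and an action.class_name entry.

```python
from typing import Any, Dict, Iterable, List


def missing_columns(expected: Iterable[str], actual: Iterable[str]) -> List[str]:
    """Return, in sorted order, the expected columns absent from the data."""
    return sorted(set(expected) - set(actual))


def without_data_docs_action(action_list: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return a new action list without the Data Docs update action.

    Assumes each entry looks like
    {"name": "...", "action": {"class_name": "..."}}.
    The input list is not modified.
    """
    return [
        entry
        for entry in action_list
        if entry.get("name") != "update_data_docs"
        and entry.get("action", {}).get("class_name") != "UpdateDataDocsAction"
    ]


# Example input assumed to mirror part of a 0.x checkpoint action list.
example_actions = [
    {"name": "store_validation_result",
     "action": {"class_name": "StoreValidationResultAction"}},
    {"name": "update_data_docs",
     "action": {"class_name": "UpdateDataDocsAction"}},
]
trimmed_actions = without_data_docs_action(example_actions)
```

The trimmed list could then be passed as the action list when the checkpoint is defined, and missing_columns could be called with the suite's column names and the DataFrame's columns before checkpoint.run().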