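Create a pipeline config file with a .conf extension containing three blocks: input {}, filter {}, and output {}. Place the file in your pipeline config directory (commonly /etc/logstash/conf.d on package installs) and point to it with path.config under a pipeline.id entry in pipelines.yml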
In the input block, choose the plugin that matches your source: file for log files (it tracks read offsets in a sincedb file), beats for Filebeat agents, kafka for Kafka topics, or tcp/udp for syslog traffic
In the filter block, apply grok { match => { 'message' => 'PATTERN' } } to extract fields from unstructured text using named captures of the form %{SYNTAX:field_name}; follow it with date { match => ['timestamp', 'ISO8601'] } to parse the extracted timestamp field and write the result to @timestamp
Use the mutate filter to rename fields, remove them, or convert their types; use geoip to enrich IP address fields with location data; use kv to split key=value strings into fields automatically
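In the output block, send events to elasticsearch { hosts => ['https://YOUR_ES_HOST:9200'] index => 'myapp-%{+YYYY.MM.dd}' }. To avoid silent data loss, set dead_letter_queue.enable: true in logstash.yml (the dead letter queue is a Logstash setting, not an output plugin) so that events Elasticsearch rejects are kept on disk, where the dead_letter_queue input plugin can read them back for reprocessing; a conditional file output can additionally capture events that failed parsing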
Run Logstash with --config.test_and_exit (short form -t), together with -f pointing at the config file, to validate the pipeline syntax before starting the full process
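The steps above can be sketched as a single pipeline file. This is a minimal illustration, not a reference configuration: the file name myapp.conf, the sample log line, the grok pattern, and the field names (timestamp, level, client_ip, kv_pairs, status, geo) are all assumptions chosen for the example, and YOUR_ES_HOST is the placeholder from the steps.

```conf
# myapp.conf: hypothetical pipeline. Port, pattern, and field names are assumptions.
input {
  # Receive events from Filebeat agents (5044 is the conventional Beats port).
  beats {
    port => 5044
  }
}

filter {
  # Assumed log line shape:
  # 2024-05-01T12:00:00Z INFO 203.0.113.7 user=alice status=200
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{IP:client_ip} %{GREEDYDATA:kv_pairs}" }
  }

  # Parse the extracted timestamp and write it to @timestamp.
  date {
    match => ["timestamp", "ISO8601"]
  }

  # Split the trailing key=value pairs into individual fields.
  kv {
    source => "kv_pairs"
  }

  # Enrich the client address with location data.
  # The target is set explicitly; some versions require it for this kind of source field.
  geoip {
    source => "client_ip"
    target => "geo"
  }

  # Tidy up: rename, convert types, and drop intermediate fields.
  mutate {
    rename => { "level" => "log_level" }
    convert => { "status" => "integer" }
    remove_field => ["timestamp", "kv_pairs"]
  }
}

output {
  elasticsearch {
    hosts => ["https://YOUR_ES_HOST:9200"]
    index => "myapp-%{+YYYY.MM.dd}"
  }
}
```

Validating this file with --config.test_and_exit checks only that the syntax parses; it does not confirm that the grok pattern matches your real log lines, so test the pattern against sample data separately.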
Known gotchas
Grok does not raise an error when a pattern fails to match; it only adds _grokparsefailure to the event's tags field and passes the event on unparsed. Always route events tagged _grokparsefailure to a separate index or output so you can diagnose pattern mismatches
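Each pipeline worker is a single thread that runs the filter and output stages. pipeline.workers already defaults to the number of CPU cores on the host, so keep it near the core count for CPU-bound filter workloads; raising it above that mainly helps when workers spend time waiting on I/O. Strict event ordering requires a single worker: with pipeline.ordered: auto (the default) ordering is preserved only when pipeline.workers is 1, and pipeline.ordered: true prevents Logstash from starting with more than one worker, so ordering comes at the cost of parallelism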
The date filter must succeed for @timestamp to reflect the actual event time. If it fails, the event is tagged _dateparsefailure and @timestamp keeps the time Logstash first received the event, so time-ordered queries place the event at ingest time rather than at the time it occurred
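One way to act on the parse-failure gotchas above is a conditional in the output block that diverts tagged events. This is a sketch under assumptions: the index name myapp-failures is hypothetical, and YOUR_ES_HOST is the same placeholder used in the steps.

```conf
output {
  # Events that grok or date could not parse carry these tags.
  if "_grokparsefailure" in [tags] or "_dateparsefailure" in [tags] {
    # Hypothetical failure index, kept apart so mismatches are easy to inspect.
    elasticsearch {
      hosts => ["https://YOUR_ES_HOST:9200"]
      index => "myapp-failures-%{+YYYY.MM.dd}"
    }
  } else {
    elasticsearch {
      hosts => ["https://YOUR_ES_HOST:9200"]
      index => "myapp-%{+YYYY.MM.dd}"
    }
  }
}
```

Keeping failed events in their own index means the original message field is still available for refining the pattern, while the main index contains only events with a trustworthy @timestamp.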