Steps

Create a pipeline config file with a .conf extension containing three blocks: input {}, filter {}, and output {}; place the file in your pipeline config directory and point path.config at it from an entry (with its own pipeline.id) in pipelines.yml
In the input block choose a plugin matching your source: file for log files (with sincedb for offset tracking), beats for Filebeat agents, kafka for Kafka topics, or tcp/udp for syslog
In the filter block apply grok { match => { 'message' => 'PATTERN' } } to extract fields from unstructured text using named captures; follow with date { match => ['timestamp', 'ISO8601'] } to parse and promote the log timestamp to @timestamp
Use the mutate filter to rename, remove, or convert field types; use geoip to enrich IP fields; use kv to parse key=value strings automatically
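In the output block send to elasticsearch { hosts => ['https://YOUR_ES_HOST:9200'] index => 'myapp-%{+YYYY.MM.dd}' }; to avoid silent data loss, enable the dead letter queue (dead_letter_queue.enable: true in logstash.yml; it is a Logstash setting, not an output plugin) so events that Elasticsearch rejects are kept on disk, or add a file output for events tagged as failed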
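Run bin/logstash -f YOUR_PIPELINE.conf --config.test_and_exit to validate the pipeline config syntax before starting the full process; this checks syntax only and does not verify that grok patterns match your data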
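
The steps above can be sketched as a single pipeline file. This is a minimal illustration, not a reference config: the Beats port, the sample log layout, the grok pattern, and the field names (timestamp, level, client_ip, bytes, kv_pairs, geo, log_level) are all assumptions chosen for the example, and YOUR_ES_HOST is the placeholder from the steps.

```conf
# myapp.conf (hypothetical file name)
# Assumes lines shaped like:
#   2024-05-01T12:00:00Z INFO 203.0.113.7 512 user=alice action=login

input {
  # Receive events from Filebeat agents; 5044 is the conventional Beats port
  beats {
    port => 5044
  }
}

filter {
  # Extract named fields from the unstructured line
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{IP:client_ip} %{NUMBER:bytes} %{GREEDYDATA:kv_pairs}" }
  }

  # Promote the parsed log time to @timestamp; the temporary field is
  # dropped only when parsing succeeds
  date {
    match => ["timestamp", "ISO8601"]
    remove_field => ["timestamp"]
  }

  # Split the trailing "key=value key=value" text into separate fields
  kv {
    source => "kv_pairs"
    remove_field => ["kv_pairs"]
  }

  # Enrich the client address with location data
  geoip {
    source => "client_ip"
    target => "geo"
  }

  # Rename a field and convert a captured string to a number
  mutate {
    rename => { "level" => "log_level" }
    convert => { "bytes" => "integer" }
  }
}

output {
  elasticsearch {
    hosts => ["https://YOUR_ES_HOST:9200"]
    index => "myapp-%{+YYYY.MM.dd}"
  }
}
```

Placing remove_field inside date and kv, rather than in a later mutate, means the raw values survive on events where parsing fails, which makes those events easier to debug.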

Known gotchas

A grok pattern that does not match raises no error: the event continues through the pipeline unparsed, with _grokparsefailure added to its tags field; always route events tagged with _grokparsefailure to a separate index or output so you can diagnose pattern mismatches
Each pipeline worker thread runs the filter and output stages for its own batch of events; pipeline.workers already defaults to the number of CPU cores, so raise it in logstash.yml only when workers spend their time waiting on I/O rather than CPU, and be aware that preserving event order requires a single worker (pipeline.workers: 1 with pipeline.ordered set to auto or true), which removes parallelism
The date filter must succeed for @timestamp to reflect the actual event time; if it fails, the event is tagged _dateparsefailure and @timestamp stays at ingest time, causing time-ordered queries to return out-of-order results
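
One way to act on the first and third gotchas is to branch in the output block on the failure tags. The sketch below assumes the default failure tags (_grokparsefailure for grok, _dateparsefailure for date) have not been overridden with tag_on_failure; the index name myapp-parse-failures is a hypothetical choice.

```conf
output {
  if "_grokparsefailure" in [tags] or "_dateparsefailure" in [tags] {
    # Events that could not be parsed go to their own index for diagnosis
    elasticsearch {
      hosts => ["https://YOUR_ES_HOST:9200"]
      index => "myapp-parse-failures-%{+YYYY.MM.dd}"
    }
  } else {
    # Cleanly parsed events go to the main daily index
    elasticsearch {
      hosts => ["https://YOUR_ES_HOST:9200"]
      index => "myapp-%{+YYYY.MM.dd}"
    }
  }
}
```

Keeping failures out of the main index also prevents events stamped with ingest time from mixing with correctly timestamped ones in time-ordered queries.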

elastic.co · 6 steps · unrated



Set up a Logstash ingest pipeline with inputs, filters, and outputs
