When building your logstash filter, you would often like to validate your assumptions on a large sampling of input events without sending all the output to ElasticSearch.
Using Logstash metrics and conditionals, we can easily show:
- How many input events were processed successfully
- How many input events had errors
- An error file containing each event that processed in error
This technique gives you the ability to track your success rate across a large input set, and then do a postmortem review of each event that failed.
I’ll walk you through a Logstash conf file that illustrates this concept.
input { stdin { type => "testevents" } } # input end filter { # expecting a name and age on each line separated by space, e.g.: grok { match => { "message" => "%{DATA:name} %{NUMBER:age}" } } # total event count metrics { meter => [ "total_%{type}" ] add_tag => "metric" } # metrics # failure count if "_grokparsefailure" in [tags] { metrics { meter => [ "problem_%{type}" ] add_tag => "metric" } # metrics } # if grok failure } # filter end output { # write any problems to file for later review if "_grokparsefailure" in [tags] { file { path => "/tmp/errors.log" } } if "metric" in [tags] { # write progress to file for tail in real-time file { path => "/tmp/metrics.log" } }else { # true events can be routed wherever necessary stdout { codec => rubydebug } } } # output end
I created this file at /opt/logstash/testmetrics.conf, so that I can easily start logstash from the console:
cd /opt/logstash bin/logstash -f testmetrics.conf
Once you see the message “Pipeline main started”, then go ahead and type each of the lines below:
moe 52 larry 51 curly
This will produce output like below. You will notice that “moe 52” and “larry 51” both parse properly, but our last line of input where we only typed “curly” without an age produces a grokparsefailure.
Pipeline main started moe 52 { "message" => "moe 52", "@version" => "1", "@timestamp" => "2016-07-18T02:13:46.986Z", "type" => "testevents", "host" => "myhost", "name" => "moe", "age" => "52" } larry 41 { "message" => "larry 41", "@version" => "1", "@timestamp" => "2016-07-18T02:13:50.433Z", "type" => "testevents", "host" => "myhost", "name" => "larry", "age" => "41" } curly { "message" => "curly", "@version" => "1", "@timestamp" => "2016-07-18T02:13:53.161Z", "type" => "testevents", "host" => "myhost", "tags" => [ [0] "_grokparsefailure" ] }
Now look at the output that went to /tmp/errors.log, and you will see the exact message sent for the input event that failed:
{"message":"curly","@version":"1","@timestamp":"2016-07-18T02:13:53.161Z","type":"testevents","host":"myhost","tags":["_grokparsefailure"]}
And now look at the output that went to /tmp/metrics.log, and you can see the total number of events process was “count:3”
{“@version”:”1″,”@timestamp”:”2016-07-18T02:13:58.914Z”,”message”:”myhost”,“total_testevents”:{“count”:3,”rate_1m”:0.38400888292586466,”rate_5m”:0.3966942907643235,”rate_15m”:0.3988919696009794},”tags”:[“metric”]}
While the total number of failures was “count:1”
{“@version”:”1″,”@timestamp”:”2016-07-18T02:13:58.915Z”,”message”:”myhost”,“problem_testevents”:{“count”:1,”rate_1m”:0.2,”rate_5m”:0.2,”rate_15m”:0.2},”tags”:[“metric”]}