vRealize Log Insight: Creating your own content pack for field extraction

Content Packs are plugins that allow you to create pre-packaged knowledge about specific event types.

For example, you can create a content pack that knows how to extract fields from one of your custom log sources.  Beyond extracted fields, you can also add saved queries, aggregations, alerts, dashboards, and visualizations.

Incoming Events from Agent

First, let’s examine our sample log file on the agent side, /tmp/test.log.

2016-07-14 22:04:13.233 INFO  com.my.myTest      - [  150] 200

You can see that it has a timestamp followed by an uppercase log level, then multiple spaces followed by the Java class name, then more padding spaces followed by a hyphen.  It ends with what looks like a process or thread id in square brackets, and finally an HTTP return code.

We will modify liagent.ini with the following section to start sending events to the server:

[filelog|test]
directory=/tmp
include=test.log
tags={ "subtype":"test" }

The only thing not completely vanilla here is the tags key.  This is used to send extra metadata about this event, and is the way we will identify that we want to apply our custom field extraction logic to an incoming event.
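If you do not have an application producing this log yet, a small script can append lines in the same shape so the agent has something to pick up.  This is only a sketch: the function names are made up for illustration, and the column widths (5, 18, and 5) are guessed from the sample line rather than taken from a real logging configuration.

```python
from datetime import datetime


def format_test_line(ts, level, classname, thread_id, http_code):
    """Build one line shaped like the sample in /tmp/test.log."""
    # Millisecond timestamp, e.g. 2016-07-14 22:04:13.233
    stamp = ts.strftime("%Y-%m-%d %H:%M:%S.") + f"{ts.microsecond // 1000:03d}"
    # Padding widths are assumptions inferred from the sample line.
    return f"{stamp} {level:<5} {classname:<18} - [{thread_id:>5}] {http_code}"


def append_test_line(path, line):
    """Append a single event line to the log file watched by the agent."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(line + "\n")


if __name__ == "__main__":
    print(format_test_line(datetime.now(), "INFO", "com.my.myTest", 150, 200))
```

Calling append_test_line("/tmp/test.log", ...) with the formatted line should produce an event the [filelog|test] section picks up.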

Boilerplate for Content Pack File

A Content Pack is just a single text file with a .vlcp extension, whose text is in JSON format.

{ 
  "name":"test", 
  "namespace":"my.test", 
  "contentPackId":"my.test", 
  "framework":"#9c4", 
  "version":"1.0", 
  "author":"me", 
  "url":"https://fabianlee.org", 
  "contentVersion":"1.0",   
  "info":"<span>Loads the test log</span>", 
  "queries":[ ], 
  "alerts":[ ], 
  "dashboardSections":[ ], 
  "extractedFields":[ ]
}

The section we are particularly interested in is ‘extractedFields’.  This is a JSON array where we can add our individual field definitions.  Each field definition will follow this form:

{
 "displayName":"test.myfield1",
 "internalName":"test.myfield1",
 "preContext":"",
 "postContext":"",
 "regexValue":"\\S+",
 "constraints":"{\"filters\":[{\"internalName\":\"subtype\",\"displayName\":\"subtype\",\"operator\":\"CONTAINS\",\"value\":\"test\",\"fieldType\":\"STRING\",\"isExtracted\":true,\"hidden\":false}]}",
 "info":null
 }

You decide the field name, then add any preContext and postContext regex that helps narrow down the location of the field, and finally the regexValue that captures the value you want.  The constraints section checks whether the incoming event’s subtype field contains “test”, which ties directly back to the subtype tag we configured earlier on the agent.
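Note that the constraints value is itself a JSON document stored inside a JSON string.  The sketch below decodes it and applies the filter to an event's fields, just to make the intent concrete.  The function name is hypothetical, and treating CONTAINS as a plain substring test with all filters required to pass is my assumption, not documented Log Insight behavior.

```python
import json

# The constraints string from the field definition, after the outer JSON layer is decoded.
CONSTRAINTS = (
    '{"filters":[{"internalName":"subtype","displayName":"subtype",'
    '"operator":"CONTAINS","value":"test","fieldType":"STRING",'
    '"isExtracted":true,"hidden":false}]}'
)


def matches_constraints(constraints, event_fields):
    """Return True if every filter in the constraints string accepts the event."""
    for flt in json.loads(constraints)["filters"]:
        if flt["operator"] != "CONTAINS":
            raise NotImplementedError("only CONTAINS is sketched here")
        # Assumption: CONTAINS behaves like a simple substring check.
        actual = str(event_fields.get(flt["internalName"], ""))
        if flt["value"] not in actual:
            return False
    return True


if __name__ == "__main__":
    print(matches_constraints(CONSTRAINTS, {"subtype": "test"}))
    print(matches_constraints(CONSTRAINTS, {"subtype": "other"}))
```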

When deciding upon preContext, postContext, and regexValue it is important to note that each field independently evaluates the regex against the entire event string.  In other words, the order in which you place your field definitions does not matter – it will not look for the second field definition starting at the end of where the first field was found.  You have to define the absolute placement of each field.

Regex definitions

Outside the world of content packs, let’s look at the regex we will need to parse out incoming event lines.  Here is an example input line:

2016-07-14 22:04:13.233 INFO  com.my.myTest      - [  150] 200

Using an online regex tester, you can build the regex below to match the line:

^\S+ \S+ \S+\s+\S+\s+\-\s\[\s+\d+\]\s+\d+

This makes our field extraction regex look like this:

  • level, STRING
    • preContext = ^\S+ \S+  (the pattern ends with a single trailing space)
    • regexValue = \S+
  • classname, STRING
    • preContext = ^\S+ \S+ \S+\s+
    • regexValue = \S+
  • threadname, NUMBER
    • preContext = ^\S+ \S+ \S+\s+\S+\s+\-\s\[\s+
    • regexValue = \d+
  • httpResponseCode, NUMBER
    • preContext = ^\S+ \S+ \S+\s+\S+\s+\-\s\[\s+\d+\]\s+
    • regexValue = \d+

My log files rarely contain static signposts that can serve as effective preContext or postContext values.  Because each field is evaluated independently with no regard to order, as discussed earlier, each preContext above is effectively the accumulated pattern of all the fields that precede it.
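Before moving these patterns into a content pack, it can help to try them locally.  The sketch below approximates the extraction by concatenating preContext, a capture group around regexValue, and postContext, then searching the whole event.  That concatenation is my assumption about how the pieces fit together, not Log Insight's actual implementation, and the helper names are invented for the example.

```python
import re

EVENT = "2016-07-14 22:04:13.233 INFO  com.my.myTest      - [  150] 200"

# (field name, preContext, regexValue, postContext)
FIELDS = [
    ("test.level", r"^\S+ \S+ ", r"\S+", ""),
    ("test.classname", r"^\S+ \S+ \S+\s+", r"\S+", ""),
    ("test.threadname", r"^\S+ \S+ \S+\s+\S+\s+\-\s\[\s+", r"\d+", ""),
    ("test.httpResponseCode", r"^\S+ \S+ \S+\s+\S+\s+\-\s\[\s+\d+\]\s+", r"\d+", ""),
]


def extract(event, pre, value, post):
    """Evaluate one field definition against the entire event string."""
    match = re.search(pre + "(" + value + ")" + post, event)
    return match.group(1) if match else None


def extract_all(event, fields):
    """Each field is evaluated on its own, so the order of definitions is irrelevant."""
    return {name: extract(event, pre, value, post) for name, pre, value, post in fields}


if __name__ == "__main__":
    for name, value in extract_all(EVENT, FIELDS).items():
        print(name, "=", value)
```

Reversing FIELDS gives the same result, which mirrors the point that every field must describe its absolute position in the event.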

Custom Content Pack

Now we just need to plug this field information into our custom content pack, test.vlcp.  Let’s take the first field as an example:

{ 
    "displayName":"test.level", 
    "internalName":"test.level", 
    "preContext":"^\\S+ \\S+ ", 
    "postContext":"", 
    "regexValue":"\\S+", 
    "constraints":"{\"filters\":[{\"internalName\":\"subtype\",\"displayName\":\"subtype\",\"operator\":\"CONTAINS\",\"value\":\"test\",\"fieldType\":\"STRING\",\"isExtracted\":true,\"hidden\":false}]}", 
    "info":null 
  }

The one thing you might notice is that in the preContext and regexValue we have had to escape each backslash character, so every backslash becomes a double backslash.  If we did not do this, the file would not be valid JSON, because sequences such as \S are not legal JSON string escapes.
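One way to avoid hand-escaping is to let a JSON library do it.  The sketch below builds the same field definition from raw Python strings and serializes it, which doubles the backslashes and escapes the quotes of the nested constraints string automatically.  The variable names are mine, and the output is meant to be pasted into the extractedFields array rather than used as a complete content pack.

```python
import json

# The nested constraints document, written as a normal Python structure.
constraints = {
    "filters": [
        {
            "internalName": "subtype",
            "displayName": "subtype",
            "operator": "CONTAINS",
            "value": "test",
            "fieldType": "STRING",
            "isExtracted": True,
            "hidden": False,
        }
    ]
}

field = {
    "displayName": "test.level",
    "internalName": "test.level",
    "preContext": r"^\S+ \S+ ",  # raw string: single backslashes, as in the regex tester
    "postContext": "",
    "regexValue": r"\S+",
    # constraints is stored as a JSON string inside the JSON document
    "constraints": json.dumps(constraints, separators=(",", ":")),
    "info": None,
}

field_json = json.dumps(field, indent=2)
print(field_json)
```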

I would strongly suggest you run your final JSON through a lint program to verify the syntax before you attempt an upload into Log Insight.
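If you do not have a lint tool handy, Python's standard json module is enough for a basic syntax check.  This is a minimal sketch with a hypothetical function name.  It only verifies that the file and each nested constraints string parse as JSON, and it knows nothing about which keys Log Insight actually requires.

```python
import json


def lint_content_pack(text):
    """Return a list of problems found in the content pack text (empty if none)."""
    try:
        pack = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"]
    if not isinstance(pack, dict):
        return ["top level of the content pack is not a JSON object"]

    problems = []
    for field in pack.get("extractedFields", []):
        name = field.get("internalName", "<unnamed>")
        try:
            # constraints is a JSON document embedded in a string
            json.loads(field.get("constraints") or "{}")
        except json.JSONDecodeError as err:
            problems.append(f"{name}: constraints is not valid JSON ({err.msg})")
    return problems


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], encoding="utf-8") as fh:
        for problem in lint_content_pack(fh.read()):
            print(problem)
```

For a quick check without any script, running python -m json.tool test.vlcp also reports syntax errors such as an unescaped backslash.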

Here is a link to the full test.vlcp, which contains all the field definitions.

Log Insight Web Interface

The last step is to upload test.vlcp into Log Insight.  If you click on the top-right icon beside your login name and select “Content Packs”, you can see all the content packs currently in use.

At the bottom left there is an “Import Content Pack” button.  Press it, upload the .vlcp file, and install the content pack so that it is visible to all users.

Now when events arrive from /tmp/test.log, they should look something like the example below.  Log Insight automatically extracts the fields and their types from each incoming event, so you can run proper searches on subtype, test.level, test.classname, test.threadname, and test.httpResponseCode.

[Screenshot: Log Insight field table showing the extracted fields]

REFERENCE

Log Insight 3.3 docs, Creating Content Packs
https://pubs.vmware.com/log-insight-33/topic/com.vmware.log-insight.user.doc/GUID-D7163092-0617-48D5-A41F-8020AD99DB8F.html

Log Insight and timestamps
http://sflanders.net/2014/07/16/time-log-insight-events-timestamps-queries/

Latest parsers
http://sflanders.net/2015/10/29/log-insight-3-0-agents-timestamp-parser/

Managing fields
http://sflanders.net/2013/11/07/managing-fields-log-insight/

Content Pack marketplace
https://solutionexchange.vmware.com/store