Overriding the source

Some applications write a separate log for each session, conversation, or transaction. One problem this introduces is an explosion of source values. Every unique value of source is recorded, one per line, in $SPLUNK_HOME/var/lib/splunk/*/db/Sources.data. This file eventually grows very large, and Splunk wastes a lot of time updating it, which causes otherwise unexplained pauses. A new setting in indexes.conf, called disableGlobalMetadata, can eliminate this problem. Flattening the value of source is another way to address it.

To flatten this value, we could use a stanza like this in transforms.conf:

[myapp_flatten_source]
SOURCE_KEY = MetaData:Source
DEST_KEY = MetaData:Source
REGEX = (.*session_).*\.log
FORMAT = source::$1x.log

This would set the value of source to /logs/myapp.session_x.log for every session log, which would eliminate our growing source problem. If the value of session is useful, the transform in the "Creating a session field from source" section could be run before this transform to capture the value. Likewise, a transform could capture the entire value of the source and place it into a different metadata field.
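The effect of the flattening pattern can be checked outside Splunk. The following is a minimal sketch in Python, not Splunk itself: it applies the pattern from the stanza (with the literal dot before "log" escaped) to a few hypothetical source paths and shows that they collapse to a single value. The sample paths and the flatten_source helper are assumptions made for illustration, and Python's re module only approximates the regular expression engine Splunk uses.

```python
import re

# The pattern from the stanza, with the literal dot before "log" escaped.
PATTERN = re.compile(r"(.*session_).*\.log")


def flatten_source(source: str) -> str:
    """Mimic FORMAT = source::$1x.log, without the source:: prefix.

    Returns the original value unchanged when the pattern does not match,
    which mirrors a transform that simply does not fire.
    """
    match = PATTERN.search(source)
    if match is None:
        return source
    return f"{match.group(1)}x.log"


# Hypothetical source values, one per session, plus one unrelated log.
sources = [
    "/logs/myapp.session_foo.log",
    "/logs/myapp.session_bar.log",
    "/logs/other.log",
]

flattened = sorted({flatten_source(s) for s in sources})
print(flattened)  # ['/logs/myapp.session_x.log', '/logs/other.log']
```

Two distinct session logs end up sharing one source value, while a path that does not contain session_ is left alone.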

A huge number of logfiles on a filesystem introduces a few problems of its own, including running out of inodes and increased memory use by the Splunk process that tracks all of the files. As a general rule, a cleanup process should be designed to archive older logs.
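One possible shape for such a cleanup process is sketched below in Python. It is an illustration under stated assumptions, not a recommended production script: the archive_old_logs function, the seven-day threshold, and the *.log pattern are all hypothetical. It bundles old files into a single compressed tar archive, so that many small files become one, and removes the originals only after the archive has been written.

```python
import tarfile
import time
from pathlib import Path


def archive_old_logs(log_dir, archive_path, max_age_days=7, pattern="*.log"):
    """Bundle logs older than max_age_days into one .tar.gz, then delete them.

    Returns the names of the files that were archived. An existing file at
    archive_path is overwritten, so callers would normally use a dated name.
    """
    cutoff = time.time() - max_age_days * 86400
    old_files = [
        p for p in sorted(Path(log_dir).glob(pattern))
        if p.is_file() and p.stat().st_mtime < cutoff
    ]
    if not old_files:
        return []

    # Write the whole archive first; originals are untouched if this fails.
    with tarfile.open(str(archive_path), "w:gz") as tar:
        for p in old_files:
            tar.add(str(p), arcname=p.name)

    for p in old_files:
        p.unlink()
    return [p.name for p in old_files]
```

A scheduled job could call this with a dated archive name, for example archive_old_logs("/logs", "/archive/myapp-logs.tar.gz"), where both paths are placeholders. The age threshold should be long enough that Splunk has finished reading a file before it is removed.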