There are a number of factors that affect how many Splunk indexers you will need, but starting with a model system with typical usage levels, the short answer is 100 gigabytes of raw logs per day per indexer. In the vast majority of cases, disk I/O is the performance bottleneck; the exception is machines with very slow processors.
The measurements that follow assume you will spread events across your indexers evenly, using the autoLB feature of the Splunk forwarder. We will talk more about this in the Indexer load balancing section.
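If you have not set up load balancing yet, a minimal outputs.conf sketch on a forwarder might look like the following. The group name and indexer hostnames are placeholders, and the exact settings can vary by Splunk version:

```
# outputs.conf on each forwarder -- group name and hostnames are examples only
[tcpout]
defaultGroup = my_indexers

[tcpout:my_indexers]
# autoLB spreads events across all of the listed indexers
autoLB = true
server = indexer1.example.com:9997, indexer2.example.com:9997
```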
The model system looks like this:
- 8 gigabytes of RAM.
- If more memory is available, the operating system will use whatever Splunk does not use for the disk cache.
- Eight fast physical processors. On a busy indexer, two cores will probably be busy most of the time, handling indexing tasks. It is worth noting the following:
- More processors won't hurt, but they probably won't make much difference to an individual indexer, as the disks holding the indexes will not be able to keep up with the increased search load. More indexers, each with its own disks, will have more impact.
- Virtualized slices of cores or oversubscribed virtual hosts do not work well, as the processor is actually used heavily during search, mostly decompressing the raw data.
- Slow cores designed for highly threaded applications do not work well. For instance, you should avoid older Sun SPARC processors or slices of cores on AIX boxes.
- Disks performing 800 random IOPS (input/output operations per second). This is the value considered fast by Splunk engineering. Query your favorite search engine for Splunk bonnie++ for discussions on how to measure this value. The most important thing to remember when testing your disks is that you must test with enough data to defeat the disk cache. Remember that, if you are using shared disks, the indexers will share the available IOPS.
- No more than four concurrent searches. Please note the following:
- Most queries finish very quickly
- This count includes interactive queries and saved searches
- Summary indexes and saved searches can be used to reduce the workload of common queries (see the sketch after this list)
- Summary queries are simply saved searches
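To illustrate that last point, a summary-indexing search is simply a scheduled saved search with the summary index action enabled. The savedsearches.conf stanza below is a sketch only; the stanza name, source index, sourcetype, and schedule are invented for the example, while summary is the default summary index that ships with Splunk:

```
# savedsearches.conf -- illustrative summary-indexing search
[summary - web hits per hour]
search = index=web sourcetype=access_combined | sistats count by host
dispatch.earliest_time = -1h@h
dispatch.latest_time = @h
cron_schedule = 7 * * * *
enableSched = 1
# write the results of each scheduled run into the summary index
action.summary_index = 1
action.summary_index._name = summary
```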
To test your concurrency on an existing installation, try this query:
index=_audit search_id action=search | transaction maxpause=1h search_id | concurrency duration=duration | timechart span="1h" avg(concurrency) max(concurrency)
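On a busy installation, the _audit index can be large, so you may not want to run this over all time. A variant limited to the last seven days (the window is only an example) might look like this:

```
index=_audit search_id action=search earliest=-7d@d
| transaction maxpause=1h search_id
| concurrency duration=duration
| timechart span="1h" avg(concurrency) max(concurrency)
```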
A formula for a rough estimate (assuming eight fast processors and 8 gigabytes of RAM per indexer) might look like this:
indexers needed = [your IOPS] / 800 * [gigs of raw logs produced per day] / 100 * [average concurrent queries] / 4
The behavior of your systems, network, and users makes it impossible to reliably predict performance without testing. These numbers are a rough estimate at best.
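If you would rather let Splunk do the arithmetic, the formula translates directly into an eval. This sketch assumes a Splunk version that has the makeresults command; the input values are the model-system defaults, which you would replace with your own measurements:

```
| makeresults
| eval iops=800, gb_per_day=100, concurrent_queries=4
| eval indexers_needed = (iops / 800) * (gb_per_day / 100) * (concurrent_queries / 4)
| table iops gb_per_day concurrent_queries indexers_needed
```

With the model-system defaults, this returns exactly 1, which matches the 100 gigabytes per day rule of thumb.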
Let's say you work for a mid-sized company producing about 80 gigabytes of logs per day. You have some very active users, so you might expect four concurrent queries on average. You have good disks, which bonnie++ has shown to pull a sustained 950 IOPS. You are also running some fairly heavy summary indexing queries against your web logs, and you expect at least one to be running pretty much all the time. This gives us the following output:
950/800 IOPS * 80/100 gigs * (1 concurrent summary query + 4 concurrent user queries) / 4
= 1.1875 indexers
You cannot really deploy 1.1875 indexers, so your choices are either to start with one indexer and see how it performs or to go ahead and start with two indexers.
My advice would be to start with two indexers, if possible. This gives you some fault tolerance, and installations tend to grow quickly as more data sources are discovered throughout the company. When you cross the 100-gigabyte mark, it may make sense to start with three indexers and spread the disks across them. The extra capacity gives you the ability to take one indexer down and still have enough capacity to cover the normal load. See the discussion in the Planning redundancy section.
If we increase the number of average concurrent queries, increase the amount of data indexed per day, or decrease our IOPS, the number of indexers needed should scale more or less linearly.
If we scale up a bit more, say 120 gigabytes a day, five concurrent user queries, and two summary queries running on average, we grow as follows:
950/800 IOPS * 120/100 gigs * (2 concurrent summary queries + 5 concurrent user queries) / 4
≈ 2.5 indexers
Three indexers would cover this load, but if one indexer is down, we will struggle to keep up with the data from the forwarders. Ideally, in this case, we should have four or more indexers.