Using Splunk for Data Analysis

Splunk is an enterprise platform for analyzing and monitoring a wide variety of data — application logs, web server logs, clickstream data, message queues, OS system metrics, sensor data, syslog, Windows events, and web proxy logs — in many supported formats. Splunk provides a simple but powerful interface for quickly getting insight out of contextual data. In this post, I will showcase the power of data exploration using Splunk.


To analyze the data, it must first be loaded into Splunk. I have downloaded a sample of Apache web server logs. The sample contains events time-stamped over the previous 7 days.

To start, upload the Apache logs into Splunk as shown below:

Upload data into Splunk


Add data into Splunk




Follow the wizard steps. This will provide you with the search/query screen where you can do a detailed analysis over the data.
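Once the data is indexed, you can query it directly from the search bar. As a sketch — assuming the uploaded file was indexed with Splunk's built-in `access_combined` sourcetype for Apache combined-format logs — the following search lists the most recent events:

```spl
sourcetype=access_combined
| head 20
```

With this sourcetype, Splunk automatically extracts fields such as `clientip`, `status`, `uri_path`, and `referer` from each event, which the later examples rely on.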

Here are some of the patterns that I derived out of the data:

1. Overall traffic pattern: Splunk generates the overall pattern of traffic to the website by default.

Overall Traffic Pattern


The pattern shown covers multiple days, but you can narrow it to a single day using the "date time range" picker.
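The default traffic chart can also be reproduced explicitly with a `timechart` search. A sketch, assuming the `access_combined` sourcetype (adjust `span` to `1d` for a day-wise view):

```spl
sourcetype=access_combined
| timechart span=1h count AS requests
```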


You can explore queries on more fields by clicking the “All Fields” link on the left.
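Clicking a field in the sidebar is roughly equivalent to running a `top` or `stats` search on it. For example, assuming the standard `status` field extraction, this sketch shows the most common HTTP response codes:

```spl
sourcetype=access_combined
| top limit=10 status
```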


Multiple source files can be consolidated for a comprehensive analysis. Upload a new log file and perform a similar operation, as shown below:
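Assuming the second file was uploaded with the same sourcetype, a single search can then span both sources. The file names below are placeholders for your own uploads:

```spl
sourcetype=access_combined (source="access_log_1" OR source="access_log_2")
| timechart span=1d count BY source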


2. Specific section (category) access pattern: Splunk extracts details from individual line items in the input file. For example, Splunk indexed the CategoryId from individual URLs in the file, where CategoryId was a query parameter. The following example demonstrates the traffic pattern for each individual category per day:
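A sketch of such a search, assuming the query parameter was extracted into a field named `categoryId` (your extraction name may differ):

```spl
sourcetype=access_combined uri_query="*categoryId=*"
| timechart span=1d count BY categoryId
```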


3. Referring sites pattern: Patterns for the sites referring traffic to the website.
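Assuming the standard `referer` field extraction from the `access_combined` sourcetype, the top referring sites can be listed with `top`:

```spl
sourcetype=access_combined
| top limit=10 referer
```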


4. Error page pattern: Pattern for pages resulting in errors.

5. HTTP errors (day-wise breakup).

6. Pages/actions errors by day.
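The error patterns above can be sketched as searches that filter on the HTTP status code (field names assume the standard Apache extractions):

```spl
sourcetype=access_combined status>=400
| timechart span=1d count BY status
```

Swapping `BY status` for `BY uri_path` breaks the errors down by page section instead of by error code.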


Here is a column chart representation of the errors per day, per page section:


Here is a pie chart representation for a single day:


In this blog post, I’ve touched just the tip of the iceberg; the possibilities with Splunk are immense.

If you have any questions, please leave a comment below. I would greatly appreciate your feedback!

Manoj Bisht


Senior Architect

Manoj Bisht is a Senior Architect at 3Pillar Global, working out of our office in Noida, India. He has expertise in building and working with high-performance teams delivering cutting-edge enterprise products. He is also a keen researcher who dives deep into trending technologies. His current areas of interest are data science, cloud services, and microservice/serverless design and architecture. He loves spending his spare time playing games and traveling to new places with family and friends.
