When Data meets Splunk

Joyatee Datta
Jan 16, 2021

I was introduced to Splunk while solving a room on TryHackMe a few days back, so I decided to write a blog post about it. This is a beginner-friendly introduction to help people understand where and how it is used!

Splunk is software used to search, analyze, and visualize machine data. But have you ever wondered what machine data is, or why we need to analyze it?

Machine data or machine-generated data is the digital information that is automatically created by the activities of networked devices, including computers, mobile phones, embedded systems, and sensors.

Machine data is becoming increasingly important in the world of business, driven by the growing number of computers and the rising use of IoT devices.

But what makes machine data so important?

Machine data is a large part of what makes companies like Google, Facebook, Amazon, and many other software companies so profitable. Have you ever seen ads pop up on websites for things you wanted to buy from Amazon, or had just discussed with your friends? This is possible because companies track the information your device shares or searches for on the Internet.

And that is not all: machine data can keep track of your health, the kind of food you eat, the places you like, and even your favorite mode of transport.

Machine data is also immensely useful in cybersecurity. Properly handled and analyzed, it can be used to create alerts for security issues, system failures, and so on, and to help engineers improve the overall functionality of a system.

But this is not as easy as it sounds!

The data generated by different sources is unstructured and hence difficult to analyze. We need a tool that can understand machine data and help us solve a wide range of problems. This is where Splunk comes into play: it processes machine data and presents it in a human-readable form.

This is what machine data can look like
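To make the idea of turning raw machine data into something human-readable more concrete, here is a minimal Python sketch. It is not how Splunk itself works internally; the log line is made up for illustration, and the pattern and the `extract_fields` helper are assumptions of this example.

```python
import re

# Hypothetical web-server log line, made up purely for illustration.
RAW_EVENT = '203.0.113.7 - - [16/Jan/2021:10:15:32 +0000] "GET /cart HTTP/1.1" 500 1024'

# Pattern for a common access-log layout: client, timestamp, request, status, size.
LOG_PATTERN = re.compile(
    r'(?P<client>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+)'
)


def extract_fields(raw_line):
    """Turn one raw log line into a dictionary of named fields, or None."""
    match = LOG_PATTERN.match(raw_line)
    if match is None:
        return None  # the line does not follow the expected layout
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    fields["size"] = int(fields["size"])
    return fields


if __name__ == "__main__":
    # Print each extracted field on its own line.
    for name, value in extract_fields(RAW_EVENT).items():
        print(f"{name:>9}: {value}")
```

The raw line is hard to scan by eye, while the extracted fields (client, timestamp, path, status) are easy to read and filter on.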

You can see the power and versatility of Splunk in two main areas: data centers, and the marketing and advertising sector.

The primary components of Splunk are:

  1. Forwarder
  2. Indexer
  3. Search Head

Forwarder: The forwarder collects the data and forwards it to the Splunk indexers.

Indexer: The indexer stores and indexes the data.

Search Head: We cannot visualize the data directly from the indexers, so we have search heads, which provide a GUI for searching, visualizing, and analyzing the data.
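The way these three components hand data to each other can be sketched as a toy model in Python. This is only an illustration of the roles described above, not Splunk's actual implementation; the class names, methods, and sample log lines are all made up for this example.

```python
class Indexer:
    """Stores incoming events and keeps a word -> event-position index."""

    def __init__(self):
        self.events = []
        self.word_index = {}

    def ingest(self, event):
        position = len(self.events)
        self.events.append(event)
        # Record every distinct word so later searches avoid a full scan.
        for word in set(event.lower().split()):
            self.word_index.setdefault(word, []).append(position)


class Forwarder:
    """Collects raw lines from a source and forwards them to an indexer."""

    def __init__(self, indexer):
        self.indexer = indexer

    def forward(self, lines):
        for line in lines:
            line = line.strip()
            if line:  # skip empty lines
                self.indexer.ingest(line)


class SearchHead:
    """Answers user searches by asking the indexer for matching events."""

    def __init__(self, indexer):
        self.indexer = indexer

    def search(self, term):
        positions = self.indexer.word_index.get(term.lower(), [])
        return [self.indexer.events[p] for p in positions]


if __name__ == "__main__":
    indexer = Indexer()
    Forwarder(indexer).forward([
        "ERROR disk full on host-a",
        "INFO user login on host-b",
        "",
        "ERROR timeout on host-b",
    ])
    for event in SearchHead(indexer).search("error"):
        print(event)
```

The user only ever talks to the search head; the forwarder and indexer do their work in the background.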

A small example can make it easy to understand how Splunk works.

Let’s assume we have some data stored on a machine, which can be in the form of logs, databases, views, or API calls. On the other side, we have our users, who need information drawn from these various sources. Any business operation needs proper charts, graphs, images, pivots, and dashboards for a complete analysis and to make better decisions for the business.

Indexes, which you can think of as database tables, act as a bridge between these two sides: the data and the users. This is where the data is stored, and all of it is stored in the form of events. Once the data is stored, you can extract it using the Search Processing Language (SPL) and use it according to your needs. So Splunk is not only easy to set up but also provides services like data indexing, retrieval, investigation, searching, and analysis, helping to identify failures, customer needs, and more.
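As a rough illustration, an SPL search such as `index=web status=500 | stats count by host` filters the events and then counts them per host (the index name and field names here are hypothetical). The Python sketch below mimics that two-step idea on a handful of made-up events; the `search` and `count_by` helpers are assumptions of this example and are not part of Splunk.

```python
from collections import Counter

# Made-up events, already split into fields, standing in for an index.
EVENTS = [
    {"host": "web-1", "status": 200, "path": "/home"},
    {"host": "web-1", "status": 500, "path": "/cart"},
    {"host": "web-2", "status": 500, "path": "/cart"},
    {"host": "web-2", "status": 500, "path": "/pay"},
]


def search(events, **criteria):
    """Keep only the events whose fields match every given criterion."""
    return [
        event for event in events
        if all(event.get(key) == value for key, value in criteria.items())
    ]


def count_by(events, field):
    """Count events per distinct value of one field."""
    return dict(Counter(event[field] for event in events))


if __name__ == "__main__":
    failures = search(EVENTS, status=500)
    print(count_by(failures, "host"))  # {'web-1': 1, 'web-2': 2}
```

A result like this is the kind of small table that a dashboard can then turn into a chart.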

Splunk data in a visualization

Thank you for reading so far. I hope by now you have a general overview of Splunk.

Joyatee Datta

Computer Science Engineering | IEM Kolkata | Networking Researcher & Security Nerd