May 25, 2017

Apache Metron Insight #1: Why real-time enrichment matters

Welcome to our blog series on Big Data Cybersecurity, where we will share key insights on how and why Apache Metron is designed to address the real-world issues faced by security operations personnel.

Our first topic is about real-time enrichment.

What does enrichment mean?

To understand what real-time enrichment is about, it is important to consider the context: the position the security analyst is in, and what he or she needs to do the job. Consider the primary (and often daunting) task of triaging alerts. As mentioned in previous blog posts, triaging alerts is a never-ending search for a needle in a haystack: determining which alerts are most relevant and which are inconsequential.

When analysts get an alert, their first task is to figure out what it means in the context of the rest of the system, network and overall environment. This means enriching the information available to the analyst to paint the picture surrounding the data. Context is critical to understanding whether an alert is important or inconsequential. Yet determining context can be a very time-consuming process, which can involve a mix of filing tickets with other teams who may need to answer questions about infrastructure, looking up information from public web sources, cross-referencing against company directories of assets and employees, and ultimately copy-pasting sensitive data from one tool to another. Not only is this extremely inefficient, it’s also a recipe for getting the data wrong.

What about retrospective (after the fact) enrichment?

Take, for example, a cloud architecture. Elasticity, flexibility and dynamic scaling are some of the reasons why companies like the cloud. They also mean your whole environment can change fast: IP address allocations are short-lived, and addresses are recycled often, even outside of your organization. So by the time you get to analyzing the data, the IP address information you need may have disappeared or, even worse, morphed into something else. You may be seeing exactly the same IP address, but it has been reassigned to a completely different entity on the network and now means something completely different.

In such an environment, retrospective enrichment of alerts is next to useless. An event may happen, but even in a well-staffed SOC it could be an hour or two before an analyst can look at it. By the time they get around to running DNS and WHOIS queries to verify the alert, the world has moved on. The answers they get now have little relevance to the way the world was when the alert was generated.

Real-time enrichment is required to evaluate impact of cyberthreats

Why Real-Time Enrichment

In Apache Metron, data is enriched as it is ingested, which means the analyst gets an alert with all the lookup busywork already done and, crucially, with a real representation of the context in which the alert occurred. So, if a proxy server finds suspicious traffic from an IP address, we can look at the real-time context generated from Active Directory and DHCP logs to find out what was really going on at the time, even if those addresses and allocations have all changed by the time our analyst gets around to investigating. This does take up a little more disk, but we’re trading disk space for accuracy and, more importantly, SOC personnel’s time. (For more information on a day in the life of a security analyst, check out this blog post on Why Context Matters and How we Find it, which explains what it really takes to triage an alert.)
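To make the difference concrete, here is a minimal Python sketch of the idea. It is not Metron's implementation or API: the `LeaseTable` class, the `enrich_at_ingest` function, the IP address and the host names are all hypothetical, and a plain dictionary stands in for the context that would really come from DHCP and Active Directory logs. It simply contrasts copying the context onto the event at ingest time with looking it up later, after the address has been recycled.

```python
from dataclasses import dataclass, field


@dataclass
class LeaseTable:
    """Hypothetical stand-in for DHCP-derived context: IP -> current host."""
    current: dict = field(default_factory=dict)

    def assign(self, ip, host):
        # A new lease overwrites whatever previously held this address.
        self.current[ip] = host

    def lookup(self, ip):
        return self.current.get(ip, "unknown")


def enrich_at_ingest(event, leases):
    """Copy the context onto the event at the moment it arrives."""
    enriched = dict(event)
    enriched["host"] = leases.lookup(event["src_ip"])
    return enriched


leases = LeaseTable()
leases.assign("10.0.0.5", "alice-laptop")

# The proxy flags suspicious traffic; context is attached immediately.
alert = enrich_at_ingest(
    {"src_ip": "10.0.0.5", "reason": "suspicious traffic"}, leases
)

# Before an analyst gets to the alert, the address is recycled.
leases.assign("10.0.0.5", "build-server-17")

# A retrospective lookup now sees only the current assignment.
retrospective_host = leases.lookup(alert["src_ip"])

print("enriched at ingest:", alert["host"])
print("looked up later:", retrospective_host)
```

In this sketch the alert still carries `alice-laptop`, the host that actually held the address when the traffic was seen, while the later lookup returns `build-server-17`. Storing the extra `host` field on every event is the disk-for-accuracy trade described above.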

You can find recordings and slides to related DataWorks Summit San Jose sessions below:
