August 07, 2017

Apache Metron Streaming Architecture Stellar: “the Excel of Cybersecurity”

The Challenge

Apache Metron, at its core, is a streaming analytics solution.  Yes, it is a streaming analytics solution aimed at detecting and prioritizing cyber threats at your network’s doorstep.  Yes, it’s built atop Hadoop.  Putting the end-game aside, it’s easy to see that the challenges that such a system faces are precisely the challenges faced in any streaming analytics solution.

Put simply, the job of any analytics solution, streaming or otherwise, is to provide insights.  Better insights generally follow from more context.  For example, consider a logon event: how might we decide whether this event is by an attacker or by a trusted user?  Well, if we knew more about the behavior of users across the system or, perhaps, if we knew details about the source IP, such as country of origin, we might build up an evidentiary store and provide that evidentiary context to our downstream analysts.  Each of these contextual enrichments can be used to build up a risk profile and decide, ultimately and in an automated manner, whether this event is legitimate or malicious.  Indeed, the following are almost axiomatic in systems such as these:

  • More context yields better, more accurate insights
  • Providing more context involves some form of enrichment

Furthermore, because we are operating in a streaming environment, we need to achieve the enrichments above just-in-time and without downtime.  Indeed, the challenge is to provide a host of enrichments that bring forth relevant context to data as it streams by without impacting the throughput of the system.  Furthermore, we must acknowledge that we will not think of every type of enrichment.  As someone who has spoken to many customers and potential customers, my trust in my ability to completely predict user requests is low, at best.

It was clear that we needed a solution that had a few characteristics:

  • It must be flexible
  • It should be decoupled from the streaming environment
  • It should be simple to understand for non-programmers

Flexibility

The bitter truth for practitioners of our field is that dirty and malformed data is the rule rather than the exception.  Any system that has a hope of being acceptable must be capable of doing some scrubbing and transformation on-the-fly.  Consider the use-case where we want to determine if the top-level domain of a hostname is in a blacklist, so that we can adjust the risk profile accordingly.  The problem, of course, is that top-level domains need to be extracted, and the hostnames may well arrive untrimmed or corrupted in some common manner.  The options at our disposal are to push all of the data preprocessing and scrubbing into the blacklist functionality or to allow the user to scrub the dirty input data on the way in.  I strongly prefer the latter.  Having small, composable units of work is a mainstay of a productive and workable system.  Forcing enrichment authors to predict the complete range of intermediate transformations required to sanitize their own inputs is asking for trouble.

Decoupled

Currently, Metron would best be described as a kappa architecture.  That being said, as Metron grows, we will want to execute the streamed enrichments in batch.  Furthermore, it turns out enrichments are addictive.  As we added more capabilities and subsystems, such as bulk import into HBase, it quickly became apparent that we might want to transform and enrich data as it goes into our enrichment store!

Simple

What is or is not simple depends very much upon the beholder.  I have found that role and exposure build up a cross-hatched, complex tapestry of experiences that distinguishes between “obvious”, “non-obvious” and everything in between.  Therefore, it is important to consider your audience when making a decision about how to provide such a complex bit of functionality to your users.

If you are a software engineer, at this point you may be screaming at the screen that we should just use a programming language.  If you are a data analyst, you are likely screaming that the obvious choice is SQL.  Metron is a system for the security analyst, who is neither of these roles exactly.  Indeed, we needed something not quite as complex as, and slightly different from, SQL.  On the other side, though, it was clear that embedding a general purpose programming language wasn’t quite the right fit either.  We found that, by and large, we needed something closer to single-line transformations that you could compose.

Considering existing solutions in the wild, I’d say the most relevant and strong motivating example is Microsoft Excel functions.  Excel gives you the ability to compose simple functions to transform the values of cells based on the context of a spreadsheet.  

Stellar

Using these constraints and motivations, we constructed Stellar as a scripting environment with the following capabilities.  A more complete discussion can be found here, but the highlights include:

  • Provide commonly used simple functions (e.g. TO_UPPER, TRIM, IN_SUBNET etc)
  • Provide functions to interact with the system at large (e.g. data in HBase, models deployed via Model as a Service)
  • Provide the ability to do simple conditional statements
  • Provide the map, list, numeric and string primitives
  • Provide the ability to refer to variables.  In Metron, the fields of an individual message form the variables
  • Provide the ability to compose those simple solutions into more complex solutions
  • Provide the ability for users to define new functions
  • Provide a REPL (read, evaluate, print loop) to allow users to test Stellar functions

Some aspects of Stellar are like programming environments (e.g. the REPL) and some are very much not.  It’s worth considering the limitations that we have chosen to include for simplicity:

  • No loops; prefer functional primitives such as MAP, REDUCE and FILTER along with lambda functions as a last resort to get users out of a bind
  • No explicit types
  • No ability to create Stellar functions in Stellar; Stellar functions are implemented in Java
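
As a rough illustration of how the functional primitives stand in for loops, the statements below mirror the usage examples in the Stellar documentation as I understand them.  The variable names on the left-hand side are made up for this sketch, and the exact lambda syntax is worth checking against your Metron version:

```
upper := MAP([ 'foo', 'bar' ], x -> TO_UPPER(x))
only_foo := FILTER([ 'foo', 'bar' ], x -> x == 'foo')
total := REDUCE([ 1, 2, 3 ], (x, y) -> x + y, 0)
```

Here upper would hold [ 'FOO', 'BAR' ], only_foo would hold [ 'foo' ], and total would hold 6.  Each lambda is a single expression applied per element, which keeps statements to one line even when they operate over a list.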

Examples

Let’s consider a situation where we have a message with field ip_src_addr and we want to determine if the src address is one of a few subnet ranges and we want to store that in a variable called is_local:

is_local := IN_SUBNET( ip_src_addr, '192.168.0.0/16', '192.169.0.0/16')

 

Now, let’s consider a situation where we want to determine if the top level domain of a domain name, stored in a field called domain, is from a specific set of whitelisted TLDs:

is_government := DOMAIN_TO_TLD(domain) in [ 'mil', 'gov' ]

 

Let’s assume further that the data coming in is known to be spotty, with possible surrounding spaces and an occasional trailing dot due to a known upstream data ingest mistake.  We can handle that with 3 Stellar statements, the first two sanitizing the domain field and the final one doing the whitelist check:

sanitized_domain := TRIM(domain)

sanitized_domain := if ENDS_WITH(sanitized_domain, '.') then CHOP(sanitized_domain) else sanitized_domain

is_government := DOMAIN_TO_TLD( sanitized_domain ) in [ 'mil', 'gov' ]

 

Now, let’s consider a situation where we have a blacklist of known malicious domains.  We have used the Metron data importer to store this data in HBase under the enrichment type ‘malicious_domains’.  As data streams by, we’ll want to indicate whether a domain is malicious or not.  Further, as before, we still have some ingestion cruft to adjust:

sanitized_domain := TRIM(domain)

sanitized_domain := if ENDS_WITH(sanitized_domain, '.') then CHOP(sanitized_domain) else sanitized_domain

in_blacklist := ENRICHMENT_EXISTS('malicious_domains', sanitized_domain, 'enrichments', 't')
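
If a yes/no answer is not enough, the stored enrichment value itself can be pulled into the message.  The following is only a sketch: it assumes ENRICHMENT_GET takes the same four arguments as ENRICHMENT_EXISTS and returns a map, and it assumes the imported blacklist carries a column named 'source', which is a hypothetical name chosen for illustration:

```
blacklist_entry := ENRICHMENT_GET('malicious_domains', sanitized_domain, 'enrichments', 't')
blacklist_source := MAP_GET('source', blacklist_entry)
```

Keeping the lookup and the field extraction as separate statements follows the same small, composable style as the sanitization steps above.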

Stellar in Metron

Within Metron, every place where we foresee the need for some degree of user modification or transformation integrates with Stellar directly.  Specifically, you can use Stellar to:

  • Filter data in a parser to only include messages that match a condition
  • Transform a field or enrich a field as it is parsed (e.g. sanitize the domain field above) as a field transformation
  • Provide a mutually independent set of Stellar statements to enrich data in parallel as a Stellar enrichment
  • Create threat triage rules to give a sortable score to messages that are alerts.  This will help downstream security analysts prioritize their list of alerts to consider.
  • Transform data imported into HBase via our flat file loading mechanism.
  • Specify packets to include in a query of pcaps from across the packet captures stored in HDFS
  • From the REPL, you can add and modify configs in Zookeeper directly from Stellar
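
To give a feel for the first and fourth items, here are two bare predicates of the kind those features accept.  This is a sketch under assumptions: the fields domain, in_blacklist and is_local are taken from the examples above and are assumed to be present on the message, and the surrounding configuration that would hold each predicate is omitted:

```
exists(domain) and not(IS_EMPTY(TRIM(domain)))
in_blacklist and not(is_local)
```

The first line could serve as a parser filter that only lets through messages with a non-blank domain.  The second could serve as the condition of a threat triage rule, to which a score would be attached in the triage configuration.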

Further, for those capabilities that involve enriching or transforming data in the stream, Stellar statements are stored in ZooKeeper and thus require no restart of topologies to pick up the new enrichment or new field transformation, just an update of ZooKeeper from the web interface, the CLI ZooKeeper config management tools or the REPL.

Stellar is also the prime mechanism that we use for interacting with the various subsystems of Metron, including:

  • Interacting with data stored in HBase as an enrichment (see example above).  
  • Interacting with the profiler via Stellar to enable querying historical context (e.g. statistical snapshots of user past behavior) in-stream
  • Interacting with machine learning models deployed via Model as a Service to integrate the output of machine learning as an enrichment.
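
The last two items might look roughly like the statements below.  Treat this strictly as a sketch: the profile name 'logins_per_user', the field user, the model name 'dga' and the result key 'is_malicious' are all hypothetical, and the argument lists of PROFILE_GET, PROFILE_FIXED, MAAS_GET_ENDPOINT and MAAS_MODEL_APPLY have varied between Metron releases, so check them against the documentation for your version:

```
recent_logins := PROFILE_GET('logins_per_user', user, PROFILE_FIXED(1, 'HOURS'))
is_malicious := MAP_GET('is_malicious', MAAS_MODEL_APPLY(MAAS_GET_ENDPOINT('dga'), { 'host' : sanitized_domain }))
```

In both cases the subsystem is reached through an ordinary function call, so its output lands in a message field just like the result of TRIM or IN_SUBNET.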

As you can see, Stellar provides the glue that gives a consistent user experience for interacting with the various subsystems of Metron.

Custom Stellar Functions

Within Metron, we strive to enable as many of the use-cases as we can possibly foresee with the default sets of functions, but we understand that we will not be able to anticipate every edge case of enrichment or transformation function.  Thus, we want to make it extremely simple to add new functions for specific needs.  You can find complete instructions here, but the general approach is to implement your Stellar function in Java by:

  • Implementing the StellarFunction interface
  • Annotating the class with the @Stellar annotation which provides the documentation and name for the function
  • Uploading the jar containing your function(s) to HDFS to a configurable location
  • Restarting the relevant topologies to refresh the cache of Stellar functions available

Conclusion

On the whole, I believe that Stellar has scratched an important itch within Metron.  It provides a consistent glue to fit together the various subsystems of Metron, each of which provides its own unique capabilities, into a whole solution.  Walking the tight-rope of providing power without overwhelming possibly non-technical users with complexity is an interesting challenge.

I think that if we had adopted a general purpose programming language, the experience would have been needlessly complex for the tasks that our users need.  On the other end, with hard-coded enrichments and transformations, we’d be in a perpetual arms race to provide more and more esoteric enrichments as part of the main project.  Creating a simple language that we control ensures that we can focus on the capabilities that are most generally useful while also controlling the complexity dramatically.  That is not to say that we do not have more ahead.  One downside of having an adaptable language like Stellar is that it can be challenging to provide a useful graphical user interface abstraction.  I have high hopes that we will adopt a solution similar to Blockly to make the creation of Stellar statements even more visual and less scary.

We are at the beginning of the path of this technology and, frankly, I like how the road ahead looks.  If you want to know more about Stellar, such as the language capabilities or the core functions, you can find all of that detailed in the Metron documentation.

To watch Casey’s presentation from DataWorks Summit click here:

Slides here:
