May 04, 2017

20 Questions with Big Data Cybersecurity Experts on Apache Metron: Webinar Recap

Last week, we hosted a webinar, Combating Phishing Attacks: How Big Data Helps Detect Impersonators, where our audience confirmed that it really can take months, or even a year, to investigate the repercussions of a breach such as a phishing attack. Given the complex and dynamic nature of modern attack vectors, we discussed how much effort is involved in assessing the risk and damage that hackers can inflict upon enterprises today. There is more information on how to leverage big data and machine learning to detect hackers and impersonators in this blog.


During the webinar, we also covered the recently announced top-level Apache project, Apache Metron. Apache Metron is an open source big data cybersecurity analytics platform that supports real-time ingest and analytics to discover information security threats and build out a high-value security data lake. Apache Metron helps security operations teams be more efficient by reducing the amount of “DIY” big data and data science tooling necessary to detect threats in real time.

Apache Metron big data cybersecurity platform

There was plenty of discussion, so we’ve done our best to answer the questions below. If you have more questions at any time, we encourage you to check out the Cybersecurity track of Hortonworks Community Connection, where an entire community is monitoring and responding to questions. If you missed the session, you can check out the on-demand webinar and slideshare.


  • Can Apache Metron take in log data from applications (finance apps) and figure out anomalies too?


  • What about encrypted attachments, typically stock statements, etc.? Are these handled too?

Handling encrypted streams can be a tricky, though not impossible, problem. If your network has, for example, SSL interception based on re-encryption using keys under your control, then you can do things like content inspection on encrypted mail. Of course, if you have the keys used to encrypt, the decryption can be handled on the ingest path, for instance through Apache NiFi processors.

  • In the case of telcos, where GPS coordinates are also important, can Metron ingest such data sources directly and also provide analytics in terms of geography?

GPS is a very useful complement to traditional network and GeoIP data. This is certainly the sort of telemetry that would work well with the streaming enrichment capability to provide context to, for example, NetFlow, proxy, or other application log telemetry.

  • Do you have an API or standardized input/query method to integrate other enrichment sources such as geolocation databases?

There are enrichment loaders for a wide range of data sources and the ability to transform inbound enrichments with a simple DSL called Stellar. We also have native support for GeoIP enrichment using the MaxMind binary API for speed.
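
As a rough illustration of what an enrichment step does, here is a minimal Python sketch. It is not Metron's implementation and does not use Stellar or the MaxMind API: the in-memory `GEO_TABLE`, the `enrich` function, and the `geo.<field>.<key>` output naming are all hypothetical, chosen only to show a message being decorated with context looked up from an external source.

```python
import ipaddress

# Hypothetical in-memory enrichment table: network prefix -> location metadata.
# A real deployment would load this from an enrichment source instead.
GEO_TABLE = {
    ipaddress.ip_network("203.0.113.0/24"): {"country": "AU", "city": "Sydney"},
    ipaddress.ip_network("198.51.100.0/24"): {"country": "US", "city": "Richmond"},
}


def enrich(message, field="ip_dst_addr"):
    """Return a copy of message with geo fields added when the address matches."""
    enriched = dict(message)
    value = message.get(field)
    if value is None:
        return enriched  # nothing to look up
    address = ipaddress.ip_address(value)
    for network, location in GEO_TABLE.items():
        if address in network:
            # Prefix the added keys so they cannot clash with original fields.
            for key, val in location.items():
                enriched[f"geo.{field}.{key}"] = val
            break
    return enriched
```

Calling `enrich({"ip_dst_addr": "203.0.113.7"})` returns the original field plus `geo.ip_dst_addr.country` and `geo.ip_dst_addr.city`; messages with no matching prefix pass through unchanged.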

  • Do you support other input sources like Bro or Pyshark?

We provide a parser for Bro data, and a plugin for Bro to post data directly to Kafka for high-performance Bro ingest.
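
To give a feel for what parsing Bro data involves, the sketch below normalizes one JSON record in Python. It is an assumption-laden illustration rather than the Metron parser: it assumes each record arrives as a JSON envelope with a single top-level key naming the log type, and the `FIELD_MAP` names and the `parse_bro` function are hypothetical.

```python
import json

# Assumed mapping from Bro connection fields to normalized field names.
FIELD_MAP = {
    "id.orig_h": "ip_src_addr",
    "id.orig_p": "ip_src_port",
    "id.resp_h": "ip_dst_addr",
    "id.resp_p": "ip_dst_port",
}


def parse_bro(raw):
    """Flatten a one-key JSON envelope such as {"http": {...}} into a flat dict."""
    envelope = json.loads(raw)
    if len(envelope) != 1:
        raise ValueError("expected exactly one top-level log type key")
    (protocol, record), = envelope.items()
    # Rename known fields and keep everything else untouched.
    message = {FIELD_MAP.get(key, key): value for key, value in record.items()}
    message["protocol"] = protocol
    if "ts" in record:
        # Bro timestamps are epoch seconds; store epoch milliseconds as well.
        message["timestamp"] = round(float(record["ts"]) * 1000)
    return message
```

In a streaming setup, a function like this would be applied to each record consumed from the Kafka topic that the Bro plugin writes to.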

  • Does Hortonworks provide professional services to help deploy and manage Metron on an ongoing basis?

Absolutely, Hortonworks can provide services and support for Metron and the underlying platforms.

  • Can you provide more detail regarding how to search for PCAP data and how Metron archives the information?

Metron provides a high speed route to load PCAP data into Sequence files in HDFS, to ensure split-ability and large scale processing. It then provides a range of means to query and process the raw PCAPs. Metron provides jobs to query PCAP by basic headers as well as mechanisms to do deep pattern searches over large scale PCAP.
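
A query by basic headers amounts to filtering packets on a time window and a few fixed fields. The Python sketch below shows that idea over an in-memory list; it is not how Metron executes the query (which runs as jobs over Sequence files in HDFS), and the `PacketHeader` type and `query_headers` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PacketHeader:
    """Hypothetical subset of header fields extracted from a captured packet."""
    ts: float          # capture time, epoch seconds
    src_addr: str
    dst_addr: str
    src_port: int
    dst_port: int
    protocol: int      # IP protocol number, e.g. 6 for TCP


def query_headers(packets, *, start=None, end=None, **fields):
    """Yield packets in [start, end) whose header fields equal the given values."""
    for packet in packets:
        if start is not None and packet.ts < start:
            continue
        if end is not None and packet.ts >= end:
            continue
        if all(getattr(packet, name) == value for name, value in fields.items()):
            yield packet
```

For example, `list(query_headers(packets, start=100.0, end=200.0, dst_port=443, protocol=6))` returns the TCP packets sent to port 443 within that window.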

  • Is the Metron engine based on Spark for real-time processing? Can it apply Spark SQL and MLlib for machine learning? Is Python supported for customization?

The core Metron engine is built on Storm to provide low-latency, real-time task parallelism. Spark, by contrast, excels at data-parallel tasks, so Metron makes extensive use of Spark for building machine learning models with a variety of libraries. Many of the models have been built using PySpark, for example.

  • What were the companies cited in the articles about leveraging open source again?

Telstra and Capital One. (Related article here)

  • What plans do you have to reduce complexity of deployment and long term support of a full blown Hadoop stack?

Much of the install and management is simplified by Ambari through MPacks, which are used to install the Metron application. We are constantly working to provide simpler means to operate the platform. The SmartSense offering from Hortonworks also provides advice and tuning suggestions based on real use of the cluster, making operations easier.

  • For the presentation layer, do we again need Tableau-type software?

Metron provides a number of visualisation options out of the box, including Kibana dashboards and some example Zeppelin dashboards for typical use cases around data sources like NetFlow.

  • Do you support integration with Graph databases or the Spark built in graph functionality? I think Graph database and Graph analysis can help for sporadic emails and analysis of dense sources and destinations.

We currently have some roadmap items around integration of graph databases and an ontology mapper.

  • Any input on Metron UI availability?

Right now (April 2017) we have a management UI within the platform which provides access to configure parsers, enrichments and transformations. The management UI also provides an interface to edit triage rules and tune scores to prioritise the output for analysts. That output is mainly in the form of Kibana and Zeppelin dashboards at present. A complete investigator focused UI experience is in the works.
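
The idea behind triage rules and scores can be sketched in a few lines of Python. This is only an illustration under stated assumptions: in Metron the rules are configured through the management UI rather than written as Python callables, and the rule structure, the `triage` function, and the aggregator names below are hypothetical.

```python
# Illustrative ways to combine the scores of all rules that fire.
AGGREGATORS = {
    "MAX": max,
    "MIN": min,
    "SUM": sum,
    "MEAN": lambda scores: sum(scores) / len(scores),
}


def triage(message, rules, aggregator="MAX"):
    """Score a message by combining the scores of every rule that matches it."""
    scores = [rule["score"] for rule in rules if rule["when"](message)]
    if not scores:
        return 0  # no rule fired, so the message is not prioritised
    return AGGREGATORS[aggregator](scores)


# Hypothetical rules: each has a predicate and the score it contributes.
RULES = [
    {"name": "external destination",
     "when": lambda m: not m.get("ip_dst_addr", "").startswith("10."),
     "score": 10},
    {"name": "unusual port",
     "when": lambda m: m.get("ip_dst_port") not in (80, 443),
     "score": 5},
]
```

Tuning then means adjusting the individual scores or switching the aggregator, for example from `MAX` (the worst single finding wins) to `SUM` (findings accumulate).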

  • How are Apache and Hortonworks related and integrated from a business standpoint? I am always confused by Apache’s huge set of offerings on the Big Data front. I think of Apache as an open source governance organization. Could you please correct any misunderstanding?

Hortonworks and the Apache Software Foundation are completely separate entities. Hortonworks provides distributions of code developed in Apache projects. We also provide support subscriptions and services around those distributions, something we can do because we employ many of the Apache committers on the projects. We commit our contributions back to the Apache open source projects.

  • Where do I find more info about Apache Metron?

There is a brief overview here.  You can also join the community at

  • Can we use AWS bigdata and machine learning (SparkM, Amazon Machine learning) solutions to speed up the setup of Metron in the cloud?

We currently provide deployment mechanisms that deploy Metron directly into AWS and Azure as part of the solution offering, and many of our customers choose to run Metron in the cloud. However, our solution is cloud agnostic: we do not tightly couple it to services available in only one cloud, so it works across all vendor clouds.

  • Can Metron components be monitored using Ambari?

Metron is installed and managed through Ambari MPacks. The underlying components and metrics are all monitored through Ambari Metrics. We are working to publish more Metron-specific metrics to Ambari as well.

  • Does Metron support Zeppelin?

Metron is usually installed on the HDP platform, which has full support for Zeppelin. In fact, we use Zeppelin to produce a number of dashboards. Metron also uses Zeppelin to provide active runbooks for SOC staff, which use the notebook approach to integrate process documentation and live data with visualisation to guide analysts through the investigation process.

  • For any extensions to the model, does it also allow Python / R?

The model as a service component allows you to write and extend models in any language that will run in a YARN container, and provide a REST interface. What this means is that languages such as Python and R are an excellent fit to run models, either directly on their own, or for instance through Spark.
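
As an illustration of a model exposed over REST, here is a minimal Python sketch that uses only the standard library. The `/apply` path, the `host` query parameter, the response shape, and the toy scoring rule in `score_host` are all assumptions made for the example, not the interface Metron prescribes.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def score_host(host):
    """Toy stand-in for a real model: flag unusually long host names."""
    return {"host": host, "is_malicious": len(host) > 30}


class ModelHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        url = urlparse(self.path)
        if url.path != "/apply":
            self.send_error(404)
            return
        # Read the first value of the "host" parameter, defaulting to "".
        host = parse_qs(url.query).get("host", [""])[0]
        body = json.dumps(score_host(host)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=0):
    """Create the HTTP server on localhost; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), ModelHandler)
```

Running `make_server(8080).serve_forever()` starts the service, after which a GET request to `/apply?host=example.com` returns the model's verdict as JSON. The same pattern applies to a model written in R or run through Spark, as long as it ends up behind a REST endpoint.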

  • Where can I find the recording of the webinar?

The webinar is available on demand here:

