January 09, 2013

Big Data Security Part Three: PacketPig Finding Zero Day Attacks


This is part three of a Big Data Security blog series. You can read the previous two posts here: Part One / Part Two.

When Russell Jurney and I first teamed up to write these posts, we wanted to do something that no one had done before to demonstrate the power of Big Data, the simplicity of Pig and the kind of Big Data Security Analytics we perform at Packetloop. Packetpig was modified to support Amazon’s Elastic MapReduce (EMR) so that we could process a 600GB set of full packet captures. All that we needed was a canonical Zero Day attack to analyse. We were in luck!

In August 2012 a vulnerability in Oracle JRE 1.7 created huge publicity when it was disclosed that a number of Zero Day attacks had been reported to Oracle in April but had still not been addressed in late August 2012. To make matters worse, Oracle’s scheduled patch for JRE was months away (October 16). This position subsequently changed, and a number of out-of-band patches for JRE were released on the 30th of August for what became known as CVE-2012-4681.

The vulnerability exposed around 1 Billion systems to exploitation and the exploit was 100% effective on Windows, Mac OSX and Linux. A number of security researchers were already seeing the exploit in the wild as it was incorporated into exploit packs for the delivery of malware.

What is a Zero Day?

Put simply it’s any vulnerability that can be exploited without an available mitigation. The mitigation most people measure Zero Days by is a patch from the software vendor (in this case Oracle).

If we look at the timeline of this exploit, you can see how long it was a Zero Day:

  • The Bug was introduced to JRE on July 28th 2011.
  • It was Disclosed to the public on April 2nd 2012.
  • The Exploit was available in the Metasploit Framework on August 26th 2012, with other PoCs publicly available around the same time.
  • Detection was available via Snort IDS/IPS on August 28th 2012.
  • Lastly a Patch was available from Oracle on 30th August 2012.

If you compare the date the Bug was introduced with the date of the Patch, the Zero Day time is 399 days. Comparing the date of Disclosure with the Patch date still gives a staggering 150 days. To put this in perspective, a software bug that affects around 1 Billion devices could be exploited for well over a year and was certainly being seen in the wild. Whether you take the view that the Zero Day period is 150 days (from disclosure) or over a year (from introduction), both are extremely scary.
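
The two figures are easy to check. The snippet below is a small Python sketch (not part of Packetpig) that subtracts the timeline dates listed above; the variable names are my own.

```python
from datetime import date

# Dates taken from the timeline above.
bug_introduced = date(2011, 7, 28)
disclosed = date(2012, 4, 2)
patched = date(2012, 8, 30)

# Subtracting two dates gives a timedelta; .days is the whole-day gap.
days_from_introduction = (patched - bug_introduced).days
days_from_disclosure = (patched - disclosed).days

print(days_from_introduction)  # 399
print(days_from_disclosure)    # 150
```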

So how can you tell whether you were exploited using this JRE bug in the last 6 months or year? How can you prove your network or important systems haven’t been exploited using this vulnerability?

Finding Zero Day attacks

Packetpig provides you with the ability to search vast amounts of network packet captures for Zero Day attacks. To demonstrate this I executed the Metasploit Exploit for the JRE bug against a Windows XP workstation and recorded the packet capture. I then hid this 500KB capture amongst 600GB of Full Packet Captures from a system we monitor on the Internet. Every packet is captured to an S3 bucket, so we can quickly scan the bucket for Zero Days using Amazon’s Elastic MapReduce.

For the purpose of this demonstration, I downloaded the Snort Signatures as soon as they were updated on the 28th of August. This allowed me to scan the 600GB of packet captures with the old signatures (in this case 2905) and then again with the new signatures (in this case 2931).

Let’s run through the Packetpig job ‘snort_comparison.pig’ to see how this was done. The key to understanding the job is that we use the Packetpig SnortLoader() to scan the network packet captures with the old signatures and again with the new signatures. Any signature that fires in the old signature scan is removed from the new signature scan, leaving only the Zero Day attacks.

In the same way as our last post, we set up a number of variables using an include.pig file. After that we define old_snort_conf and new_snort_conf:

%DEFAULT includepath pig/include.pig
RUN $includepath;

%DEFAULT time 60

-- for local mode: uncomment the next line and comment the one after that
--%DEFAULT old_snort_conf 'lib/snort-2905/etc/snort.conf'
%DEFAULT old_snort_conf '/mnt/var/lib/snort-2905/etc/snort.conf'

-- for local mode: uncomment the next line and comment the one after that
--%DEFAULT new_snort_conf 'lib/snort-2931/etc/snort.conf'
%DEFAULT new_snort_conf '/mnt/var/lib/snort-2931/etc/snort.conf'

The SnortLoader() is used with the old snort.conf and the new snort.conf to scan the packet captures:

snort_old_alerts =
    LOAD '$pcap'
    USING com.packetloop.packetpig.loaders.pcap.detection.SnortLoader('$old_snort_conf')
    AS (

snort_new_alerts =
    LOAD '$pcap'
    USING com.packetloop.packetpig.loaders.pcap.detection.SnortLoader('$new_snort_conf')
    AS (
Next we group (COGROUP) the old and the new Snort scans by signature and keep only the signatures that never fired in the old scan:

snort_joined = COGROUP snort_old_alerts BY sig, snort_new_alerts BY sig;
new_only_filtered = FILTER snort_joined BY (COUNT(snort_old_alerts) == 0);
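
To make the COGROUP and FILTER step concrete, here is a rough Python sketch of the same idea. It is an illustration only, not how Packetpig or Pig implement it. The alert fields (`sig`, `src`) and the sample values are made up, because the real SnortLoader() schema is not shown here.

```python
from collections import defaultdict

def cogroup_by_sig(old_alerts, new_alerts):
    """Group both alert lists by signature, like COGROUP ... BY sig.

    Returns {sig: (old_bag, new_bag)} for every sig seen in either list.
    """
    groups = defaultdict(lambda: ([], []))
    for alert in old_alerts:
        groups[alert["sig"]][0].append(alert)
    for alert in new_alerts:
        groups[alert["sig"]][1].append(alert)
    return dict(groups)

def keep_new_only(groups):
    """Keep groups whose old bag is empty, like COUNT(snort_old_alerts) == 0."""
    return {sig: bags for sig, bags in groups.items() if len(bags[0]) == 0}

# Hypothetical alerts: "sig-A" fires with both rule sets, "sig-B" only with the new one.
snort_old_alerts = [{"sig": "sig-A", "src": "10.0.0.1"}]
snort_new_alerts = [
    {"sig": "sig-A", "src": "10.0.0.1"},
    {"sig": "sig-B", "src": "10.0.0.2"},
]

snort_joined = cogroup_by_sig(snort_old_alerts, snort_new_alerts)
new_only_filtered = keep_new_only(snort_joined)
print(sorted(new_only_filtered))  # ['sig-B']
```

Only the signature that the old rule set could not detect survives the filter, which is exactly the traffic we are hunting for.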

Lastly we re-project the data and then store it. The snort_comparison_new/part-r-00000 file is a verbose version of snort_comparison_summary/part-r-00000.

new_only_flattened = FOREACH new_only_filtered GENERATE FLATTEN(snort_new_alerts);
new_only_summary = FOREACH new_only_filtered GENERATE group, COUNT(snort_new_alerts);

STORE new_only_flattened INTO '$output/snort_comparison_new';
STORE new_only_summary INTO '$output/snort_comparison_summary';
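
The difference between the two outputs can be sketched in Python as well. Again this is only an illustration with invented alert fields and values: the flattened output has one row per alert, while the summary has one row per signature with a count.

```python
# Hypothetical result of the filter step: {sig: (old_bag, new_bag)},
# where the old bag is always empty.
new_only_filtered = {
    "sig-B": (
        [],
        [
            {"sig": "sig-B", "src": "10.0.0.2"},
            {"sig": "sig-B", "src": "10.0.0.3"},
        ],
    ),
}

# Like FLATTEN(snort_new_alerts): one output row per alert in the new bag.
new_only_flattened = [
    alert
    for _old_bag, new_bag in new_only_filtered.values()
    for alert in new_bag
]

# Like GENERATE group, COUNT(snort_new_alerts): one row per signature.
new_only_summary = [
    (sig, len(new_bag)) for sig, (_old_bag, new_bag) in new_only_filtered.items()
]

print(new_only_flattened)
print(new_only_summary)  # [('sig-B', 2)]
```

The summary is the quick answer to "were we hit, and by what?", while the flattened rows are what you would dig through to investigate each alert.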

To demonstrate this in practice I test the job on a small number of packet captures on my local development laptop. Watch the video to see how to do it.

Next I take it to the cloud and use 80 x m2.4xlarge instances to process 600GB of full packet captures to find the Oracle JRE 1.7 attack. The 80 nodes spin up, install all the Packetpig software (bootstrap) and then go to work crunching the network packet captures. Check out the video to see the full process.



  • Very good article. How much were your AWS usage fees for the whole process?

    Executives might be interested in seeing the great ROI you have demonstrated; of proactive security measures using cost friendly cloud infrastructures such as AWS.
