November 22, 2013

Fight Fraud with Big Data Analytics

A consequence of living in a globalized, connected world is the unfortunate presence of online fraud. Fraud affects every industry and businesses of all sizes. With the holidays approaching, and North America’s love of Black Friday and Cyber Monday in particular, we partnered with Datameer this week on a topical discussion of best practices for fighting fraud, using Hortonworks Data Platform to integrate Hadoop and Datameer.

You can catch the recording of the webinar here and read on for some of the detail, including resources and samples to get started.

In the webinar, John Kreisa, VP of Strategy Alliance from Hortonworks, and Karen Hsu, Sr. Director Product Marketing at Datameer, walked us through a modern data architecture as it relates to fraud prevention. They also answered questions about how to get started and gave details on a reference architecture that supports fraud detection.

Here, Karen Hsu provides more background from the webinar and explains how you can start building your own fraud prevention application with Hadoop and Datameer:

Fraud is especially rampant this time of year, and it grows every year. Retailers expect to make up to 40% of their annual revenue during the holiday season, yet last year eTailers lost $3.5B to online fraud. And retailers are not alone. Recent studies have found merchants paying $200B to $250B in fraud losses annually, and banks and financial organizations losing $12B to $15B annually. Exacerbating the issue are the high data volumes: over 20B credit card transactions alone each year.

And the face of fraud is changing. Credit card thieves have moved beyond stealing a card and using it to buy big-screen TVs. For example, they can now make numerous small transactions that are seemingly benign. But if a single credit card holder makes 100 $5 margarita transactions at multiple locations at the same time, something is wrong. By analyzing large volumes of complex data (including point-of-sale, geolocation, authorization, and transaction data) with Datameer on Hadoop, companies have been able to identify fraud patterns in historical data.
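The margarita scenario can be expressed as a simple rule: many small purchases on one card, spread over several locations, inside a short time window. The following is a minimal sketch in plain Python, not Datameer's implementation. The `Txn` record, the `flag_suspicious_cards` function, and every threshold are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Txn:
    # Hypothetical transaction record; real point-of-sale data carries more fields.
    card_id: str
    amount: float
    location: str
    ts: datetime


def flag_suspicious_cards(txns, window=timedelta(minutes=30),
                          small_amount=10.0, min_count=20, min_locations=3):
    """Return card IDs with a burst of small purchases across several locations.

    All thresholds are illustrative assumptions, not values from the webinar.
    """
    by_card = defaultdict(list)
    for t in txns:
        if t.amount <= small_amount:
            by_card[t.card_id].append(t)

    flagged = set()
    for card_id, items in by_card.items():
        items.sort(key=lambda t: t.ts)
        start = 0
        for end in range(len(items)):
            # Advance the left edge until the window spans at most `window`.
            while items[end].ts - items[start].ts > window:
                start += 1
            in_window = items[start:end + 1]
            if (len(in_window) >= min_count
                    and len({t.location for t in in_window}) >= min_locations):
                flagged.add(card_id)
                break
    return flagged


base = datetime(2013, 11, 29, 20, 0)
# card-A: 100 five-dollar purchases in under 17 minutes across four bars.
burst = [Txn("card-A", 5.0, f"bar-{i % 4}", base + timedelta(seconds=10 * i))
         for i in range(100)]
# card-B: one five-dollar purchase per day at the same cafe.
normal = [Txn("card-B", 5.0, "cafe-1", base + timedelta(days=i))
          for i in range(100)]
print(flag_suspicious_cards(burst + normal))  # {'card-A'}
```

A fixed rule like this only catches the pattern it encodes. Its value here is to show why point-of-sale, geolocation, and timestamp data have to be analyzed together rather than one source at a time.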

The reference architecture for fraud detection in this new world needs to support business-user-focused big data analytics applications on top of a Hadoop-based architecture. Types of analytics used to detect fraud include:

  • Identifying outlying spend and affected vendors
  • Data mining and machine learning on transaction data
  • Predictive modeling (e.g. back-propagation)
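As one illustration of the first item, outlying spend can be found with a modified z-score built on the median and the median absolute deviation (MAD), which a single huge charge cannot skew the way it skews a mean. This is a sketch of one common technique, not the method used in the webinar or in Datameer. The `outlying_spend` function, the sample data, and the 3.5 cutoff are assumptions made for the example.

```python
from statistics import median


def outlying_spend(amounts_by_vendor, threshold=3.5):
    """Return {vendor: [outlying amounts]} using a modified z-score (median/MAD).

    The 3.5 threshold is a conventional rule of thumb, used here as an assumption.
    """
    all_amounts = [a for amounts in amounts_by_vendor.values() for a in amounts]
    med = median(all_amounts)
    mad = median(abs(a - med) for a in all_amounts)
    if mad == 0:
        # All amounts are (nearly) identical; the score is undefined.
        return {}
    outliers = {}
    for vendor, amounts in amounts_by_vendor.items():
        hits = [a for a in amounts if 0.6745 * abs(a - med) / mad > threshold]
        if hits:
            outliers[vendor] = hits
    return outliers


# Hypothetical spend per vendor, in dollars.
spend = {
    "grocer": [42.0, 38.5, 45.0, 40.0],
    "electronics": [39.0, 41.5, 2500.0],
    "cafe": [43.0, 37.0],
}
print(outlying_spend(spend))  # {'electronics': [2500.0]}
```

The result names both the outlying amount and the affected vendor, which is the pairing the first bullet describes.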


Get Started

  • Click here to get the Datameer Playground (Datameer on Hortonworks Sandbox, a turnkey Hadoop evaluation environment).
  • Then click here to find documentation and a sample fraud application you can start with to identify credit card fraud.


Tony Simon says:

Nice article. Its true face of fraud is changing. Also the demand for this in business. is offering a course on fraud detection in R. Big data combination with R seems to be on the rise.

a says:

Aw, this was an exceptionally nice post. Spending some time and actual effort to produce a superb article… but what can I say… I put things off a whole lot and don’t seem to get nearly anything done.

Tom says:

how does this compare/fit with a file transfer offering?
