Advertisers Do Hadoop
Consumers have never generated so much data on how they research, discuss and buy products. This new data is valuable for shaping and promoting a brand or product, but it doesn’t line up neatly to fit in pre-defined, tabular formats. Apache Hadoop brings this “new” data under analysis, by ingesting social media, clickstream, video and transaction data without requiring a pre-defined data schema.
This data can then be joined with existing structured data sets for deeper sentiment analysis and targeted promotion. Media companies, agencies and enterprises can now store new types of data and retain all of it for longer, so their advertising builds customer loyalty and delivers a better return on investment.
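To make the idea of ingesting data without a pre-defined schema more concrete, here is a minimal sketch in plain Python rather than Hadoop itself. The event records, the field names (customer_id, type, segment) and the customers lookup are all hypothetical. The point is only that records with different shapes are parsed as they are read and then joined to a small structured table by a shared key.

```python
import json

# Hypothetical raw events: each line has a different shape, and no schema
# was declared before the data was stored.
raw_events = [
    '{"customer_id": 7, "type": "click", "url": "/toys/robots"}',
    '{"customer_id": 7, "type": "tweet", "text": "love this robot"}',
    '{"customer_id": 9, "type": "purchase", "sku": "R-100", "amount": 24.99}',
]

# Hypothetical structured data set, keyed by customer ID.
customers = {7: {"segment": "loyalty"}, 9: {"segment": "new"}}

def events_with_segment(lines, customers):
    """Parse each raw line at read time and attach the customer's segment."""
    for line in lines:
        event = json.loads(line)
        profile = customers.get(event.get("customer_id"), {})
        # Keep every field the event happens to carry, then add the joined one.
        yield {**event, "segment": profile.get("segment")}

joined = list(events_with_segment(raw_events, customers))
for row in joined:
    print(row)
```

Each output row keeps whatever fields the original event carried, so a new kind of event can be added later without redefining a table first.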
The following reference architecture diagram represents a combination of approaches that we see our advertising and media customers adopt, whether they advertise groceries, home improvement programming, kids’ toys, or anything and everything on a retail website.
Here are some specific ways that our media and advertising customers use HDP to improve their bottom lines.
Mine Grocery & Drug Store POS Data to Identify High-Value Shoppers
One marketing analytics company specializes in gathering insight at the checkout counter, across many grocers and drug stores. They mine this sales information for basket analysis, price sensitivity, and demand forecasts.
Interactive query with the Stinger Initiative and Apache Hive running on YARN helps the company rapidly process terabytes of data to keep pace with a market that changes by the day. Manufacturers, retailers, and ad agencies use the combined analysis to position their brands or improve their retail experiences, particularly for high-value customers.
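As an illustration of what basket analysis computes, here is a minimal sketch that scores item pairs by lift, a common co-purchase measure. It is plain in-memory Python with made-up baskets, not the company's method or a Hive query. The function name pair_lift and the sample data are assumptions for illustration only.

```python
from collections import Counter
from itertools import combinations

def pair_lift(baskets):
    """Return {(item_a, item_b): lift} for every pair bought together at least once.

    lift = P(a and b) / (P(a) * P(b)); values above 1 mean the two items
    appear together more often than independent purchases would suggest.
    """
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))          # de-duplicate, fix pair order
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    lifts = {}
    for (a, b), together in pair_counts.items():
        support_ab = together / n
        lifts[(a, b)] = support_ab / ((item_counts[a] / n) * (item_counts[b] / n))
    return lifts

# Hypothetical checkout baskets.
baskets = [
    ["bread", "butter"],
    ["bread", "butter", "milk"],
    ["milk"],
    ["bread", "milk"],
]
lifts = pair_lift(baskets)
for pair, lift in sorted(lifts.items(), key=lambda kv: -kv[1]):
    print(pair, round(lift, 2))
```

In this toy data, bread and butter score above 1 while the pairs involving milk score below 1. At terabyte scale the same counts would be produced by a distributed query rather than a loop in memory.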
Target Ads to Customers in Specific Cultural or Linguistic Segments
Hortonworks customer Luminar is the leading big data analytics and modeling provider uniquely focused on delivering actionable advertising insights on US Latino consumers. Luminar wanted to move beyond samples of Latinos living in the United States and towards empirical analysis of actual data on all US Latinos. And they did not want to store only some transactions from one or two sources; they wanted to acquire and save as many transactions as possible from as many different sources as possible.
Now HDP interacts easily with other components of Luminar’s data and business intelligence ecosystem: Amazon Cloud, R, Talend and Tableau. The company has increased its ingest of transaction data from 300 sources to 2,000, and monthly volume has grown from 2 to 15 terabytes. Before, it took Luminar days to ingest and join a new set of raw data. Now it takes only hours, even with eight times more data than before.
Luminar uses that insight to craft marketing strategies for CPG and entertainment companies that want to focus on the US Latino population.
Syndicate Videos According to Behavior, Demographics & Channel
A major omni-media company specializes in home improvement and DIY content distributed across television, digital, mobile and publishing channels. One of its divisions focuses on delivering online video ads.
Both content syndicators and publishers want to make sure that video content reaches the right audience. The company analyzes clickstream data stored on HDP for audience analysis that feeds a recommendation engine for improved ad consumption.
ETL Toy Market Research Data for Longer Retention & Deeper Insight
A leading consumer research firm provides consumer intelligence to the toy industry. The market is in a state of flux; there are more new digital options and “real world” forms of children’s play than ever before. The company delivers weekly point-of-sale (POS) tracking information for competitive insight on toy sales trends. They cover all the major toy retailers for a complete view of the marketplace.
The company chose HDP to offload much of its data from a more expensive platform, with expected savings of more than $1 million annually. The improved economics allow the company to retain data longer and identify long-range, strategic opportunities for growth. This helps its toy company clients partner more closely with retailers.
Optimize Online Ad Placement for Retail Websites
One of our customers provides web analytics services to some of the world’s largest retail websites. For their largest customer, clickstream data pours in at the rate of hundreds of megabytes per hour, which adds up to billions of rows per month. The agency analyzes each ad’s placement and determines click-through and conversion rates. When impression files and click files were stored in a relational database, the agency had no way to intelligently connect impressions to clicks. So they had to guess.
Now HDP replaces that guesswork with empirical science and confident analysis by week, by day or by hour. The agency can also filter by the consumer’s OS, browser, device and geographical location. With Hadoop’s economies of scale, data storage costs are significantly lower than before, and data can be retained for longer. So the agency and its customers all look forward to looking back on years (not weeks) of clickstream data.
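A minimal sketch of the kind of analysis described here: connect each impression to its click, then compute click-through rate per ad placement and hour. It assumes, hypothetically, that every click record carries the ID of the impression that produced it. The article does not say how the agency links the two files, and the field names and sample records below are invented for illustration.

```python
from collections import defaultdict
from datetime import datetime

def ctr_by_hour(impressions, clicks):
    """Return {(placement, hour): click-through rate}.

    Assumes each click carries the ID of the impression it came from.
    """
    clicked_ids = {c["impression_id"] for c in clicks}
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for imp in impressions:
        # Truncate the timestamp to the hour to form the grouping key.
        hour = datetime.fromisoformat(imp["ts"]).strftime("%Y-%m-%d %H:00")
        key = (imp["placement"], hour)
        shown[key] += 1
        if imp["id"] in clicked_ids:
            clicked[key] += 1
    return {key: clicked[key] / shown[key] for key in shown}

# Hypothetical impression and click records.
impressions = [
    {"id": "i1", "placement": "banner_top", "ts": "2014-03-01T09:05:00"},
    {"id": "i2", "placement": "banner_top", "ts": "2014-03-01T09:40:00"},
    {"id": "i3", "placement": "sidebar", "ts": "2014-03-01T09:41:00"},
    {"id": "i4", "placement": "banner_top", "ts": "2014-03-01T10:02:00"},
]
clicks = [{"impression_id": "i2"}, {"impression_id": "i4"}]

rates = ctr_by_hour(impressions, clicks)
for key, rate in sorted(rates.items()):
    print(key, rate)
```

Changing the strftime pattern to a date or an ISO week would give the by-day or by-week view, and adding fields such as OS or browser to the grouping key would support the filters mentioned above.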
The agency’s retail customers can now tell if consumers are clicking on their website while standing in one of their stores. This provides valuable insight to manage “showrooming” behavior where customers visit a store to touch a product and then drive home to buy it online. Retailers can address showrooming without slashing prices, and data in HDP reveals specific tactics for doing so.