Our guest blogger is Bob Taylor, Alliances Director at Concurrent, a Hortonworks Technology Partner. In this blog, Bob describes three factors that contributed to HomeAway's success with its big data initiative, factors that apply to any such project. HomeAway is a customer of both Hortonworks and Concurrent.
HomeAway is a great example of an organization that has found value in its Big Data investment because of three factors. One HomeAway initiative gathers customer preference data from dozens of websites and uses it to refine marketing and, in turn, increase bookings. Learn more about their success, in their own words, in an upcoming webinar on Nov. 10. Concurrent, Hortonworks and HomeAway will be available to answer questions at the end of the event.
For many organizations, the effort starts with ETL offloading projects because of the obvious cost savings achieved by moving expensive, processing-intensive ETL jobs to Hadoop. However, success comes only when a critical mass of processes has been moved, since those processes tend to be business critical. Other organizations use Hadoop to innovate and build new revenue-generating (or revenue-protecting) applications, including fraud prevention, customer analytics for targeted marketing (i.e. preferences, sentiment, etc.), and search and recommendation engines. Regardless of which type of initiative you are implementing, three key factors can help ensure success:
1. The initiative is deemed business critical by senior leadership
- With cost savings initiatives, ensure that the first set of processes moved is both critical to the business (for analytics, reporting, etc.) and resource intensive. With innovation initiatives, securing senior leadership sponsorship is a little easier, but ROI needs to be proven faster. Most organizations pursue a mix of cost savings and innovation initiatives to sustain senior leadership sponsorship, and they expect an average of 1-2 years to achieve ROI.
2. The development framework(s) and platform will leverage the skills of existing staff
- Most industry reports still list a shortage of skilled resources as a barrier to adoption. A lot of Pig, Hive and MapReduce jobs are still being developed directly in Hadoop, and this does require new skills. Organizations that achieved ROI quickly have largely minimized the number of applications they develop directly in Hadoop and have moved to some sort of abstraction layer beyond Hive and Pig.
- You can use GUI-based tools like Informatica or SnapLogic for ETL processes, but they are limited beyond that use case. Cascading is a Java-based open source API framework. Scalding, an open source library contributed by Twitter, supports Scala-based development on top of Cascading. More recently, of course, there is Apache Spark. These are just examples; there are more.
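The appeal is similar across these frameworks: a multi-stage, hand-written MapReduce job collapses into a few lines of pipeline code. As a rough, framework-neutral illustration (this is plain JDK code over an in-memory collection, not Cascading or Spark API, but the tokenize → group → count shape is the same):

```java
import java.util.Arrays;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class WordCountSketch {
    // Pipeline-style word count: the same shape as a Cascading/Scalding/Spark
    // flow (tokenize -> group by word -> count), expressed over a local
    // collection instead of a distributed data set.
    static Map<String, Long> wordCount(String text) {
        return Arrays.stream(text.toLowerCase().split("\\W+"))
                .filter(w -> !w.isEmpty())
                .collect(Collectors.groupingBy(Function.identity(),
                                               Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(wordCount("to be or not to be")); // to=2, be=2, or=1, not=1
    }
}
```

The point is not the word count itself but the development model: a Java or Scala developer writes against a familiar pipeline API, and the framework translates it into Hadoop jobs, so the team's existing skills carry over.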
3. There is a high level of operational transparency
- Many Big Data projects start as experiments within one or more business teams. As you take your Hadoop infrastructure from experiment to production, make sure it offers operational transparency and is integrated into existing operational support systems. This gives business teams confidence that the environment is production ready and that their data will be delivered within service levels.
- To do this, give operations and business teams visibility into application performance, not just cluster performance. This tells you who is using resources and what is consuming them. Without application-level performance visibility, it is difficult to maintain reliable service levels at scale.
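At its simplest, application-level visibility means tagging every job run with the application that owns it and aggregating by that tag. The following is a hypothetical sketch in plain Java (the class and method names are illustrative, not from any monitoring product):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: aggregate per-application runtime so operations can
// see which applications (not just which cluster nodes) consume resources.
public class AppMetrics {
    private final Map<String, Long> totalMillisByApp = new HashMap<>();

    // Record one job run, tagged with the owning application's name.
    public void record(String appName, long elapsedMillis) {
        totalMillisByApp.merge(appName, elapsedMillis, Long::sum);
    }

    // The application with the largest cumulative runtime -- a first answer
    // to "who is using the cluster and what is consuming it".
    public String topConsumer() {
        return totalMillisByApp.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElse(null);
    }
}
```

In practice a monitoring product would track far more than elapsed time (slots, memory, I/O, per-stage timings), but the design choice is the same: attribute every measurement to an application, so service-level questions can be answered at the level the business cares about.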
HomeAway has found value in its Big Data investment because of these three factors. To hear more about HomeAway’s big data projects, join us on Nov. 10th at 11:00 AM PT for a webinar with Rene, Austin, Michael and Francois from the HomeAway team. Learn how they successfully implemented shared Hadoop services with HDP, are rapidly on-boarding new developers with a short learning curve, and have achieved operational excellence. Register here.