The Hortonworks Blog

More from Ajay Singh

Sumeet Kumar Agrawal, principal product manager for the Big Data Edition product at Informatica, is our guest blogger. In this blog, Sumeet explains how Informatica’s Big Data Edition integrates with Tez and allows for significant performance gains.

Informatica Big Data Edition’s codeless visual development environment accelerates the ability of enterprises to take advantage of amazing innovations in big data to solve new challenges with skill sets that already exist within many organizations. Informatica natively integrates with big data platforms like Hadoop and NoSQL to enable next-generation big data solutions, including data warehouse optimization and 360 customer analytics.…

We at Hortonworks live by a few core principles:

  • Innovate at the core of Hadoop
  • Make Hadoop an Enterprise Class Data Platform
  • Do it all in open source
  • Enable the ecosystem

Our vision of “Hadoop Everywhere” is shared by our partner community, whose members bring their industry expertise, unique software value-add and passion for customer success to enable transformational change across our joint customers. We as a Hadoop community are succeeding every day in transforming enterprises into data-first organizations.…

Apache Hadoop has come a long way. From its early days as a platform to index the web, it has evolved to its current interactive, real-time, and batch processing capabilities spanning gigabytes to petabytes of content. A key stepping stone in this evolution has been Apache Hadoop YARN. YARN has enabled enterprises to onboard “fit for purpose” processing engines to their Hadoop Data Lake. This has opened the Data Lake to rapid and unbridled innovation by the ISV community and delivered differentiated insight to the enterprise.…

We hosted a webinar on YARN a couple of weeks ago (see the slides and playback here). As you might expect, there were a lot of great questions, and here is a set of answers to them.

Our next YARN-oriented Office Hours is online on Sept 11th at 2pm PST. Join us on Meetup!

Who is using YARN and what benefits have they received from it?

One great public example of production use of YARN is at Yahoo!.…

We continue to make strong headway towards the general availability of Hadoop 2.0. A release candidate for Hadoop 2.1.0-beta is currently under consideration by the Apache community. This critical milestone signifies both the outstanding progress being made by the community and, equally important, the stabilization of Hadoop 2.0 APIs.

A defining characteristic of Hadoop 2.0 is its next-generation resource management framework called YARN. YARN enables Hadoop to grow beyond its MapReduce origins to embrace multiple workloads spanning interactive queries, batch processing, streaming & more.…