The Hortonworks Blog

Customers’ Hadoop Journey

We’ve all had two weeks to reflect on Hadoop Summit 2014. One of the biggest differences that stood out in this year’s Summit (as compared to Summit 2013) was the presence of large enterprise customers that are using Apache Hadoop as an important part of their modern data architectures.

Hadoop has gone beyond its original Yahoo use case (indexing the web via a nightly batch MapReduce process) and into the mainstream of daily data processing and analytics with real-time, online, interactive, and batch applications at many notable companies.…

Big Data In Healthcare

Electronic data is the heartbeat in a healthcare provider’s office. ZirMed is a Hortonworks customer and a leading provider of healthcare information management solutions. Healthcare providers, including physicians, hospitals and large health systems, use the company’s cloud-based revenue cycle management offerings to manage the complex process of billing and collecting revenue from patients and payers.

ZirMed’s Analytics solution aggregates healthcare data and makes it available to its customers, so they get a clearer view of their financial and operational performance.…

Data Analytics Virtual Event

Hortonworks and Teradata have partnered to provide a clear path to Big Data Analytics via stable and reliable Hadoop for the enterprise. We are excited to support their upcoming Big Data Analytics virtual event, “Data Discovery in Action.” We will have experts standing by to answer questions and help ensure you have the right strategy in place for all of your big data.

At this event on July 2nd, you will learn more about how Teradata’s Unified Big Data Architecture™ provides a quick path to data discovery.…

We’re finally catching our breath after a phenomenal Hadoop Summit event last week in San Jose. Thank you to everyone who came to participate in the celebration of Hadoop advances and adoption, from the many organizations that shared with us the Hadoop journeys that fundamentally transformed their businesses, to those just getting started, to the huge ecosystem of vendors. It is amazing to be part of such a broad and deep community that is contributing to making the market for everyone.…

Apache YARN, Apache Slider, and Docker

Join us June 19 at 6 pm at the Hilton Fort Worth, Texas for an educational workshop hosted by Hortonworks and Sendero Business Services. The topic is “The Key To Success is Consistently Making Good Decisions & The Key To Good Decisions is Good Information.” The speaker is Don Hilborn, Solutions Engineer at Hortonworks.

Don will introduce the paradigm of

  • Efficiency – double processing in Hadoop on the same hardware while providing predictable performance and quality of service; and
  • Resource sharing – providing a stable common set of shared resources across multiple, coordinated workloads in Hadoop.

More and more solution providers are integrating with Hortonworks Data Platform to provide their customers with enterprise Hadoop.

As part of our HDP 2.1 certification series, I would like to introduce Greg Benson, Chief Scientist at SnapLogic. In this blog, Greg provides some insights about the value of obtaining HDP 2.1 certification and the benefits of integration platform as a service (iPaaS). 

SnapLogic provides a cloud-based service for performing a wide range of data and application integration tasks.…

Informatica is a Hortonworks Certified Technology Partner. This partnership makes it possible for organizations to use all the data internal and external to an enterprise to achieve the full predictive power that drives the success of modern data-driven businesses. 

That is why we’re excited to have John Haddad, Senior Director at Informatica, as our guest blogger. In this blog, John explores the benefits of certification on HDP 2.1.

When I was in high school, one of my best friends had a water ski boat we often took out on California lakes (what are friends for?).…

We are less than a week away from the start of the seventh annual Hadoop Summit San Jose. With all of the final preparations underway, we wish to highlight some of the not-to-be-missed activities in and around the event. The event is filling fast, but you can still register here.

Here are a few things you don’t want to miss!

  • Great track content: there is more content than ever, with more than 120 informative sessions on Apache Hadoop and related technologies to choose from, as always selected by the community and delivered by the experts themselves.

Trifacta, a Hortonworks Technology Partner and a pioneer in data transformation, was recently certified with HDP 2.1. Here, Trifacta’s CTO and co-founder, Sean Kandel, talks about their Predictive Interaction™ solution with Hortonworks Data Platform.

“I spend more than half my time integrating, cleansing and transforming data without doing any actual analysis. Most of the time I’m lucky if I get to do any analysis.” – Data Scientist [1]

The most commonly reported use of Hadoop today is data transformation. …

The Apache Ambari community is happy to announce last week’s release of Apache Ambari 1.6.0, which includes exciting new capabilities and resolves 288 JIRA issues.

Many thanks to all of the contributors in the Apache Ambari community for the collaboration to deliver 1.6.0, especially with Blueprints, a crucial feature that enables rapid instantiation and replication of clusters.

Each release of Ambari makes substantial strides in providing functionality that simplifies the lives of the system administrators and DevOps engineers who deploy, manage, and monitor large Hadoop clusters, including those running on Hortonworks Data Platform (HDP) 2.1.…

Customers want to make more rapid, data-driven decisions, but historically this has been challenging in the era of Big Data. Predictive analytics, machine learning and statistical algorithms are at the leading edge of where enterprises can unlock the value hidden in their data to deliver timely insights for intelligent decisions.

Zementis is a new Hortonworks Technology Partner offering a standards-based predictive analytics scoring engine for Hortonworks Data Platform (HDP) and existing data repositories as part of the Modern Data Architecture (MDA).…

In this blog, Paul Phillips, EMEA Sales Director at Hortonworks, discusses the importance of extending big data science courses to PhD students and scientists. This joint venture with KPMG provides an opportunity to “bring excellent basic skills that are useful in data science and this programme aims to commercialize these skills and ease the path to a data science profession.”

At Hortonworks, we encourage our team members to innovate, and as the open source community grows, it is also vital that we play our part to ensure the community is continually reinvigorated with new ideas and innovation. …

According to the New York Observer, there were a couple of major social reasons that spurred the genesis and growth of Meetup.com. The first was Robert Putnam’s book Bowling Alone, in which he talks about the collapse of communities in America. The second was an event that changed not only the world but also New York: the aftermath of September 11, when strangers cared about greeting, meeting, and talking.…

MongoDB is an open-source NoSQL database, used by companies of all sizes, across all industries and for a wide variety of applications. MongoDB (the company) is a Hortonworks Certified Technology Partner.

Sheena Badani, Director of Business Development at MongoDB, talks about the value of obtaining HDP 2.1 certification.

MongoDB is thrilled to announce the certification of the MongoDB Hadoop Connector on Hortonworks’ latest release, HDP 2.1. Customers now have validation from both MongoDB, Inc.…

Hurry, time is running out to join Mike Ferguson, independent analyst and thought leader in Business Analytics, Big Data, Data Management and Smart Business, as he explores how the growing business demand to analyze new sources of data is affecting traditional architectures and how these architectures need to change to accommodate big data analytical workloads.

In this brief session, Mike looks at new analytical use cases, the types of data that need to be analyzed, and the role of Hadoop in a modern analytical environment.…