The Hortonworks Blog

Posts categorized by: HDP
Customers’ Hadoop Journey

We’ve all had two weeks to reflect on Hadoop Summit 2014. One of the biggest differences that stood out at this year’s Summit (compared to Summit 2013) was the presence of large enterprise customers that use Apache Hadoop as an important part of their modern data architectures.

Hadoop has gone beyond its original Yahoo use case (indexing the web via a nightly batch MapReduce process) and into the mainstream of daily data processing and analytics, with real-time, online, interactive, and batch applications at many notable companies.…

Big Data In Healthcare

Electronic data is the heartbeat of a healthcare provider’s office. ZirMed is a Hortonworks customer and a leading provider of healthcare information management solutions. Healthcare providers, including physicians, hospitals and large health systems, use the company’s cloud-based revenue cycle management offerings to manage the complex process of billing and collecting revenue from patients and payers.

ZirMed’s Analytics solution aggregates healthcare data and makes it available to its customers, so they get a clearer view of their financial and operational performance.…

Apache YARN Ready Program

With the release of Apache Hadoop YARN in October of last year, organizations are moving from single-application Hadoop clusters to a versatile, integrated Hadoop 2 data platform hosting multiple applications — eliminating silos, maximizing resources and bringing true multi-workload capabilities to Hadoop.

Customers are telling us loud and clear: they want solutions that run on YARN because it enables them to run multiple workloads on the same common data pool.…

We recently hosted the fifth of our seven Discover HDP 2.1 webinars, entitled Apache Solr for Hadoop Search. Over 200 people attended the webinar, prompting an informative discourse.

The speakers gave an overview of Apache Solr and its features, followed by a practical demo of how to process, index, search, and visualize server log data.

Thanks to our presenters: Justin Sears (Hortonworks’ Product Marketing Manager), Rohit Bakhshi (Hortonworks’ senior product manager), and Paul Codding (Hortonworks’ Solution Engineer).…

Data Analytics Virtual Event

Hortonworks and Teradata have partnered to provide a clear path to Big Data Analytics via stable and reliable Hadoop for the enterprise. We are excited to support their upcoming Big Data Analytics virtual event, “Data Discovery in Action.” We will have experts standing by to answer questions and help ensure you have the right strategy in place for all of your big data.

At this event on July 2nd, you will learn more about how Teradata’s Unified Big Data Architecture™ provides a quick path to data discovery.…

We’re finally catching our breath after a phenomenal Hadoop Summit event last week in San Jose. Thank you to everyone who came to participate in the celebration of Hadoop advances and adoption: from the many organizations that shared with us the Hadoop journeys that fundamentally transformed their businesses, to those just getting started, to the huge ecosystem of vendors. It is amazing to be part of such a broad and deep community that is contributing to making the market for everyone.…

Enterprises are using Apache Hadoop powered by YARN as a Data Operating System to run multiple workloads and use cases, instead of using it as just a single-purpose cluster.

A multi-purpose, enterprise-wide data platform, often referred to as a data lake, gives rise to the need for a comprehensive approach to security across the Hadoop platform and its workloads. A few weeks ago, Hortonworks acquired XA Secure to further execute on our vision of bringing a holistic security framework to the Hadoop community, irrespective of the workload.…

Apache YARN, Apache Slider, and Docker

Join us June 19 at 6 pm at the Hilton Fort Worth, Texas for an educational workshop hosted by Hortonworks and Sendero Business Services. The topic is “The Key To Success is Consistently Making Good Decisions & The Key To Good Decisions is Good Information.” The speaker is Don Hilborn, Solutions Engineer at Hortonworks.

Don will introduce the paradigm of

  • Efficiency – double processing in Hadoop on the same hardware while providing predictable performance and quality of service; and
  • Resource sharing – providing a stable common set of shared resources across multiple, coordinated workloads in Hadoop.

This is the second in a series of blog posts exploring how to write data-driven applications in Java using the Cascading SDK. The posts in the series are:

  • WordCount
  • Log Parsing

    Historically, programming languages and software frameworks have evolved in a singular direction, with a singular purpose: to achieve simplicity, hide complexity, improve developer productivity, and make coding easier. In the process, they have fostered innovation to the degree we see, and benefit from, today.

    Is anyone among you “young” enough to admit to writing code in microcode and assembly language?…

    With the release of Apache Hadoop YARN in October of last year, organizations are moving from single-application Hadoop clusters to a versatile, integrated Hadoop 2 data platform hosting multiple applications, eliminating silos, maximizing resources and bringing true multi-workload capabilities to Hadoop. Many enterprises have adopted YARN as the architectural center of a set of integrated technologies and capabilities that form the blueprint for enterprise Hadoop.

    YARN Enabling the Ecosystem Technologies

    Hortonworks is making it easier to develop YARN applications through a number of technologies. …

    More and more solution providers are integrating with Hortonworks Data Platform to provide their customers with enterprise Hadoop.

    As part of our HDP 2.1 certification series, I would like to introduce Greg Benson, Chief Scientist at SnapLogic. In this blog, Greg provides some insights about the value of obtaining HDP 2.1 certification and the benefits of integration platform as a service (iPaaS). 

    SnapLogic provides a cloud-based service for performing a wide range of data and application integration tasks.…

    Apache Ambari has always given operators the ability to provision an Apache Hadoop cluster using an intuitive Cluster Install Wizard web interface, which guides the user through a series of steps:

    • confirming the list of hosts
    • assigning master, slave, and client components
    • configuring services, and
    • installing, starting and testing the cluster.

    With Ambari Blueprints, system administrators and dev-ops engineers can expedite the process of provisioning a cluster. Once defined, Blueprints can be re-used, which facilitates easy configuration and automation for each successive cluster creation.…

    Since the partnership between Hortonworks and Splunk and the release of Hunk last year, we have created some awesome assets (such as the Hunk sandbox tutorial and the 360-degree customer view webinar) that give Hadoop and Big Data enthusiasts hands-on training. You can find more details about our partnership and resources here: http://hortonworks.com/partner/splunk/

    As part of our HDP 2.1 certification series, I would like to introduce Brett Sheppard, Director of Product Marketing for Big Data at Splunk.…

    We recently hosted the fourth of our seven Discover HDP 2.1 webinars, entitled Apache Hadoop 2.4.0, HDFS and YARN. It was well attended and prompted an informative discussion. The speakers outlined the new features in YARN and HDFS in HDP 2.1, including:

    • HDFS Extended ACLs
    • HTTPS support for WebHDFS and for the Hadoop web UIs
    • HDFS Coordinated DataNode Caching
    • YARN Resource Manager High Availability
    • Application Monitoring through the YARN Timeline Server
    • Capacity Scheduler Preemption

    Many thanks to our presenters, Rohit Bakhshi (Hortonworks’ senior product manager), Vinod Kumar Vavilapalli (co-author of the YARN Book, PMC, Hadoop YARN Project Lead at Apache and Hortonworks), and Justin Sears (Hortonworks’ Product Marketing Manager).…

    Informatica is a Hortonworks Certified Technology Partner. This partnership makes it possible for organizations to use all the data internal and external to an enterprise to achieve the full predictive power that drives the success of modern data-driven businesses. 

    That is why we’re excited to have John Haddad, Senior Director at Informatica, as our guest blogger. In this blog, John explores the benefits of certification on HDP 2.1.

    When I was in high school, one of my best friends had a water ski boat we often took out on California lakes (what are friends for?).…