The Hortonworks Blog

This is a guest blog from Stefan Kupstaitis-Dunkler, Accenture Technology Solutions GmbH.

I’ve been working at Accenture for almost a year, and last month I was invited to attend the partner masterclass on HDP 2.3 Security. The classroom setting was a great forum for interactive discussions and a showcase of the security capabilities in the newest version of the Hortonworks Hadoop distribution, HDP 2.3.

Sean Roberts, Hortonworks Solution Engineer and Hadoop operations expert in EMEA, guided the attendees through a demonstration of what Hortonworks Data Platform does to integrate the security aspects into Hadoop.…

We are very pleased to announce that Hortonworks Data Platform (HDP) Version 2.3 is now generally available for download. HDP 2.3 brings numerous enhancements across all elements of the platform, spanning data access, security, and governance. This version delivers a compelling new user experience, making it easier than ever before to “do Hadoop” and deliver transformational business outcomes with Open Enterprise Hadoop.

As we announced at Hadoop Summit in San Jose, there are a number of significant innovations as part of this release including:

HDP 2.3 represents the very latest innovation from across the Hadoop ecosystem.…

Apache Hadoop has emerged as a critical data platform to deliver business insights hidden in big data. Because Hadoop is a relatively new technology, system administrators hold it to higher security standards. There are several reasons for this scrutiny:

  • The external ecosystem of data repositories and operational systems that feed Hadoop deployments is highly dynamic and can introduce new security threats on a regular basis.
  • A Hadoop deployment contains large volumes of diverse data stored over long periods of time.

Hadoop isn’t optional for today’s enterprises—that much is clear. But as companies race to get control over the significantly growing volumes of unstructured data in their organizations, they’ve been less certain about the right way to put Hadoop to work in their environment.

We’ve already seen a variety of wrong approaches with proprietary extensions that limit innovation, fragment architectures and trade openness for vendor lock-in. Now a new consensus is forming around an emerging category that drives truly transformational outcomes: Open Enterprise Hadoop.…

Over the past two quarters, Hortonworks has been able to attract over 200 new customers. We are attempting to feed the hunger our customers have shown for Hadoop over the past two years. We are seeing truly transformational business outcomes delivered through the use of Hadoop across all industries. The most prominent use cases are focused on:

  • Data Architecture Optimization – keeping 100% of the data at up to 1/100th of the cost while enriching traditional data warehouse analytics
  • A Single View of customers, products, and supply chains
  • Predictive Analytics – delivering behavioral insight, preventative maintenance, and resource optimization
  • Data Discovery – exploring datasets, uncovering new findings, and operationalizing insights

What we have consistently heard from our customers and partners, as they adopt Hadoop, is that they would like Hortonworks to focus our engineering activities on three key themes: Ease of Use, Enterprise Readiness, and Simplification.…

SQL is the most popular use case for the Hadoop user community, and Apache Hive is still the de facto standard. Earlier this week, the Apache Hive community released Apache Hive 1.2.0.

With this third release of the year, the Hive developer community continues to improve Hive and grow its team, with 11 Hive contributors promoted to committers in the last three months. Dedicated to making Hive enterprise-ready, the community has made improvements in the following areas:

  • Additional SQL functionality
  • Security enhancements
  • Performance gains
  • Stability and usability
  • For the complete list of features, improvements, and bug fixes, see the release notes.…

    In this guest blog, Sumeet Kumar Agrawal, principal product manager for the Big Data Edition product at Informatica, explains how Informatica’s Big Data Edition integrates with Hortonworks’ security projects, and how you can secure your big data projects.

    Many companies already use big data technology like Hadoop for their production environments, so they can store and analyze petabytes of data including transactional data, weblog data, and social media content to gain better insights about their customers and business.…

    Historically, the strength of a platform lies in the abilities of developers to learn, try, and build against the platform APIs and capabilities. As Apache Hadoop matures as a platform, it’s the creativity and efforts of the developer community that are driving the innovation that makes Hadoop a vibrant and impactful foundation of a modern data architecture.

    A successful developer community leads to a successful platform, and at Hortonworks we are committed to reducing the friction to speed up the success of our customers.…

    With Apache Hadoop YARN as its architectural center, Apache Hadoop continues to attract new engines to run within the data platform, as organizations want to efficiently store their data in a single repository and interact with it in different ways. As YARN propels Hadoop’s emergence as a business-critical data platform, the enterprise requires more stringent data security capabilities. The Apache Knox Gateway (“Knox”) provides HTTP-based access to resources of the Hadoop ecosystem so that enterprises can confidently extend Hadoop access to more users, while maintaining compliance with enterprise security policies.…
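To make the idea of HTTP-based access through a gateway more concrete, here is a minimal sketch of how a client might build a WebHDFS URL that goes through Knox instead of talking to the cluster directly. The host name, the port `8443`, the `gateway` context path, and the `default` topology name are illustrative assumptions, not values taken from this post. The helper `knox_webhdfs_url` is hypothetical and is not part of any Knox client library.

```python
from urllib.parse import quote, urlencode


def knox_webhdfs_url(gateway_host, hdfs_path, op="LISTSTATUS",
                     port=8443, gateway_path="gateway", topology="default"):
    """Build a WebHDFS URL routed through a Knox gateway.

    Hypothetical helper: port, gateway_path and topology are assumed
    values and depend on how the gateway is actually configured.
    """
    # Drop the leading slash so the path joins cleanly, and escape
    # characters that are not URL-safe (slashes are kept as-is).
    path = quote(hdfs_path.lstrip("/"))
    query = urlencode({"op": op})
    return (f"https://{gateway_host}:{port}/{gateway_path}/{topology}"
            f"/webhdfs/v1/{path}?{query}")


if __name__ == "__main__":
    # knox.example.com is a placeholder host name.
    print(knox_webhdfs_url("knox.example.com", "/tmp/data"))
```

The point of the sketch is that the client only ever sees the gateway address; the internal cluster hosts stay hidden behind it, and authentication and policy checks can be applied at that single entry point.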

    Hortonworks Data Platform (HDP) provides centralized enterprise services for comprehensive security, enabling end-to-end protection, access, compliance and auditing of data in motion and at rest. HDP’s centralized architecture, with Apache Hadoop YARN at its core, also supports consistent operations for provisioning, management, monitoring and deployment of Hadoop clusters for a reliable enterprise-ready data lake.

    But comprehensive security and consistent operations go together, and neither is possible in isolation.

    We published two blogs recently announcing Ambari 2.0 and its new ability to manage rolling upgrades.…

    Advances in Hadoop security, governance and operations have accelerated adoption of the platform by enterprises everywhere. Apache Ambari is the open source operational platform for provisioning, managing and monitoring Hadoop clusters from a single pane of glass, and with the Apache Ambari 1.7.0 release last year, Ambari made it far easier for enterprises to adopt Hadoop.

    Today, we are excited to announce the community release of Apache Ambari 2.0, which will further accelerate enterprise Hadoop usage by simplifying the technical challenges that slow adoption the most.…

    As we are finalizing our preparations for what will surely be another successful Hadoop Summit Europe event, one thing has become unequivocally clear: the Hadoop challenge is no longer about acceptance. It’s no longer about adoption. It’s about Hadoop being pervasive. Hadoop is everywhere.

    As Mike Gualtieri of Forrester wrote in a recent report:

    “Hadoop is a must-have for large enterprises”

    I couldn’t agree more with Mike’s assessment, and I encourage you to read the report: “Predictions 2015: Hadoop Will Become a Cornerstone of Your Business Technology Agenda”.…

    At the beginning of February, HP announced their intent to acquire Voltage Security to expand data encryption security solutions for Cloud and Big Data. Today, both companies share their thoughts about the acquisition. Carole Murphy, Director Product Marketing at Voltage Security, and Albert Biketi, Vice President and General Manager at HP Atalla, tell us more about how HP extends the capabilities of every product in the Voltage portfolio, including Voltage’s leadership in securing Hadoop data with data-centric, standards-based technologies.…

    Forrester recently called Apache Hadoop adoption “mandatory” for the enterprise. For most organizations, moving forward with Hadoop is no longer a question of if, but when. Hadoop-powered insight into big data is enabling market disruption in every industry, and the market winners are those who handle that data most effectively and at the lowest cost.

    As with any new platform, making decisions on how best to implement and for what purpose can be challenging.…

    Since our founding in 2011, Hortonworks has had a fundamental belief: the only way to deliver infrastructure platform technology is completely in open source. Moreover, we believe that collaborative open source software development under the governance model of an entity like the Apache Software Foundation (ASF) is the best way to accelerate innovation that targets enterprise end users. It brings the largest number of developers together, enables innovation to happen far faster than any single vendor could achieve, and does so in a way that is free of friction for the enterprise.…