The Hortonworks Blog

The Apache Knox Gateway team is pleased to announce Knox’s first release as an Apache top-level project: Apache Knox Gateway 0.4.0. The team resolved approximately 100 JIRAs for this release and Knox Gateway is now better positioned to provide complete security for REST API access to a Hadoop cluster.

The new features in Knox Gateway 0.4.0 are the features that enterprise security officers expect in a gateway solution:

  • Perimeter security for a Hadoop cluster
  • Support for enterprise group lookup
  • Audit log of all gateway activity
  • Command line tooling for CMF provisioning
  • Protection against web application vulnerabilities
  • Pre-authentication via SSO token
  • And many more…

As a top-level project, Apache Knox Gateway is fully endorsed by the Apache Software Foundation, and this improves coordination between development of Knox and the other core Hadoop projects with which it interacts.…

Yesterday the Apache Ambari community proudly released version 1.5.1. This is the result of constant, concerted collaboration among the Ambari project’s many members. This release represents the work of over 30 individuals over 5 months and, combined with the Ambari 1.5.0 release, resolves more than 1,000 JIRAs.

This version of Ambari makes huge strides in simplifying the deployment, management and monitoring of large Hadoop clusters, including those running Hortonworks Data Platform 2.1.…

Three weeks ago, we announced availability of the technical preview of Hortonworks Data Platform (HDP) version 2.1, and since then we have had thousands of downloads of this preview. We also promised delivery of GA bits on April 22nd, and we are delighted to deliver as stated. HDP 2.1, which includes countless new features across seven new components, is available today from our download page.

YARN unlocks the Data Lake

YARN, the resource management layer of Hadoop 2, is delivering value as it has unlocked the data lake vision for many.…

The Apache Hive community has voted on and released version 0.13 today. This is a significant release that represents a major effort from over 70 members who worked diligently to close out over 1080 JIRA tickets.

Hive 0.13 also delivers the third and final phase of the Stinger Initiative, a broad community-based initiative to drive the future of Apache Hive, delivering 100x performance improvements at petabyte scale with familiar SQL semantics.…

The power of a well-crafted speech is indisputable, for words matter: they inspire us to act. So too is the power of a well-designed Software Development Kit (SDK), for high-level abstractions and logical constructs in a programming language matter: they simplify writing code.

In 2007, when Chris Wensel, the author of the Cascading Java API, was evaluating Hadoop, he had a couple of prescient insights. First, he observed that finding Java developers to write Enterprise Big Data applications in MapReduce would be difficult and that convincing developers to write directly to the MapReduce API was a potential blocker.…

As enterprises build new applications with the data they cost-effectively capture and process with Apache Hadoop, it is important for the platform to facilitate the application development process. That’s why we are excited to announce that we’ve expanded our partnership with Concurrent, Inc. to simplify and accelerate application development on Hadoop.

There are two components to this expanded partnership.

The Internet of Things (IoT) is in its infancy. You can buy wireless bathroom scales that upload data to monitoring tools to help you manage your weight. You can buy a connected refrigerator that keeps track of the inventory to remind you what you need to buy. It’s fascinating to think about future possibilities. In a recent podcast on the SAP Future of Business with Game-Changers Radio, panelist Matt Healey (Analyst at Technology Business Research) commented that he wasn’t ready for the day when his scale and refrigerator talked.…

LOOK Innovative is a new consulting partner of Hortonworks specializing in business applications of Hadoop for the retail vertical market.

LOOK Innovative concentrates on delivering the complete Omni-Channel digital experience to retailers, which is the evolution of multi-channel retailing. Omni-Channel is a seamless approach for the consumer through all available shopping channels, including mobile internet devices, computers, bricks-and-mortar, television, radio, direct mail, catalog and so on. It means that consumers make buying decisions based on information from many sources and may purchase through any of those sources – they might research online but buy at the local store and may research at the store but buy online.…

The third HBaseCon, THE community event for Apache HBase, is happening on May 5th this year in San Francisco. As in previous years, this year's agenda is quite exciting.

There will be four tracks: Operations, Features and Internals, Ecosystem, and Case Studies. The keynotes will include speakers from Cloudera, the event host; the Google BigTable team, as a follow-up to their ‘06 BigTable paper; Salesforce, on their experience with HBase operations and use cases; and Facebook, on their strongly consistent multi-data-center replication scheme.…

As the Red Hat Summit shifts to the west coast in San Francisco this year, Hortonworks and Red Hat will be demonstrating the progress of our engineering efforts. Our engineers have been hard at work in the factories and in the communities, deeply integrating our open source offerings to create a comprehensive platform for new analytic applications. As a reminder, in February Red Hat and Hortonworks announced a comprehensive open source initiative to deliver infrastructure solutions that bring 100-percent open source Hadoop to the hybrid cloud.…

It gives me great pleasure to announce that the Apache Hadoop community has voted to release Apache Hadoop 2.4.0! Thank you to every single one of the contributors, reviewers and testers!

The community fixed 411 JIRAs for 2.4.0 (on top of the 511 JIRAs resolved for 2.3.0). Of the 411 fixes:

  • 50 are in Hadoop Common,
  • 171 are in HDFS,
  • 160 are in YARN and
  • 30 went into MapReduce

Hadoop 2.4.0 is the second Hadoop release in 2014, following Hadoop 2.3.0’s February release and its key enhancements to HDFS such as Support for Heterogeneous Storage and In-Memory Cache.…

One of the key concerns in the financial industry today is the alarming increase in fraudulent activities. It is estimated that over $12 billion is spent on fraud detection and prevention and that number is projected to increase significantly over the next few years. Customer data gets compromised and this leads to a decreased level of customer satisfaction and retention, which results in revenue declines for financial organizations.

Join Hortonworks, Skytree and Forrester Research for a Webinar on April 15, 8am PST/11am EST

As financial institutions continue to embrace the adoption of big data infrastructures like the Hortonworks Data Platform based on Hadoop, there is a wealth of information collected that can help with more sophisticated fraud detection. …

We are excited to announce that the Apache™ Tez community voted to release version 0.4 of the software.

Apache Tez is an alternative to MapReduce that provides a powerful framework for executing a complex topology of tasks for data access in Hadoop. Version 0.4 incorporates the feedback from extensive testing of Tez 0.3, released just last month.

This release is especially meaningful because it coincides with completion of the Stinger Initiative (a collaborative community effort involving 145 developers across 44 companies) and the upcoming release of Apache Hive 0.13.…

Securing any system requires you to implement layers of protection. Access Control Lists (ACLs) are typically applied to data to restrict access to approved entities. Applying ACLs at every layer of data access is critical to securing a system. The layers for Hadoop are depicted in this diagram, and in this post we will cover the lowest level of access… ACLs for HDFS.

This is part of the HDFS Developer Trail series.  …

Yesterday our partner Teradata announced a new capability called Teradata QueryGrid that further deepens the integration between the Teradata Data Warehouse and the Hortonworks Data Platform. This announcement is important because it delivers on the promise and the value of the Modern Data Architecture by demonstrating how the two technologies complement each other for the enterprise.

Teradata pioneered deeper integration with Apache Hadoop through integration with HCatalog, initially with Aster SQL-H and then with the Data Warehouse, and now they have taken it to the next level with Teradata QueryGrid.…
