The Hortonworks Blog

We recently hosted a webinar on the newest features of Hortonworks DataFlow 2.0, highlighting: the new user interface, new processors in Apache NiFi, Apache NiFi multi-tenancy, the Apache NiFi zero-master clustering architecture, and Apache MiNiFi. One of the first things you may have noticed in Hortonworks DataFlow 2.0 is the new user interface based on Apache […]

The 100% open source and community-driven innovation of Apache Hive 2.0 and LLAP (Live Long and Process) truly brings agile analytics to the next level. It enables customers to perform sub-second interactive queries without the need for additional SQL-based analytical tools, enabling rapid analytical iterations and providing significant time-to-value. Read about […]

Apache Hive™ is the most complete SQL on Hadoop system, supporting comprehensive SQL, a sophisticated cost-based optimizer, ACID transactions and fine-grained dynamic security. Though Hive has proven itself on multi-petabyte datasets spanning thousands of nodes, many interesting use cases demand more interactive performance on smaller datasets, requiring a shift to in-memory. Hive 2 marks the […]

Hortonworks Empowers Organizations to Maximize the Outcome of their Big Data Initiatives through improvements in security, governance, and operations. We are very pleased to announce that Hortonworks Data Platform (HDP) Version 2.5 is now generally available for download. As part of an Open and Connected Data Platforms offering from Hortonworks, HDP 2.5 brings a variety of […]

It has been another exciting week on Hortonworks Community Connection (HCC). We continue to see great activity and recommend the following assets from last week. Top Articles from HCC: Implementing a real-time Hive Streaming example, by: mjohnson. The Hive Streaming API enables near real-time data ingestion into Hive. This two-part posting reviews some of […]

It has been another exciting week on Hortonworks Community Connection (HCC). We continue to see great activity and recommend the following assets from last week. Top Articles from HCC: HDF installation on EC2, by: mpandit. Hortonworks DataFlow (HDF), powered by Apache NiFi, Kafka and Storm, collects, curates, analyzes and delivers real-time data from the IoAT to […]

Apache Hive 2.1 was released about a month ago, and it’s a great opportunity to review how Hive 2 is drastically changing the landscape for SQL on Hadoop. There is so much new in Hive that it’s hard to pick highlights, but here are a few: Interactive query with Hive LLAP. LLAP was introduced in Hive […]

The first decade is over and we’re entering the second. One industry watcher makes a great point: Awkward teenage years ahead? I don’t believe we’ll be one of those ‘difficult’ teenagers. We might be a bit of a nerd, but we’ll be the well-balanced one. The one with friends, the one that goes to […]

The most significant new feature in Apache Hive 2, to be included in the upcoming HDP 2.5 release, is a technical preview of LLAP (Live Long and Process). LLAP enables SQL analytics on Hadoop with response times as fast as sub-second by intelligently caching data in memory with persistent servers that instantly process SQL queries. Since LLAP is […]

This blog has been co-authored by Vinay Shukla (Hortonworks), Moon So Lee (Apache Zeppelin PMC & NFLabs), and Prabhjyot Singh (Apache Zeppelin PMC & Hortonworks). Recently the Apache Software Foundation (ASF) announced Apache Zeppelin as a top-level project. This was a great milestone for both the Zeppelin and data science communities. Since its incubation in […]

The world’s top authorities on Apache Hadoop convene at Hadoop Summit San Jose, and one of the top questions to be answered concerns the future and direction of Hadoop. Sanjay Radia, Founder and Architect at Hortonworks, led the track, which selected 13 sessions around this topic. I asked Sanjay what he hoped would […]

Hadoop Summit San Jose is just around the corner. I am amazed at the depth and breadth of the technical sessions and was looking at the Application Development track. Application Development: YARN has transformed Hadoop into a multi-tenant data platform. It is the foundation for a wide range of processing engines that empower businesses to […]

Apache Hadoop® exists within a broader ecosystem of enterprise analytical packages. This includes ETL tools, ERP and CRM systems, enterprise data warehouses, data marts and others. Modern workloads flow from these various traditional analytical sources into Hadoop and then often back out again. What dataset came from which system, when and how did it change over […]

Another busy week on Hortonworks Community Connection; here is the hot content for this week (based on community activity and votes). Top 3 articles this week (or see the whole list here): Map Hive jobs to YARN queues. Using a Hive hook to map jobs to YARN queues when using hive.server2.enable.doAs = false (security best practices and […]

Another busy week on Hortonworks Community Connection; here is the hot content for this week (based on community activity and votes). Top 3 articles this week (or see the whole list here): Quickly enable SSL encryption for Hadoop components in HDP Sandbox. Nice collection of scripts and advice on using Sandbox with SSL. Hive on Tez Performance Tuning […]