Viewing posts by: Carter Shanklin

This is part 1 of a 2-part series on how to update Hive tables the easy way. Historically, keeping data up to date in Apache Hive required custom application development that is complex, non-performant and difficult to maintain. HDP 2.6 radically simplifies data maintenance with the introduction of SQL MERGE in Hive, complementing existing INSERT, UPDATE […]

Hive/Druid integration means Druid is BI-ready from your tool of choice. This is part 3 of a three-part series on ultra-fast OLAP analytics with Apache Hive and Druid. Connect Tableau to Druid: Previously we talked about how the Hive/Druid integration delivers screaming-fast analytics, but there is another, even more powerful benefit to […]

This is part 2 of a three-part series on ultra-fast OLAP analytics with Apache Hive and Druid. Modern corporations are increasingly looking for near-real-time analytics and insights to make actionable decisions. To help organizations understand more about the benefits of Apache Hive and Druid, we will focus on how you can […]

This is part 1 of a three-part series on ultra-fast OLAP analytics with Apache Hive and Druid. Unlock Sub-Second SQL Analytics over Terabytes of Data with Hive and Druid: Modern corporations are increasingly looking for near-real-time analytics and insights to make actionable decisions. To fuel this, this blog introduces ultra-fast […]

Hive View 2.0 is new in Apache Ambari 2.5. Ambari’s Hive View gives analysts and DBAs a convenient web interface to Apache Hive that allows SQL analytics, data management and performance diagnostics. Ambari 2.5 introduces Hive View 2.0 with a brand-new user experience plus a slew of great new tools to help DBAs run […]

HDP 2.6 takes a huge step forward toward true data management by introducing SQL-standard ACID Merge to Apache Hive. As scalable as Apache Hadoop is, many workloads don’t work well in the Hadoop environment because they need frequent or unpredictable updates. Updates using hand-written Apache Hive or Apache Spark jobs are extremely complex. Not only […]

Now Generally Available in HDP 2.6: Hive LLAP (Low Latency Analytical Processing) is Hive’s new architecture that delivers MPP performance at Hadoop scale through a combination of optimized in-memory caching and persistent query executors that scale elastically within YARN clusters. Hive LLAP: MPP Performance at Hadoop Scale. Since Hive LLAP was introduced as […]

The 100% open source, community-driven innovation of Apache Hive 2.0 and LLAP (Live Long and Process) truly brings agile analytics to the next level. It enables customers to perform sub-second interactive queries without the need for additional SQL-based analytical tools, enabling rapid analytical iterations and providing significant time-to-value. Try Hive LLAP today: Read about […]

Apache Hive™ is the most complete SQL on Hadoop system, supporting comprehensive SQL, a sophisticated cost-based optimizer, ACID transactions and fine-grained dynamic security. Though Hive has proven itself on multi-petabyte datasets spanning thousands of nodes, many interesting use cases demand more interactive performance on smaller datasets, requiring a shift to in-memory. Hive 2 marks the […]

The need to address Business Continuity and Disaster Recovery (BCDR) concerns is well known to anyone who runs production systems. This blog introduces HBase’s new backup and restore capabilities, which give HBase the ability to perform full and incremental backups across clusters and into the cloud. When combined with real-time replication, this new incremental backup […]

The most significant new feature in Apache Hive 2, to be included in the upcoming HDP 2.5 release, is a technical preview of LLAP (Live Long and Process). LLAP enables SQL analytics on Hadoop with response times as fast as sub-second by intelligently caching data in memory with persistent servers that instantly process SQL queries. Since LLAP is […]

Are you heading to HBaseCon this year on May 24? This year HBaseCon just had too much great content to fit it all into one day, and thanks to the kind sponsorship of Salesforce we’re happy to announce that PhoenixCon, the first-ever Apache Phoenix user conference, will be held on the next day, May […]

Apache Ambari 2.0 User Views introduce two functional tools to help you understand and optimize your cluster resources to get the best performance in a multitenant Hadoop environment. Tez View: Understand and Optimize Jobs in Your Cluster. The Tez View gives you visibility into all the jobs on your cluster, allowing you to quickly identify […]

Summary: This blog covers how recent developments have made it easy to use ORCFile from Cascading or Apache Crunch, and how doing so can accelerate data processing more than 5x. Code samples are provided so that you can start integrating ORCFile into your Cascading or Crunch projects today. What are Cascading and Apache Crunch? Cascading […]

Introduced in 2008, Apache Hive has been the de facto SQL solution in Hadoop. By 2012, SQL had become a key battleground for Hadoop and many vendors started to publish benchmarks showing massive performance advantages their solutions had over Hive. Each of these vendors predicted that Hive would eventually be supplanted by the proprietary solution they […]