The Hortonworks Blog

Posts categorized by: Other

This is the first of two posts examining the use of Hive for interaction with HBase tables. The second post is here.

One of the things I’m frequently asked about is how to use HBase from Apache Hive. Not just how to do it, but what works, how well it works, and how to make good use of it. I’ve done a bit of research in this area, so hopefully this will be useful to someone besides myself.…

The Apache Knox community announced the release of the Apache Knox Gateway (Incubator) 0.3.0. We, at Hortonworks, are excited about this announcement.

The Apache Knox Gateway is a REST API Gateway for Hadoop with a focus on enterprise security integration.  It provides a simple and extensible model for securing access to Hadoop core and ecosystem REST APIs.

Apache Knox provides pluggable authentication to LDAP and trusted identity providers as well as service level authorization and more.  …

It’s been a huge couple of weeks for us at Hortonworks HQ. We’ve talked about the GA of Hadoop 2, the subsequent release of Hortonworks Data Platform 2.0, and a little of the future with Apache Storm. We’ve been staggered by the support, goodwill and enthusiasm we’ve seen from you all.

We hope you’re as excited about Hadoop as we are, and we wanted to say thanks to our amazing team, amazing customers, amazing partners and the most amazing community for doing Hadoop with us – THANK YOU.…

Typical delivery of enterprise software involves a very controlled date with a secret roadmap designed to wow prospects, customers, press and analysts…or at least that is the way it usually works.  Open source, however, changes this equation.

As described here, the vision for extending Hadoop beyond its batch-only roots in support of interactive and real-time workloads was set by Arun Murthy back in 2008. The initiation of YARN, the key technology for enabling this vision, started in earnest in 2011, was declared GA by the community in the recent Apache Hadoop 2.2 release, and is now delivered for mainstream enterprises and the broader commercial ecosystem with the release of Hortonworks Data Platform 2.0.…

Today we are proud to announce the delivery of Apache Ambari 1.4.1. Ambari 1.4.1 combines many months of work in the community advancing the Ambari codebase. Over 760 JIRAs have been resolved since the Ambari 1.2.5 release. We would like to thank the nearly 40 engineers who contributed to help make this release possible.

Hello Hadoop 2, Meet Apache Ambari

The most important addition to Ambari 1.4.1 is support for installing, managing and monitoring a cluster based on the Hadoop 2 stack.…

The Hortonworks HBase team is excited to see HBase 96 released.  It represents a broad community effort and massive amount of work that has been building for more than a year.

HBase 96 closes out over 2000 issues (2134 Jira tickets to be exact) and represents the collective work of a VERY active community. Kudos to everyone involved! As the authors of a recent Apache blog alluded to, the HBase community is very healthy and includes developers from many companies including Hortonworks, Yahoo!, Cloudera, Salesforce, eBay, Intel, and Facebook, to name just a few.…

Today we announced the Analytics Advantage with Hadoop offering from SAS, Teradata and Hortonworks. The new offering leverages the capabilities for in-database data preparation, analytic model building and deployment and combines Teradata’s Appliance for SAS® High-Performance Analytics offering with the Teradata Appliance for Hadoop built on Hortonworks Data Platform. Using Teradata’s Unified Data Architecture (UDA), this high-speed integrated offering allows customers to discover, build and deploy analytic models across data stored in Teradata and Hadoop, promoting businesses’ ability to act upon analytic insights from any type of data across a seamless environment and faster than ever before.…

Stinger is not a product.  Stinger is a broad community-based initiative to bring interactive query at petabyte scale to Hadoop. And today, as representatives of this open, community-led effort, we are very proud to announce delivery of Apache Hive 0.12, which represents the critical second phase of this project!

Only five months in the making, Apache Hive 0.12 comprises over 420 closed JIRA tickets contributed by ten companies, with nearly 150 thousand lines of code! …

Security is one of the biggest topics in Hadoop right now. Historically Hadoop has been a back-end system accessed only by a few specialists, but the clear trend is for companies to put data from Hadoop clusters in the hands of analysts, marketers, product managers or call center employees whose numbers could be in the hundreds or thousands. Data security and privacy controls are necessary before this transformation can occur. HDP2, through the next release of Apache Hive, introduces a very important new security feature that allows you to encrypt the traffic that flows between Hadoop and popular analytics tools like MicroStrategy, Tableau, Excel and others.…

This post is the fourth in our series on the motivations, architecture and performance gains of Apache Tez for data processing in Hadoop. The series has the following posts:

The previous couple of blogs covered Tez concepts and APIs.…

Are you a Hadoop hot shot?  Are you the one everyone looks to for help on their Hadoop projects? Are you looking to showcase your talent to the world?

Then just maybe we have a great option for you. We recently published the Hortonworks Sandbox tutorials on GitHub. Now it’s your turn. We invite you to add your own Hadoop tutorials or to improve on the ones that we’ve published.…

We’ve been hosting a series of webinars focusing on how to make Apache Hadoop a viable enterprise platform that powers modern data architectures.

Implementing a modern data architecture with Hadoop means that Hadoop must integrate deeply with existing technologies, leverage existing skills and investments, and provide key services. This guest post from David Smith, Vice President of Marketing and Community at Revolution Analytics, shares his perspective on the role of Data Scientists in a Big Data world.…

Just a couple of weeks ago we published our simple SQL to Hive Cheat Sheet. It has proven immensely popular with folks looking to understand the basics of querying with Hive.  Our friends at Qubole were kind enough to work with us to extend and enhance the original cheat sheet with more advanced features of Hive: User Defined Functions (UDF). In this post, Gil Allouche of Qubole takes us from the basics of Hive through to getting started with more advanced uses, which we’ve compiled into another cheat sheet you can download here.…

He loves me, he loves me not… using daisies to figure out someone’s feelings is so last century. A much better way to determine whether someone likes you, your product or your company is to do some analysis on Twitter feeds to get better data on what the public is saying. But how do you take thousands of tweets and process them?  Our video, Understand your customers’ sentiments with Social Media Data, shows how you can capture a Twitter stream to do Sentiment Analysis.…

If you are an enterprise, chances are you use SAP.  And you are also more than likely using – or planning to use – Hadoop in your data architecture.

Today, we are delighted to announce the next step in our strategic relationship with SAP as they announce a reseller agreement with Hortonworks.  Under this agreement, SAP will resell Hortonworks Data Platform and provide enterprise support for their global customer base.  This will enable SAP customers to implement a data architecture that includes SAP HANA and the Hortonworks Data Platform and in so doing leverage existing skills to take advantage of the massive scalability and performance offered by Apache Hadoop.…
