The Hortonworks Blog

The year is coming to its end. Maybe you’re reading this as you race to check a few more 2013 items off of your to-do list (at work or at home). Or maybe you’ve already got a hot toddy in your hand and your feet kicked up, with slippers warming your toes.

In 2013, I was fortunate enough to spend the year speaking with our customers, and I learned how many important organizations are using Apache Hadoop and Hortonworks Data Platform (HDP) to solve real problems.…

The network and security teams at your company do not allow internet access from the machines where you plan to install Hadoop. What do you do? How do you install your Hadoop cluster without having access to the public software packages? Apache Ambari supports local repositories and in this post we’ll look at the configuration needed for that support.

When installing Hadoop with Ambari, there are three repositories at play: one for Ambari (which primarily hosts the Ambari Server and Ambari Agent packages) and two for the Hortonworks Data Platform (which host the HDP Hadoop Stack packages and other related utilities).…

Congrats to our partner, Revolution Analytics, on the general availability of Revolution R Enterprise 7 (RRE7). With this release, users can now run R natively in Hortonworks Data Platform by simply moving their R-powered analytics to Hadoop. They will be able to run the high-performance distributed R functions in Revolution R Enterprise without having to move the data out of Hadoop, while using the Hadoop nodes as a parallel computation grid.…

Update! – The final phase of improvements from the Stinger Initiative was released as part of Hive 0.13 on Apr 21, 2014 – Read the announcement

While just a preview by moniker, the release marks a significant milestone in the transformation of Hadoop from a batch-oriented system to a data platform capable of interactive data processing at scale and delivering on the aims of the Stinger Initiative.

Apache Tez and SQL: Interactive Query-IN-Hadoop

Tez is a low-level runtime engine not aimed directly at data analysts or data scientists.…

Encryption is applied to electronic information in order to ensure its privacy and confidentiality. Typically, we think of protecting data at rest or in motion. Wire Encryption protects the latter as data moves through Hadoop over RPC, HTTP, Data Transfer Protocol (DTP), and JDBC.

Let’s cover the configuration required to encrypt each of these protocols. For step-by-step instructions, please see the HDP 2.0 documentation.

RPC Encryption

The most common way for a client to interact with a Hadoop cluster is through RPC.  …

Last week was a busy week for shipping code, so here’s a quick recap on the new stuff to keep you busy over the holiday season.

Step one in the development of the Hadoop Summit Europe content tracks is complete!  Thank you to everyone who participated in the Hadoop Summit Community Choice voting process. We counted over 14,000 votes, setting a new record for participation in this program. The turnout far exceeded our expectations and it is terrific that the momentum behind Apache Hadoop continues to go from strength to strength… especially in Europe!

Before we announce the winners….…

Apache Hadoop has always been very fussy about Java versions. It’s a big application running tens of thousands of processes across thousands of machines in a single datacenter. This makes it almost inevitable that any race conditions and deadlock bugs in the code will eventually surface – be it in the JVM and Java libraries, in Hadoop itself, or in one of the libraries on which it depends.

Hence the phrase “there are no corner cases in a datacenter”.…

There is a lot of information available on the benefits of Apache YARN, but how do you get started building applications? On December 18 at 9am Pacific Time, Hortonworks will host a webinar that covers just that: what independent software vendors (ISVs) and developers need to do to take the first steps towards developing applications or integrating existing applications on YARN.

Register for the webinar here.

Why YARN?

As Hadoop gains momentum, it’s important to recognize the benefits to customers, such as elasticity, reliability and efficiency, and the competitive advantage software vendors will have if their applications are integrated with YARN.…

In October, we announced our intent to include and support Storm as part of Hortonworks Data Platform. With this commitment, we also outlined and proposed an open roadmap to improve the enterprise readiness of this key project.  We are committed to doing this with a 100% open source approach and your feedback is immensely valuable in this process.

Today, we invite you to take a look at our Storm technical preview.…

In God we trust, all others must bring data.
Dr. W. Edwards Deming
Dr. W. Edwards Deming was a statistician and manufacturing consultant who worked on Japanese reconstruction after WWII. His quality control methods influenced innovative Japanese manufacturing processes that simultaneously increased volume, reduced cost, and improved quality. Near the end of his career, Deming taught the same lessons to U.S. automakers.

To this day, the “Deming Prize” is one of the highest awards for Total Quality Management in the world.…

It’s been an amazing year of expansion for the Hadoop ecosystem. If you’re looking to use Hadoop in your infrastructure, see how these hundreds of amazing partners can help. If you would like to become a partner, come talk to us – we’d love to have you on-board.

The Hortonworks partner program has had a terrific year across many different measures, not the least of which is that the Hortonworks partner community grew by more than 240 percent.…

Apache Sqoop is a tool that transfers data between the Hadoop ecosystem and enterprise data stores. Sqoop does this by providing methods to transfer data to HDFS or Hive (using HCatalog). Oracle Database is one of the databases supported by Apache Sqoop. With Oracle Database, the database connection credentials are stored in Oracle Wallet. Oracle Wallet can act as the store of keys and secrets such as authentication credentials. This post describes how Oracle Wallet adds a secure authentication layer for Sqoop jobs.…

Just yesterday, we talked about our roadmap for Security in Enterprise Hadoop. At our Security labs page you can see in one place the security roadmap and efforts underway across Hadoop and their timelines.

Security is often described as rings of defense. Continuing this analogy, the Apache community has been working to create a perimeter security solution for Hadoop. This effort is Apache Knox Gateway (Apache Knox), and we are happy to announce the Technical Preview of Apache Knox.…

2013 was certainly a revealing year for the Enterprise Hadoop market. We witnessed the emergence of the YARN-based architecture of Hadoop 2 and strong ecosystem adoption that will fuel its next big wave of innovation. The analyst community accurately predicted Hadoop’s market momentum would greatly accelerate, but none predicted a pure-play vendor would publicly declare its intent to pivot away from the Enterprise Hadoop market. Interesting times indeed!

Join us on Tuesday, January 21st, when we’ll be covering the Enterprise Hadoop State of the Union in more detail.…
