The Hortonworks Blog

Our Systems Integrator partner, Knowledgent, is hosting a Big Data Immersion Class geared towards technologists who are tasked with launching Big Data programs that must have tangible real-time benefits to their organizations.

“When and how do I use these new big data technologies?” “How do I operationalize them in my environment?” These are some of the fundamental questions that Knowledgent prospects and customers are asking, and they are why the 3-day immersion class was developed.…

This post was authored by Zhijie Shen with Vinod Kumar Vavilapalli.

This is the sixth blog in the multi-part series on Apache Hadoop YARN – a general-purpose, distributed, application management framework that supersedes the classic Apache Hadoop MapReduce framework for processing data in Hadoop clusters. Other posts in this series:

Introducing Apache Hadoop YARN
Apache Hadoop YARN – Background and an Overview
Apache Hadoop YARN – Concepts and Applications
Apache Hadoop YARN – ResourceManager
Apache Hadoop YARN – NodeManager

Introduction

The beta release of Apache Hadoop 2.x has finally arrived, and we are striving to make the release easy to adopt, with minimal or no pain for our existing users.…

Chances are you’ve already used Tableau Software if you’ve been involved with data analysis and visualization solutions for any length of time. Tableau 6.1.4 introduced the ability to visualize large, complex data stored in Hadoop with Hortonworks Data Platform via Hive and the Hortonworks Hive ODBC driver.

If you want to get hands-on with Tableau as quickly as possible, we recommend using the Hortonworks Sandbox and the ‘Visualize Data with Tableau’ tutorial.…

It’s my great pleasure to announce that the Apache Hadoop community has declared Hadoop 2.x as Beta with the vote closing over the weekend for the hadoop-2.1.0-beta release.

As noted in the announcement to the mailing lists, this is a significant milestone across multiple dimensions: not only is the release chock-full of significant features (see below), it also represents a very stable set of APIs and protocols on which we can continue to build for the future.…

As summer comes to a close, we bid a fond farewell (again!) to our excellent marketing intern, Tanya Maslyanko. Tanya has been a terrific help to us with her can-do attitude and marketing intuition so the tears we shed are because we’ll miss our friend and because we’ll have to start doing our own work again. Over to Tanya…

A few years ago, I sat in a freshman-filled auditorium at my university’s orientation listening to successful graduates talk about how important it was to get involved with your career early on.…

There are myriad use cases for Big Data applications across industries. For example, financial companies want to analyze governance to assess levels of risk and compliance. Transportation companies want to analyze overall logistics for optimization. Oil and gas companies supplying energy want to predict machine failures to reduce the risk of outages. Insurance companies will need to analyze actuarial information in order to calculate individual policy premiums – yes, the impending Affordable Care Act.…

The next in our series of quick interviews with Apache Hadoop project committers at Hortonworks.

In this video, we talk with Sanjay Radia, Hortonworks co-founder and Apache Hadoop committer, about the initiation of HDFS, the cost benefits it brings to data storage and future directions for the project.

Learn more about HDFS here or at the Apache Hadoop project site.

Before I was a developer of Hadoop, I was a user of Hadoop.  I was responsible for operation and maintenance of multiple Hadoop clusters, so it’s very satisfying when I get the opportunity to implement features that make life easier for operations staff.

Have you ever wondered what’s happening during a namenode restart?  A new feature coming in HDP 2.0 will give operators greater visibility into this critical process.  This is a feature that would have been very useful to me in my prior role.…

UPDATE: This cheat sheet was so popular, we’ve created a PDF of the content below so you can print it and use it more easily. Download here.

 

If you’re already familiar with SQL then you may well be thinking about how to add Hadoop skills to your toolbelt as an option for data processing.

From a querying perspective, using Apache Hive provides a familiar interface to data held in a Hadoop cluster and is a great way to get started.…

If you want to understand the thinking in the various projects in the Hadoop ecosystem, then who better to talk to than key members of those projects – the committers.

In this video, we talk with Owen O’Malley, Hortonworks co-founder and Apache Hive committer, about the initiation of Hive, why it matters and future directions for the project.

Learn more about Hive here, or at the Apache Hive project site.…

Dan Rosanova is a Senior Architect at West Monroe Partners, a Hortonworks System Integrator and our guest blogger.

With the release last week of Hortonworks Data Platform (HDP) 1.3 for Windows, the Big Data ecosystem takes a large step forward toward broad adoption in enterprise environments. As a systems integrator at West Monroe Partners, I work with medium and large enterprises as they address their technology challenges on a daily basis.…

A busy week at Hortonworks Towers means a quick recap on what’s been happening.

Hadoop on Windows. On Tuesday we announced the GA of HDP 1.3 for Windows. Apart from being the only native Windows distribution for Hadoop, the updates and innovation in this release bring it to parity with our Linux distribution which means Hadoop Everywhere! Later on, we talked about getting started with HDP 1.3 for Windows, and also pointed at some great resources and tutorials.…

This week, we announced the launch of Hortonworks Data Platform (HDP) 1.3 for Windows, which brings our native Windows Hadoop distribution to parity with our Linux distribution. HDP for Windows is also the Hadoop foundation for Microsoft’s HDInsight Service, which delivers Hadoop and BI capabilities in the Azure cloud.

Impetus, a Hortonworks System Integrator partner, is an early adopter of the Hortonworks Data Platform (HDP) and has leveraged the combined power of Hadoop & Microsoft Azure platform for a number of successful big data implementations using Microsoft’s HDInsight Service.…

This guest post is from Sofia Parfenovich, Data Scientist at Altoros Systems, a big data specialist and a Hortonworks System Integrator partner. Sofia explains how she optimized a customer’s trading solution by using Hadoop (Hortonworks Data Platform) and by clustering stock data.

Automated trading solutions are widely used by investors, banks, funds, and other stock market players. These systems are based on complex mathematical algorithms and can take into account hundreds of factors.…

If you’re a Microsoft developer stepping into Hadoop for the first time with HDP for Windows, we thought we’d highlight this fantastic resource from Rob Kerr, Chris Campbell and Garrett Edmondson: the MSBIAcademy.

They’ve produced a high quality, practical series of videos covering anything from essential MapReduce concepts, to using .NET (in this case C#) to submit MapReduce jobs to HDInsight, to using Apache Pig for Web Log Analysis.…
