The Hortonworks Blog

Posts categorized by: Hadoop in the Enterprise

Today, we are pleased to announce a strategic alliance between Hortonworks and SAS. Through this alliance we are committing to expand the integration between SAS business analytics and data management capabilities and the Hortonworks Data Platform (HDP).

By better integrating SAS Business Analytics and HDP, SAS users can easily incorporate Hadoop as a component of their data architecture to capture, process and analyze data of any type and scale. This allows businesses to leverage powerful SAS analytic and data management capabilities across massive data sets, including new data sources that previously could not be captured and analyzed.…

Designed for senior IT executives, IT architects, technology planners, and business technologists, Knowledgent’s three-day facilitated Big Data Immersion workshop, recently held in New York City, provided participants with an intensive deep dive into the big data questions:

  • Why Big Data? What are the issues that brought it all about?
  • Demystifying Big Data: How can Hadoop help with big data issues?
  • Implementation: How do I operationalize big data? How is big data analytics different?

We’ve been invited to join our partner SAP on October 16 to talk about Big Data and how the integrated SAP HANA + Hadoop approach can solve your big data challenges. This chat will be a live Google Hangout with:

  • Irfan Khan, SVP & GM SAP Global Big Data at SAP (@i_kHANA)
  • Ari Zilka, CTO at Hortonworks (@ikarzali)
  • Timo Elliott, Innovation Evangelist at SAP (@timoelliott)

When: Wednesday, October 16, 8am PT / 11am ET / 5pm CET…

This is a guest blog post from our partner, Actuate. They’ve been generous enough to create some great Hadoop tutorials on the Open Source BIRT project that use the Hortonworks Sandbox.

By now, Apache™ Hadoop® has become synonymous with the first stage of Big Data: storing, processing and managing huge volumes and varieties of structured and unstructured data. Yet the data stored by Hadoop remains unreadable to the average business user.…

Thanks to all those who joined in person and virtually for the Apache Ambari Meetup at Hortonworks this week. We talked tech, we saw demos, we laughed, we cried, we ate pizza.

The central theme of the night was the newly added support for Hadoop 2. Ambari now has:

  • Hadoop 2 Stack: Ambari adds support for installing, managing and monitoring a Hadoop 2 Stack.
  • NameNode HA: Configure NameNode High Availability based on the QJM support built into HDFS2.
  • YARN: Ambari manages YARN Service lifecycle and automatically deploys the MapReduce2 framework.

A lot of people ask me: how do I become a data scientist? I think the short answer is: as with any technical role, it isn’t necessarily easy or quick, but if you’re smart, committed and willing to invest in learning and experimentation, then of course you can do it.

In a previous post, I described my view on “What is a data scientist?”: it’s a hybrid role that combines the “applied scientist” with the “data engineer”. …

‘The world is being digitized’ proclaimed Geoffrey Moore in his keynote at Hadoop Summit 2012 over a year ago. His belief is that we are moving away from an analog society, where we collect only casual recordings of events, to one that is digital, where everything is captured. It is our belief that Hadoop is one of the key technologies powering this shift to a digital society.

There is almost an expectation that we capture the pics, vids and conversations that run before us. …

We’ve been hosting a series of webinars focusing on how to make Apache Hadoop a viable enterprise platform that powers modern data architectures.

Implementing a modern data architecture with Hadoop means that Hadoop must integrate deeply with existing technologies, leverage existing skills and investments, and provide key services. This guest post from David Smith, Vice President of Marketing and Community at Revolution Analytics, shares his perspective on the role of Data Scientists in a Big Data world.…

In March of 2013 we announced our plans to enter the European market, and just six months later we have not only landed but are also expanding and operating across Europe, with field teams in the UK, France and Germany. Those teams are growing and, more importantly, our customer base is expanding.

What would expansion be without customers?

European customers are actively looking for solutions that enable the processing and analysis of large quantities of data, and Apache Hadoop is meeting those needs.  …

How big is big anyway? What sort of size and shape does a Hadoop cluster take?

These are great questions as you begin to plan a Hadoop implementation. Designing and sizing a cluster is complex, and it is something our technical teams spend a lot of time working on with customers: from storage size to growth rates, from compression rates to cooling, there are many factors to take into account.

To make that a little more fun, we’ve built a cluster-size-o-tron which performs a more simplistic calculation based on some assumptions on node sizes and data payloads to give an indication of how big your particular big is.…
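To give a feel for what such a simplistic calculation might look like, here is a minimal sketch in Python. It is not the formula the cluster-size-o-tron uses; the function name, the parameters and the default values (usable disk per node, temporary-space overhead, compression ratio) are all assumptions chosen for illustration. The only default taken from Hadoop itself is the HDFS replication factor of 3.

```python
import math


def estimate_nodes(data_tb, replication=3, compression_ratio=1.0,
                   temp_overhead=0.25, node_disk_tb=24.0):
    """Rough count of data nodes needed to hold `data_tb` terabytes.

    Hypothetical helper for illustration only:
    - replication: copies of each block kept by HDFS (3 is the HDFS default)
    - compression_ratio: assumed raw size / compressed size (1.0 = none)
    - temp_overhead: assumed extra fraction reserved for intermediate data
    - node_disk_tb: assumed usable disk per data node, in terabytes
    """
    if data_tb < 0:
        raise ValueError("data_tb must not be negative")
    if replication < 1 or compression_ratio <= 0 or node_disk_tb <= 0:
        raise ValueError("replication, compression_ratio and node_disk_tb must be positive")

    # Size on disk after compression, multiplied by the number of replicas.
    stored_tb = data_tb / compression_ratio * replication
    # Leave headroom for temporary and intermediate job output.
    required_tb = stored_tb * (1 + temp_overhead)
    # Nodes come in whole units, so round up.
    return math.ceil(required_tb / node_disk_tb)


if __name__ == "__main__":
    # 100 TB of raw data, compressed 2:1, with the assumed defaults above.
    print(estimate_nodes(100, compression_ratio=2.0))
```

A real sizing exercise would also account for growth rates, workload mix and hardware constraints such as cooling, which this sketch deliberately ignores.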

Just a couple of weeks ago we published our simple SQL to Hive Cheat Sheet. It has proven immensely popular with folks who want to understand the basics of querying with Hive. Our friends at Qubole were kind enough to work with us to extend and enhance the original cheat sheet with more advanced features of Hive, such as User Defined Functions (UDFs). In this post, Gil Allouche of Qubole takes us from the basics of Hive through to getting started with more advanced uses, which we’ve compiled into another cheat sheet you can download here.…

Syncsort, a technology partner with Hortonworks, helps organizations propel Hadoop projects with a tool that makes it easy to “Collect, Process and Distribute” data with Hadoop. This process, often called ETL (Extract, Transform, Load), is one of the key drivers for Hadoop initiatives; but why is this technology a key enabler of Hadoop? To find out, we talked with Syncsort’s Director of Strategy, Steve Totman, a 15-year veteran of data integration and warehousing, who provided his perspective on Data Warehouse Staging Areas.…

If you are an enterprise, chances are you use SAP.  And you are also more than likely using – or planning to use – Hadoop in your data architecture.

Today, we are delighted to announce the next step in our strategic relationship with SAP as they announce a reseller agreement with Hortonworks.  Under this agreement, SAP will resell Hortonworks Data Platform and provide enterprise support for their global customer base.  This will enable SAP customers to implement a data architecture that includes SAP HANA and the Hortonworks Data Platform and in so doing leverage existing skills to take advantage of the massive scalability and performance offered by Apache Hadoop.…

Building a modern data architecture with Hadoop, one that delivers high-scale, low-cost data processing, means integrating Hadoop effectively inside the data center. For this post, we asked Yves de Montcheuil, VP of Marketing at Talend, about his customers’ experiences with Hadoop integration. Here’s what he had to say:

Most organizations are still in the early stages of big data adoption, and few have thought beyond the technology angle of how big data will profoundly impact their processes and their information architecture.…

Think Big Analytics, a Hortonworks systems integration partner, has been helping customers navigate the complex world of Hadoop successfully for the past three years. Over the years they have seen it all and have developed one of the most mature Hadoop implementation methodologies known. Recently, we asked Ron Bodkin, Founder and CEO of Think Big Analytics, to share some insight.…

What are the “Must-Dos” Before Starting a Big Data Project?
