The Hortonworks Blog

Posts categorized by: Big Data

A lot of people ask me: how do I become a data scientist? I think the short answer is: as with any technical role, it isn’t necessarily easy or quick, but if you’re smart, committed and willing to invest in learning and experimentation, then of course you can do it.

In a previous post, I described my view on “What is a data scientist?”: it’s a hybrid role that combines the “applied scientist” with the “data engineer”. …

‘The world is being digitized’ proclaimed Geoffrey Moore in his keynote at Hadoop Summit 2012 over a year ago. His belief is that we are moving away from an analog society, in which we collect only casual recordings of events, to a digital one in which everything is captured. It is our belief that Hadoop is one of the key technologies powering this shift to a digital society.

There is almost an expectation that we capture the pics, vids and conversations that run before us. …

We’ve been hosting a series of webinars focusing on how to make Apache Hadoop a viable enterprise platform that powers modern data architectures.

Implementing a modern data architecture with Hadoop means that it must deeply integrate with existing technologies, leverage existing skills and investments, and provide key services. This guest post from David Smith, Vice President of Marketing and Community at Revolution Analytics, shares his perspective on the role of a Data Scientist in a Big Data world.…

How big is big anyway? What sort of size and shape does a Hadoop cluster take?

These are great questions as you begin to plan a Hadoop implementation. Designing and sizing a cluster is complex, and something our technical teams spend a lot of time working on with customers: from storage size to growth rates, from compression rates to cooling, there are many factors to take into account.

To make that a little more fun, we’ve built a cluster-size-o-tron which performs a simplified calculation based on some assumptions about node sizes and data payloads to give an indication of how big your particular big is.…
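To make the kind of calculation described above concrete, here is a minimal sketch of a cluster-size estimate. The parameter names and default values (3x replication, a 2:1 compression ratio, 25% headroom for temporary/intermediate data, 24 TB of raw disk per node) are illustrative assumptions, not the actual logic of the cluster-size-o-tron:

```python
import math

def estimate_nodes(raw_data_tb, growth_per_year_tb, years,
                   replication=3,          # assumed HDFS replication factor
                   compression_ratio=2.0,  # assumed raw:compressed ratio
                   headroom_factor=1.25,   # assumed slack for temp/intermediate data
                   disk_per_node_tb=24):   # assumed raw disk per worker node
    """Rough worker-node count for a Hadoop cluster (illustrative only)."""
    # Total raw data over the planning horizon.
    total_raw_tb = raw_data_tb + growth_per_year_tb * years
    # Effective on-disk footprint after compression, replication, and headroom.
    on_disk_tb = total_raw_tb / compression_ratio * replication * headroom_factor
    # Number of nodes needed to hold that footprint.
    return math.ceil(on_disk_tb / disk_per_node_tb)

# Example: 100 TB today, growing 50 TB/year, planned over 2 years.
print(estimate_nodes(100, 50, 2))
```

Real sizing also weighs compute and memory requirements, workload mix, and failure tolerance, which is why the post above stresses that this is a simplification.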

Syncsort, a technology partner with Hortonworks, helps organizations propel Hadoop projects with a tool that makes it easy to “Collect, Process and Distribute” data with Hadoop. This process, often called ETL (Extract, Transform, Load), is one of the key drivers for Hadoop initiatives; but why is this technology a key enabler of Hadoop? To find the answer, we talked with Syncsort’s Director of Strategy, Steve Totman, a 15-year veteran of data integration and warehousing, who provided his perspective on Data Warehouse Staging Areas.…

If you are an enterprise, chances are you use SAP.  And you are also more than likely using – or planning to use – Hadoop in your data architecture.

Today, we are delighted to announce the next step in our strategic relationship with SAP as they announce a reseller agreement with Hortonworks.  Under this agreement, SAP will resell Hortonworks Data Platform and provide enterprise support for their global customer base.  This will enable SAP customers to implement a data architecture that includes SAP HANA and the Hortonworks Data Platform and in so doing leverage existing skills to take advantage of the massive scalability and performance offered by Apache Hadoop.…

Think Big Analytics, a Hortonworks systems integration partner, has been helping customers navigate the complex world of Hadoop successfully for the past three years.  Over the years they have seen it all and have developed one of the most mature Hadoop implementation methodologies in the industry.  Recently, we asked Ron Bodkin, Founder and CEO of Think Big Analytics, to share some insight.

What are the “Must-Dos” Before Starting a Big Data Project?…

The shift to a data-oriented business is happening. The inherent value in established and emerging big datasets is becoming clear. Enterprises are building big data strategies to take advantage of these new opportunities and Hadoop is the platform to realize those strategies.

Hadoop is enabling a modern data architecture where it plays a central role: built to tackle big data sets with efficiency while integrating with existing data systems. As champions of Hadoop, our aim is to ensure the success of every Hadoop implementation and improve our own understanding of how and why enterprises tackle big data initiatives. …

Our Systems Integrator partner, Knowledgent, is hosting a Big Data Immersion Class geared towards technologists who are tasked with launching Big Data programs that must have tangible real-time benefits to their organizations.

“When and how do I use these new big data technologies?” “How do I operationalize them in my environment?” These are some of the fundamental questions that Knowledgent prospects and customers are asking and why the 3 day immersion class was developed.…

This week, we announced the launch of Hortonworks Data Platform (HDP) 1.3 for Windows, which brings our native Windows Hadoop distribution to parity with our Linux distribution. HDP for Windows is also the Hadoop foundation for Microsoft’s HDInsight Service, which delivers Hadoop and BI capabilities in the Azure cloud.

Impetus, a Hortonworks System Integrator partner, is an early adopter of the Hortonworks Data Platform (HDP) and has leveraged the combined power of Hadoop and the Microsoft Azure platform for a number of successful big data implementations using Microsoft’s HDInsight Service.…

This guest post is from Sofia Parfenovich, Data Scientist at Altoros Systems, a big data specialist and a Hortonworks System Integrator partner. Sofia explains how she optimized a customer’s trading solution by using Hadoop (Hortonworks Data Platform) and by clustering stock data.

Automated trading solutions are widely used by investors, banks, funds, and other stock market players. These systems are based on complex mathematical algorithms and can take into account hundreds of factors.…

Extracting insight from your machine data, customer sentiment data, or any number of other big data scenarios demands the integration of Hadoop into your data architecture to efficiently handle those new opportunities alongside existing workloads. Over the next few months, we’re hosting a new webinar series with partners to get to grips with what it means to integrate Hadoop into your data architecture.

The first three webinars in the series are listed below and ready for registration.…

If you’re considering the WHY, the HOW and the WHAT of Hadoop and Big Data in your business, then this collection of papers and ebooks is your friend.

  • WHY does Hadoop matter? Our eBook “Disruptive Possibilities of Big Data” paints a picture of the future of the data-driven business and how it changes everything.
  • HOW does Hadoop work in my data architecture? As part of a modern data architecture, Hadoop sits alongside existing infrastructure and augments its capabilities through Refining and Exploring big datasets and ultimately enriching the application and customer experiences for your business.

By now, you’re probably well aware of what Hadoop does:  low-cost processing of huge amounts of data. But more importantly, what can Hadoop do for you?

We work with many customers across many industries with many different specific data challenges, but in talking to so many customers, we are also able to see patterns emerge on certain types of data and the value that could bring to a business.

We love to share these kinds of insights, so we built a series of video tutorials covering some of those scenarios:

Some more detailed discussion of these types of data is in our ‘Business Value of Hadoop’ whitepaper.…

What is the value of Hadoop to your business? What value lies in your big data?

There are MANY definitions of big data out there.  In fact, we have published two of them on our blog alone, and I am sure we can dream up a few more.  However, when it comes down to it, our customers know best.  After all, they are the users of Hadoop.

New Whitepaper: “Business Value of Hadoop”.…
