The Hortonworks Blog

Hadoop jobs have grown 200,000%. No, that’s not a typo. According to Indeed.com, Hadoop is one of the top 10 job trends right now.

When you look at LinkedIn, the growth in profiles that mention SQL is actually negative, at about -4%, while the growth in profiles that mention Hadoop is up 37%. Hadoop is becoming a clear resume differentiator. Updating and maintaining technical skills has always been part of the job, and part of ensuring a long and healthy career.…

Whether just beginning or well underway with Big Data initiatives, organizations need data protection to mitigate the risk of a breach, ensure global regulatory compliance, and deliver the performance and scale needed to adapt to the fast-changing ecosystem of Apache Hadoop tools and technology.

Business insights from big data analytics promise major benefits to enterprises – but launching these initiatives also presents potential risks. New architectures, including Hadoop, can aggregate different types of data in structured, semi-structured and unstructured forms, perform parallel computations on large datasets, and continuously feed the data lake that enables data scientists to see patterns and trends.…

Airline pricing has always been a mystery to me: a combination of art and science that lets an airline make as much money as possible on each flight while giving customers the options and flexibility they want. Under the covers, I know airlines use complex models to determine how many seats have been sold and how much they can get for the remaining seats. What I didn’t realize was just how seriously complex those models are, or, more importantly, how big an opportunity the travel industry has to become more customer-centric while staying competitive by harnessing the data now available to it.…

Today was our last day at the Worldwide Partner Conference (WPC), where 15,000+ people came together for business sessions, networking, exhibits, heat, humidity, Lenny Kravitz and fantastic Houston, Texas hospitality. As a first-time sponsor, we thought we would share our views from the conference.

Steve Ballmer opened the conference talking about the Microsoft transformation to a devices-and-services company and the four trends underpinning that transformation – cloud, mobility, big data and enterprise social.…

By now, you’re probably well aware of what Hadoop does:  low-cost processing of huge amounts of data. But more importantly, what can Hadoop do for you?

We work with customers across many industries, each facing their own specific data challenges, but in talking to so many of them we also see patterns emerge around certain types of data and the value they can bring to a business.

We love to share these kinds of insights, so we built a series of video tutorials covering some of those scenarios:

Some more detailed discussion of these types of data is in our ‘Business Value of Hadoop’ whitepaper.…

BAM! What a week for Hadoop as we all spent time with around 2,500 of our closest friends to spin some YARNs (I saw it over here first). Like me, you’re probably still digesting everything you heard, but in the meantime here are some highlights from us.

Modern Data Architecture. Integrating Hadoop into existing data center investments is a hot topic for any enterprise thinking about Big Data. In support of that need there were some announcements with key data center partners:

The following is a guest post from Scott Gnau, President, Teradata Labs

I continue to be astonished by the evolution of Apache Hadoop, the software framework for large scale computing that has flourished thanks to a dynamic open source ecosystem. An army of contributors, including the smart engineers and contributors at Hortonworks, constantly refines Hadoop’s ability to manage massive amounts of data on computer clusters via MapReduce processing and the underlying Hadoop Distributed File System (HDFS).…

We are excited to announce today that Hortonworks is bringing Windows-based Hadoop Operational Management functionality via Management Packs for System Center. These management packs will enable users to deploy, manage and monitor Hortonworks Data Platform (HDP) for both Windows and Linux deployments. The new management packs for System Center will provide management and monitoring of Hadoop from a single System Center Operations Manager console, enabling customers to streamline operations and ensure quality of service levels.…

Four years ago, Arun Murthy filed a JIRA ticket (MAPREDUCE-279) that proposed a re-architecture of the original MapReduce. In the ticket, he outlined a set of capabilities that would allow processes to share resources better, and an architecture that would allow Hadoop to extend beyond batch data processing.

It turned out that this ticket was prescient of true enterprise requirements for Hadoop. As enterprise adoption accelerated, it became even clearer that multiple processing models – moving beyond batch – were critical for Hadoop to broaden its applicability for mainstream usage in the modern enterprise architecture.…

Today our partner Teradata announced a new offering called the Teradata Portfolio for Hadoop, which is built upon the 100% open source Hortonworks Data Platform (HDP). The new products and expanded partnership with Hortonworks offer customers a flexible choice of deployment options for Apache Hadoop from one of the most trusted vendors in the data management market worldwide.

Trusted Adviser

Teradata has been helping its customers get more value from their data for more than 30 years, so this is a natural next step as organizations look to evolve their data architectures to capture net-new data sources and create new applications.…

This post is from Steve Loughran, Devaraj Das & Eric Baldeschwieler.

In the last few weeks, we have been putting together a prototype, Hoya, that runs HBase on YARN. It is driven by a few top-level use cases we have been trying to address. Some of them are:

  • Be able to create on-demand HBase clusters easily, by and/or in apps
    • With different versions of HBase potentially (for testing etc.)
  • Be able to configure different HBase instances differently
    • For example, different configs for read/write workload instances
  • Better isolation
    • Run arbitrary co-processors in user’s private cluster
    • User will own the data that the HBase daemons create
  • MR jobs should find it simple to create (transient) HBase clusters
    • For Map-side joins where table data is all in HBase, for example
  • Elasticity of clusters for analytic / batch workload processing
    • Stop / Suspend / Resume clusters as needed
    • Expand / shrink clusters as needed
  • Be able to utilize cluster resources better
    • Run MR jobs while maintaining HBase’s low latency SLAs

Hoya itself is a Java tool, and is currently CLI-driven.…

We are delighted to announce a new round of funding led by new investors Tenaya Capital and Dragoneer Investment Group, with participation from our existing investors Benchmark Capital, Index Ventures and Yahoo!.

I could not be more excited about the opportunity in front of us. Our business model, and our strategy of ensuring that 100% open source Apache Hadoop becomes an enterprise-viable data platform, are resonating strongly with the market.…
