The Hortonworks Blog

In a world that creates 2.5 quintillion bytes of data every year, it is extremely cheap to collect, store and curate all the data you will ever care about. Data is de facto becoming the largest untapped asset. So how can organizations take advantage of unprecedented amounts of data? The answer is new innovations and new applications. We are clearly entering a new era of modern data application

I would like to take this opportunity to share my Hadoop journey over the past 10 years, and discuss where I see Hadoop technology going in the next decade.…

Today Microsoft announced the General Availability of Azure HDInsight, with Apache Hadoop 2.6, available on Ubuntu Linux clusters. Azure HDInsight is a managed Hadoop service in the cloud and uses the Hortonworks Data Platform (HDP).

This release is a direct result of Microsoft's commitment to Open Source. Microsoft has worked alongside Hortonworks® in the community to contribute to Apache Hadoop and related projects, including Apache Ambari.…

Today, I’m excited to share that we have released the GA version of Hortonworks DataFlow (HDF), a new offering that directly addresses the unique big data needs of the Internet of Anything (IoAT). Hortonworks DataFlow is powered by Apache NiFi, a top-level open source project made available through the NSA Technology Transfer Program.

By making this technology a commercial offering, we now provide our customers the ability to connect, collect and curate data from a broad spectrum of connected yet disparate data sources – sensors, machines, geo-location devices, social feeds, connected cars, web clicks, server logs and more.…

Since the partnership between Hortonworks and SAS began, we have created some awesome assets (e.g., a SAS Data Loader sandbox tutorial, educational webinars and an array of blogs) that have given Hadoop and Big Data enthusiasts hands-on training with Apache Hadoop and SAS’ powerful analytics solutions. You can find more details about our partnership and resources here: http://hortonworks.com/partner/sas

To continue the momentum, we have Paul Kent, Vice President of Big Data at SAS, share his insights on the value of YARN and the benefits it brings to SAS and its users, this time around SAS Grid and YARN. …

Big Data, the Internet of Anything (IoAT) and the Connected Car have created a new Information Superhighway that fundamentally changes the relationship between automakers and car buyers.

Previously, automakers had an incomplete feedback loop after they sold a vehicle. They learned of negative customer sentiment through slumping sales, increasing warranty expenses or when they needed to recall their vehicles. Positive signals of driver happiness were similarly sparse.

The connected car has changed all that.…

In a world that creates 2.5 quintillion bytes of data every year, how can organizations take advantage of unprecedented amounts of data? Is data becoming the largest untapped asset? What architectures do companies need to put in place to deliver new business insights while reducing storage and maintenance costs?

Cisco and Hortonworks have been partnering since 2013 to offer operational flexibility, innovation and simplicity when it comes to deploying and managing Hadoop clusters.…

Yahoo! JAPAN needed a data platform that could scale to generate 100,000 reports per day and process large amounts of data. It needed to keep the last 13 months’ worth of data, approximately 500 billion rows, organized and easily accessible. Relational Database Management Systems (RDBMS) cannot scale to these levels from a cost and processing power perspective. Yahoo! JAPAN explored Hadoop to achieve this and evaluated two platforms against its requirements: Hortonworks Hive and Tez on YARN, and Cloudera Impala.…

Hortonworks is a huge supporter of the Apache Software Foundation (ASF) and fully embraces its processes and procedures through HDP, the only 100% open source Hadoop platform. As Forrester VP Mike Gualtieri said in the Forrester Wave, “Hortonworks lives and loves open source.” And that will be fully on display at the inaugural Apache: Big Data Europe 2015 event this year in Budapest, Hungary.

The event will be held 28-30 September at the Corinthia Hotel and Hortonworks will be contributing in a big way as a Diamond Sponsor.…

On September 22nd at 10:00 am PST, Vincent Lam, Director of Product Marketing at Protegrity, and Syed Mahmood, Sr. Product Marketing Manager at Hortonworks, will be talking about how to secure sensitive data in Hadoop Data Lakes.

In this blog, they provide answers to some of the most frequently asked questions they have heard on the topic.

  • What’s the best approach for the security of Hadoop Data Lakes?
    As enterprises continue to harness the power of Hadoop to store large amounts of data, security becomes an even more important part of the ecosystem.…

    Over multiple conversations and espressos, Steven Witt, Senior Director of Industry Solutions at Hortonworks, and I have been exploring the diverse challenges associated with collecting, conducting and curating data flows from the well site.

    Steven recently joined Hortonworks when we acquired Onyara. Steven was the Onyara CEO and co-founder. This is the first in a series summarizing our conversations, focused on how Hortonworks DataFlow collects data from the field in upstream oil and gas operations, then conducts that through to the data center and back in order to make critical decisions related to drilling and production.…

    Watch the “Integrating Data Silos with your Big Data Systems in Real-Time” webinar below.

    Big Data projects promise a flexible and cost-effective data architecture that will immediately make previously untapped data sources accessible to derive business value. The key is integrating new sources, such as the Internet of Things and semi-structured or unstructured data, with the traditional structured data typically found in business applications or data warehouses. Join Oracle & Hortonworks in this four-part webinar series focusing on Big Data Integration.…

    Symantec helps consumers and organizations secure and manage their information-driven world by protecting digital information and online transactions.

    The Symantec Cloud Platform team turned to Hortonworks to ingest an enormous volume of security logs, analyze that security metadata and then use that insight to protect its customers. Symantec now analyzes threat data much more quickly because it optimized its data architecture using the storage and processing power of HDP—for both historical and real-time analysis.…

    Today we are thrilled to officially open our new international headquarters in the heart of London. This new office is a substantial increase and upgrade to the space we had been occupying for more than a year, and was much needed in order to accommodate our continued growth and success in international markets. We’ve reached a very exciting stage in our growth. The move to new, larger premises in the City of London will allow us to better serve our customers, partners and the Hadoop Community and up the tempo of our international commercial activities as a result.…

    We are excited to announce the general availability of Hortonworks Sandbox with HDP 2.3 on the Microsoft Azure Gallery. Hortonworks Sandbox is already a very popular environment for developers, data scientists and administrators to learn and experiment with the latest innovations in Hortonworks Data Platform.

    The hundreds of innovations span Apache Hadoop, Kafka, Storm, Spark, Hive, Pig, YARN, Ambari, Falcon, Ranger and other components that make up the HDP platform.…

    Guest blogger David Hill, Business Development Director at Open Energi, explains the challenges of building a virtual power station, and why data is the fuel. Follow Open Energi on @openenergi

    Open Energi is working with businesses in the UK to harness the flexible energy demand from their equipment and aggregate it to create a virtual power station. We’re turning the whole system on its head so that instead of energy supply adjusting to meet demand, our demand for energy adjusts to meet supply, in real time.…