The Hortonworks Blog

As we approach the opening bell on Nasdaq and another milestone for open source Apache Hadoop, we at Hortonworks want to thank those who have contributed deeply to this journey. We owe you – our customers – a huge thank you. Your active collaboration with us in the Apache Hadoop community has greatly impacted the trajectory of this platform for data management and has established a path for how thousands of other enterprises can successfully build a new open data architecture that brings all data under management.…

Many industries are finding new opportunities in the abundance of new types of data stored at scale in Hadoop, combined with Hadoop’s ability to process that data at lower cost than traditional platforms. Apache Hadoop and the Hortonworks Data Platform (HDP) can help enterprises turn what used to be data fumes into high-octane fuel that propels their businesses.

Sign up for the Hadoop industry solutions email series to find out how Hortonworks customers use Hadoop to solve real-world business challenges.…

The public sector is charged with protecting citizens, responding to constituents, providing services and maintaining infrastructure. In many instances, the demands of these responsibilities increase while government resources simultaneously shrink under budget pressures.

How can Intelligence, Defense and Civilian agencies do more with less?

Apache Hadoop is part of the answer. Within the public sector, Hadoop delivers data-driven actions in support of IT efficiency and good government.

In one example, the United States Internal Revenue Service had to reduce its auditor headcount due to budget cuts.…

With Apache Hadoop YARN as its architectural center, Apache Hadoop continues to attract new engines to run within the data platform, as organizations want to efficiently store their data in a single repository and interact with it for batch, interactive and real-time streaming use cases. More and more independent software vendors (ISVs) are developing applications to run in Hadoop via YARN. This increases the number of users and processing engines that operate simultaneously across a Hadoop cluster, on the same data, at the same time.…

Introduction

In this second part of our blog series on Data Science and Apache Hadoop, and its accompanying IPython Notebook, we continue to demonstrate how to build a predictive model with Apache Hadoop using existing modeling tools. This time we’ll use Apache Spark and MLlib.

Apache Spark is a relatively new entrant to the Hadoop ecosystem. Now running natively on Apache Hadoop YARN, the architectural center of Hadoop, Apache Spark is an in-memory data processing API and execution engine that is effective for machine learning and data science use cases.…

As more organizations consider the cloud as a component of their Apache Hadoop deployments, we can look to our partners for a range of solutions designed to meet these needs. This is the first post in a series on partner solutions available for deploying Hadoop in the cloud. We will build on the hybrid deployment post with general use cases for Hadoop in a hybrid cloud. Through our partners we have a broad set of options for the cloud available today, spanning on-premises, virtual and cloud-based deployments.…

The successful Hadoop journey typically starts with new analytic applications, which lead to a Data Lake. As more and more applications are created that derive value from the new types of data, an architectural shift happens in the data center: companies gain deeper insight across a large, broad, diverse set of data at efficient scale. They create a Data Lake.

Cisco and Hortonworks have partnered to build a highly efficient, highly scalable way to manage all your enterprise data in a data lake.…

Hortonworks is pleased to be part of the “going green” movement and even more pleased to introduce guest bloggers from Actian and Slingshot Power. In this blog, Slingshot Power describes how Hadoop and analytics can influence and increase the adoption of clean energy.

By Ashish Gupta, CMO & SVP Business Development, Actian

Recently, we announced with Slingshot Power their use of Hortonworks Data Platform (HDP) and the Actian Analytics Platform – Hadoop SQL Edition.…

A Cosmopolitan Metropolis

Brussels, Belgium, conjures images of a cosmopolitan metropolis, where geopolitical summits are held, where world economic forums are debated, where global European institutions are headquartered, and where citizens and diplomats fluently converse in more than three languages—English, French, Dutch or German, along with other non-official local flavors.

To this colorful collage, add the image of a Hadoop Summit Europe 2015 for big data developers, practitioners, industry experts, and entrepreneurs, who make a difference in the digital world, who fluently code in multiple programming languages—Java, Python, Scala, C++, Pig, SQL, or R—and innovate and incubate Apache projects.…

Big data continues to dominate the discussion as businesses both big and small seek to make sense of what exactly it is and, more importantly, what they should do about it. The three biggest challenges associated with big data investments are determining how to get value from data, defining the big data strategy, and obtaining the skills and capabilities needed to make sense of it in a meaningful way.

Join our webinar Thursday Nov.

Increasingly, companies around the world are adopting Apache Hadoop as a core component of their Modern Data Architecture (MDA) in order to collect, store, analyze and manipulate massive quantities of data on their own terms—regardless of the source of that data, how old it is, where it is stored, or under what format. Once they build their Modern Data Architecture, what is the best way for them to manage and monitor their Hadoop clusters?…

News of customer data breaches seems to hit the headlines every week and we know that attackers have become more sophisticated in their tactics. Organizations too must step up their capabilities and build robust, data-driven defense systems. Join us for a webinar on Nov. 12 to learn about the current threats against enterprises like yours, and how a Modern Data Architecture (MDA) with Hortonworks Data Platform (HDP) and Sqrrl Enterprise can enable intuitive exploration, discovery and pattern recognition over your big cyberdata.…

In part 1, Kenneth Peeples, JBoss technology evangelist and principal marketing manager for Data Virtualization and Fuse Service Works at Red Hat, gave us an overview of the Red Hat and Hortonworks webinar series and offered insights into JBoss Data Virtualization and HDP. He started with an overview of data virtualization with the Hortonworks Data Platform and went over the first use case, Sentiment and Sales Analysis. Today, he describes the three other use cases.…

Recently, the Oracle Data Integrator products were certified on the Hortonworks Data Platform version 2.1, and we’re delighted to be working more closely with Oracle engineering on these kinds of efforts. We’re happy to bring you this guest blog today, written by Alex Kotopoulis, Product Manager, Oracle Data Integration for Big Data, at Oracle, to discuss the recent integration and certification initiatives. You can learn more by joining our webinar on November 11; register here.…

Back in September, we presented a 3-part webinar series on our collaborations with Red Hat. Close to a thousand registrants and attendees participated and brought rich interaction to the series. The content included an overview of our strategic partnership, a couple of demos, and tutorials to get you started on your Big Data journey with Red Hat and Apache Hadoop.

In this blog, Kenneth Peeples, JBoss technology evangelist and principal marketing manager for Data Virtualization and Fuse Service Works at Red Hat, recaps the webinar series and offers insights into JBoss Data Virtualization and HDP.…