August 15, 2018 | Matt Spillar | Hortonworks Case Study

Leveraging Diverse Datasets Across 65 Million Devices to Derive New Insights

August 14, 2018 | George Vetticaden | Thought Leadership

Addressing Kafka Blindness

August 10, 2018 | Roni Fontaine

Hadoop 3 Blog Series – Recap



We’ve just published our most recent case study! This story gives an in-depth look at how Pinsight Media is leveraging diverse datasets to drive new insights. Pinsight Media (Pinsight) is the mobile data company that uses verified, network-level data to fuel intelligent brand decisions. Its data management platform ingests over 60 terabytes of data coming from 65 million […]

In HDF 3.1, we announced support for Apache Kafka 1.0 with powerful integrations with Apache Ambari, Apache Ranger, and Apache Atlas. In the last 12 months, we have seen Kafka emerge as a key component in many of our customers' streaming architectures. A common architecture looks something like the following: Different customers or even different groups within […]

The Hortonworks HDP product and engineering teams are excited to share details on Apache Hadoop 3.x over a series of blogs. Please check out the blogs we have already published in our Apache Hadoop 3.x blog series, and stay tuned for many more. Hadoop 3.X Apache Hadoop 3.1.0 released. And a look back! Apache Hadoop 3.1 […]

This is the finale of the blog series (see part 1, part 2, part 3). Having discussed the problem domain, looked at the functional and architectural aspects, and prepared the environment, we are now ready to execute a few pricing calculations. SSH into the cluster gateway node and download the following from the repo: compute/compute-engine-spark-1.0.0.jar […]

This is the 3rd blog in the series (see part 1, part 2), where we will walk through the tech stack and prepare the environment. The entire infrastructure is provisioned on an OpenStack private cloud using Cloudbreak 2.7.0, which first automates the installation of Docker CE 18.03 on CentOS 7.4 VMs and then the installation of […]

Earlier this week, Hortonworks announced financial results for the second quarter of 2018. Momentum is high at the halfway point of the year, as our CEO Rob Bearden announced another record-breaking quarter! The Q2 revenue of $86.3 million marked a total revenue growth of 40 percent compared to the prior year. Q2 was also another quarter […]

This is the 2nd blog in a 4-part blog series (see part 1), where we will dive into representative pricing semantics and architectural aspects of a prototype implementation using HDP 3.0. Pricing Semantics: The engine leverages QuantLib, an open-source library for quantitative finance, to compute: Spot Price of a Forward Rate Agreement (FRA) using the […]

Introduction: Lifetime indicates the overall time spent by an application in YARN. The lifetime of an application is calculated from its start time to its finish time, including the actual run time as well as the wait time for resource allocation. Both users and administrators on the YARN system might occasionally be required to restrict the […]

This post was authored by Ram Venkatesh, Hortonworks VP, Engineering; and James Malone, Cloud Product Manager, Google. If you're looking for a fully managed cloud service for running data and analytics clusters, thanks in large part to the Apache Hadoop and Spark communities, you might very well look to Cloud Dataproc, which offers both long-running and […]

This is the 1st blog in a 4-part blog series where we will look at an architectural approach to implementing a distributed compute engine for pricing financial derivatives using Hortonworks Data Platform [HDP] 3.0. In this blog, we will discuss the problem domain and set the context before we zoom in on the functional and […]

The traditional retail business model is evolving rapidly right before our eyes, as early retail e-commerce sites once effectively emulated stores with the assumption that consumers would begin and end their shopping trip in that channel, just as they did in the store.  Fast forward to today, and all that has changed in the digital […]

The Bangalore Apache Hadoop Meetup group, with over 3,400 members who share common interests and ideas in the Hadoop ecosystem, brings together a community of practitioners and developers in Bangalore. The talks at this meetup cover a variety of topics related to the Hadoop ecosystem, such as Data Science workloads, Big Data-Driven […]

This blog is the first in a series of security-related blogs that we plan to publish in the near future. It's a myth that usability and security are mutually exclusive. In this blog, we'll try to dispel it in the context of Apache Knox. For those who are not familiar with Apache Knox, it is: […]

This blog is also co-authored by Zian Chen and Sunil Govindan from Hortonworks. Introduction – Apache Hadoop 3.1, YARN, & HDP 3.0: GPUs are increasingly becoming a key tool for many big data applications. Deep learning / machine learning, data analytics, genome sequencing, etc. all have applications that rely on GPUs for tractable performance. In many cases, […]

An Enterprise Data Warehouse (EDW) is traditionally used for generating reports and answering pre-defined queries, where workloads and service-level requirements are static. The drawback is that these platforms impose rigidity, because the schemas must be modeled in advance for the queries that are anticipated. Constrained by this limitation, users cannot freely explore and ask questions […]
