December 13, 2017

Top Hortonworks Blogs from 2017

As 2017 comes to an end, we can all (hopefully) take a quick breather and prepare for the new year. Whether this means going offline and digital-free, binge-watching the shows you have been stockpiling, or something in between, we hope you have the opportunity to do a little bit of reading.


To help you with your reading list, here are some of the top blogs from Hortonworks in 2017, in order of publication date:

  • Try Apache Spark 2.1 & Zeppelin in Hortonworks Data Cloud by Vinay Shukla
    Wanna try Spark 2.1 Now? Well, you are in luck… Hortonworks Data Cloud (“HDCloud”) for AWS gives you a quick way to launch a Spark cluster in the cloud. Read More
  • Machine Learning & its Impact on the Future for Insurance by Cindy Maike
    First and foremost, machine learning WILL change the way insurers do business. The insurance industry is founded on forecasting future events and estimating the value/impact of those events and has used established predictive modeling practices – especially in claims loss prediction and pricing – for some time now. Read More
  • A Reference Architecture for the Open Banking Standard… by Vamsi Chemitiganti
    Financial services firms specifically deal with manifold data types ranging from Customer Account data, Transaction Data, Wire Data, Trade Data, Customer Relationship Management (CRM), General Ledger and other systems supporting core banking functions. When one factors in social media feeds, mobile clients and other non-traditional data types, the challenge is not just one of data volumes but also variety and the need to draw conclusions from fast-moving data streams by commingling them with years of historical data. Read More
  • Announcing the General Availability of HDP 2.6 by Wei Wang
    We develop the entire Hortonworks Data Platform to ensure our customers not only can adopt the latest innovation from the broader open source community, but also enjoy some of the enterprise-ready and ease-of-use functionality packaged within HDP 2.6. Read More
    If you are interested in HDP 2.6, you will probably also want to see this blog:
    Announcing the General Availability of HDP 2.6.3 and Hortonworks DataPlane Service
  • Top 5 Performance Boosters with Apache Hive LLAP by Carter Shanklin
    Now that LLAP is generally available with HDP 2.6, let’s take some time to look at the top 5 performance boosters you’re missing out on if you’re not using LLAP. Read More
  • Integrate SparkR and R for Better Data Science Workflow by Yanbo Liang
    To address R’s scalability issue, the Spark community developed the SparkR package, which is based on a distributed data frame that enables structured data processing with a syntax familiar to R users. Read More
  • Livy: A REST Interface for Apache Spark by Saisai Shao
    In order to overcome the current shortcomings of executing Spark applications, and to introduce additional features, we introduce Livy – a REST-based Spark interface to run statements, jobs, and applications. Read More
  • Benchmark Apache HBase vs Apache Cassandra on SSD in a Cloud Environment by Will Xu
    As more and more workloads are being brought onto modern hardware in the cloud, it’s important for us to understand how to pick the best databases that can leverage the best hardware. Amazon has introduced instances with directly attached SSDs (solid-state drives). Both Apache HBase and Apache Cassandra are popular key-value databases. In this benchmark, we hope to learn more about how they leverage directly attached SSDs in a cloud environment. Read More
  • A Category Emerges: Introducing Hortonworks DataPlane Service by Scott Gnau
    Hortonworks DPS is a next-gen service to manage, govern and secure data and workloads across multiple sources (databases, EDWs, clusters, data lakes), types of data (at-rest, in-motion) and tiers (on-prem, multiple clouds, hybrid). It allows enterprises to focus on getting more value from data quicker by providing an intuitive experience for managing all data. Read More
  • Automated Validation for all of the Apache Hadoop Ecosystem by Ramya Sunil, Sunitha Velpula, Raja Aluri
    Unlike traditional enterprise software, we deal with an inflow of hundreds of Apache commits across 25+ projects in the Apache Hadoop ecosystem. The Apache community has a rich set of unit tests, which are continuously run (often for every commit) to catch regressions early. However, they are not always sufficient to assess the impact on integrated functionality. This is where having a robust, scalable and reliable testing infrastructure to validate the multi-layer stack becomes crucial. Read More

If you liked this end-of-year wrap-up, check out our Top 11 Customer Stories from 2017.

