September 22, 2017 | Vinod Kumar Vavilapalli | From the Dev Team

YInception: A YARN based container cloud and how we certify Hadoop on Hadoop

September 21, 2017 | Carter Shanklin | Hadoop Ecosystem, From the Dev Team

3x Faster Interactive Query with Apache Hive LLAP

September 20, 2017 | Vinay Shukla | From the Dev Team

Data Science for the Modern Data Architecture

Viewing posts: Hadoop Insights

Thank you for reading part 1 of a 2-part series on how to update Hive tables the easy way. This is part 2 of the series. Managing Slowly Changing Dimensions: In Part 1, we showed how easy it is to update data in Hive using SQL MERGE, UPDATE and DELETE. Let’s take things up a notch […]

So you might ask: why run data science in the cloud? Just as Apache Hadoop allows data scientists to work with any type of data, both Hadoop and the cloud make it possible to work with any amount of data by providing ready access to unlimited storage and capacity. Hadoop allows the use of commoditized […]

Traditional data flow providers have proven to be expensive, inflexible, and unable to meet the demands of many real-time streaming data sources. For a modern enterprise looking to unlock the value of their existing data assets and integrate the latest streaming analytics, it’s vital to choose a data flow provider that’s flexible enough to handle […]

What’s New in Ambari 2.5: With Ambari 2.5, our focus was to continue to improve Hadoop operators’ day-to-day cluster management experience. The community’s goal with Apache Ambari is to provide the most intelligent, easy-to-use, and extensible Hadoop operations experience, and with 2.5 we’ve made important improvements in the following areas: Service Management, Log Management, Configuration […]

LLAP wins the fastest execution among the SQL engines! Comcast is one of the nation’s leading providers of communications, entertainment and cable products and services. Headquartered in Philadelphia, PA, it employs over 100,000 people nationwide whose goal is to deliver the highest level of service and improve the customer experience. Comcast decided to run what […]

The path to a successful big data implementation isn’t straightforward. There are many decisions and considerations along the way, from technology to people to process, and all of these need to come together for a successful outcome. Hortonworks is in the business of helping enterprises achieve their desired business outcomes with big data as effectively […]

Recently Shaun Connolly (of Hortonworks) and Tony Baer (of Ovum) presented “Get Started with Big Data in the Cloud”.  During this webinar, they discussed the opportunity to take advantage of the cloud for big data workloads. As we see an increase in data analytics in the cloud, we are also seeing an increase in data […]

Apache Spark 2.1 improves Structured Streaming and Machine Learning. Structured Streaming: Kafka 0.10 support, metrics and stability improvements. Machine Learning: SparkR improvements, including new ML algorithms for LDA, random forests, GMM, etc. The recent release of Hortonworks Data Platform 2.6 (“HDP 2.6”) includes Apache Spark 2.1. And Hortonworks Data Cloud (“HDCloud”) for AWS gives […]

Last week, we hosted the “Get Started with Big Data in the Cloud ASAP” webinar with speakers Shaun Connolly of Hortonworks and Tony Baer of Ovum. The webinar provided a very informative overview of the challenges enterprises are facing with the overwhelming number of choices available in the cloud. It covered how businesses can get over the […]

Expressway Authorities do Hadoop: Every day, Expressway Authorities must make critical decisions, often without sufficiently accurate and transparent data. At the same time, they may be losing revenue due to reporting latency and the inability to respond when toll plaza sensors are down. Hortonworks DataFlow (HDF™) and Hortonworks Data Platform (HDP®) can help resolve these […]

The value of any data is proportional to the insights derived from it. With the Data Lake Architecture, all of the enterprise data is made available in one place. The key to driving insights from the Data Lake is Apache Spark & Apache Zeppelin. Both are key tools to drive Predictive Analytics and Machine Learning. […]

Destination Autonomous: The march towards autonomous vehicles continues to accelerate. While expert opinion differs on the specific timing and use cases that will emerge first, few deny that self-driving cars are in our future. Not surprisingly, when reviewing Big Data strategies with my automotive clients, discussions on data management strategies for autonomous driving research inevitably […]

The latest version of Hortonworks Data Platform (HDP) introduced a number of significant enhancements for our customers. For instance, HDP 2.6.0 now supports both Apache Spark™ 2.1 and Apache Hive™ 2.1 (LLAP™) as GA. Often customers store their data in Hive and analyze that data using both Hive and SparkSQL. An important requirement in this scenario […]

Hortonworks University is offering three new training classes in the month of June. These classes are currently delivered only by Hortonworks, in a virtual format, so no travel is required. You can enroll online via credit card (US and CAD only) by following the links below. HDP Developer Quick […]

Hortonworks continues to advance the Hortonworks Data Platform (HDP) as an integrated portfolio of enterprise security and governance products for big data. By building security and data governance into the platform we ensure that these capabilities are administered consistently across all the components or data engines, and when new engines are added to the platform they inherit […]