May 26, 2017 | Tom Hastain | Hortonworks Case Study

Precision Medicine: a 5 Million Person Case Study

May 26, 2017 | Carole Gum | Hortonworks Community Connection

Don’t miss the Business of Data at DataWorks Summit

May 26, 2017 | Anna Yong

Open Source Talent Powers Big Data Success

Viewing posts by: Carter Shanklin


Hive / Druid integration means Druid is BI-ready from your tool of choice. This is Part 3 of a three-part series on doing ultra-fast OLAP analytics with Apache Hive and Druid. Connect Tableau to Druid: Previously we talked about how the Hive/Druid integration delivers screaming-fast analytics, but there is another, even more powerful benefit to […]

The value of any data is proportional to the insights derived from it. With the Data Lake Architecture, all of the enterprise data is made available in one place. The key to driving insights from the Data Lake is Apache Spark & Apache Zeppelin. Both are key tools to drive Predictive Analytics and Machine Learning. […]

Simon Meredith, Chief Technology Officer – CSI, IBM Europe, explains the significance of IBM & Hortonworks working together in the era of Big Data. What is fuelling IBM’s commitment to Apache Hadoop and Spark? The pressures of day-to-day business are delaying companies doing more with their data. IBM’s commitment is to initiate, simplify […]

Destination Autonomous: The march towards autonomous vehicles continues to accelerate. While expert opinion differs on the specific timing and use cases that will emerge first, few deny that self-driving cars are in our future. Not surprisingly, when reviewing Big Data strategies with my automotive clients, discussions on data management strategies for autonomous driving research inevitably […]

The latest version of Hortonworks Data Platform (HDP) introduced a number of significant enhancements for our customers. For instance, HDP 2.6.0 now supports both Apache Spark™ 2.1 and Apache Hive™ 2.1 (LLAP™) as GA. Often customers store their data in Hive and analyze that data using both Hive and SparkSQL. An important requirement in this scenario […]

In Part 1 of this series, we discussed how data-in-motion solutions require both flow management and stream analytics capabilities. Also, we introduced an exciting new technology that Hortonworks is in the process of releasing that helps users build streaming analytics apps faster and caters to three different personas in the enterprise: app developer, operations teams and the […]

Thank you for reading our Data Lake 3.0 series! In part 1 of the series, we introduced what a Data Lake 3.0 is. In part 2 of the series, we talked about how a multi-colored YARN will play a critical role in building a successful Data Lake 3.0. In part 3 of the series, […]

As part of the product management leadership team at Hortonworks, there is nothing more valuable than talking directly with customers and learning about their successes, challenges, and struggles implementing their big data and analytics use cases with HDP and HDF. These conversations provide more insight than any analyst report, white paper, or market study. In […]

Carolinas HealthCare System is one of the leading healthcare organizations in the Southeast and one of the most comprehensive, not-for-profit systems in the country. Our more than 900 care locations include: academic medical centers, hospitals, freestanding emergency departments, healthcare pavilions, physician practices, outpatient surgical centers, laboratories, rehabilitation centers, home health agencies, nursing homes, hospice and […]

Last week, we hosted a webinar, Combating Phishing Attacks: How Big Data Helps Detect Impersonators, where our audience confirmed that it really can take months, or even a year, to investigate the repercussions of a breach such as a phishing attack. Due to the complex and dynamic nature of modern attack vectors, we discussed how […]

With the San Jose DataWorks Summit (June 13-15) just two months away, we’re busy finalizing the lineup of an impressive array of speakers and business use cases. This year our Enterprise Adoption Track will feature Jay Etchings, Director of Operations for Research Computing at Arizona State University. In February we announced Jay’s new book, “Strategies in Biomedical Data […]

Apache Spark is a powerful framework for data processing and analysis. Spark provides two modes for data exploration: interactive, provided by the spark-shell, pySpark, and SparkR REPLs; and batch, using spark-submit to submit a Spark application to a cluster without interaction during the run. While these two modes look different on the surface, deep down they […]

You have heard about Big Data for a long time, and how companies that use Big Data as part of their business decision making process experience significantly higher profitability than their competition. Now that your company is ready to embark on its first Apache Hadoop® journey there are important lessons to be learned. Read on […]

Hive View 2.0 is New in Apache Ambari 2.5: Ambari’s Hive View gives analysts and DBAs a convenient web interface to Apache Hive which allows SQL analytics, data management and performance diagnostics. Ambari 2.5 introduces Hive View 2.0 with a brand new user experience plus a slew of great new tools to help DBAs run […]

In 2016, we published the second version, v1.0.1, of Spark HBase Connector (SHC). In this blog, we will go through the major features we have implemented this year. Support Phoenix coder: SHC can be used to write data out to an HBase cluster for further downstream processing. It supports Avro serialization for input and output data […]