The Hortonworks Blog

In this Hortonworks partner guest blog, Abhimanyu Aditya, Senior Product Manager and co-founder at Skytree, explains how Skytree APIs solve challenges facing data engineers and simplify data preparation and data transformation using Apache Spark on YARN with Hortonworks Data Platform (HDP).

Challenges Facing Data Engineers and Data Scientists

Machine learning as a technology can be challenging. It is difficult to create, understand and deploy machine learning models. Even before the modeling process can begin, the data needs to be prepared for machine learning, and modern data scientists, developers, hackers, Ph.D.s, analysts and domain experts spend a significant amount of time and effort doing this.…

Every day, more and more new devices (smartphones, sensors, wearables, tablets, home appliances) join the “Internet of Things.” Cisco predicts that by 2020, there will be 50 billion devices connected to the Internet of Things. Naturally, they will all emit streams of data at short intervals. These data streams will have to be stored, processed, and analyzed in real time.

Apache Storm is the scalable, fault-tolerant realtime distributed processing engine that allows you to handle massive streams of data in realtime, in parallel, and at scale.…

It’s been 20 years since I was “the new guy.”

Hello friends and colleagues. I wanted to share some thoughts after my first 90 days at Hortonworks. It’s been a thrill ride, to say the least. There is all of the normal new guy / first impression stuff, and for those of you who know me, you know I am very sensitive to all that!

Working with our founders and engineering team has been a blast.…

Hortonworks is hitting the road again with another worldwide roadshow. The Open Enterprise Hadoop roadshow will offer business and technical content, a chance to meet and speak with people who are driving innovation in Hadoop, and a valuable opportunity to network with others who are using Apache Hadoop to transform their business.

The format of these seminars is a full day of Hadoop content. We’ll open with a general session discussing technology innovations around Hadoop and how organizations are transforming their businesses with Hadoop.…

In this guest blog, Murthy Mathiprakasam, principal product marketing manager at Informatica, tells us more about the partnership with Hortonworks and how the two companies optimize the entire big data supply chain on Hadoop, turning data into actionable information to drive business value.

There has never been a more exciting time in the world of data management. With a growing number and type of data consumers seeking access to growing data volumes and data varieties, the role of data in organizational success has never been more critical.…

In this guest blog, Oliver Chiu, Microsoft’s product marketing manager for Hadoop/Big Data and Data Warehousing, explains how customers can benefit from deploying Apache Spark and HDP on Azure HDInsight for their enterprise and mission-critical big data jobs.

On July 10, Microsoft announced the public preview availability of Apache Spark for Azure HDInsight.

Azure HDInsight is Microsoft’s managed Hadoop-as-a-service offering. It takes the Hortonworks Data Platform (HDP) and architects it for the cloud.…

Open Enterprise Hadoop is already transforming many industries, accelerating Big Data projects to help businesses translate information into competitive advantage.

I’d like to share a real-world example from the digital marketing powerhouse Webtrends, who’ve used the Hortonworks solution to launch a powerful new product line. First, a little context.

Everywhere you look, you can find companies using Open Enterprise Hadoop in large-scale projects to enable deep data discovery, to capture a single view of customers across multiple data sets, and to help data scientists perform predictive analytics.…

This is a guest blog from Stefan Kupstaitis-Dunkler, Accenture Technology Solutions GmbH.

I’ve been working at Accenture for almost a year, and last month I was invited to attend the partner masterclass on HDP 2.3 Security. The classroom setting was a great forum for interactive discussions and a showcase of the security capabilities in the newest version of the Hortonworks Hadoop distribution, HDP 2.3.

Sean Roberts, Hortonworks Solution Engineer and Hadoop operations expert in EMEA, guided the attendees through a demonstration of what Hortonworks Data Platform does to integrate the security aspects into Hadoop.…

On August 19th, Dr. Alexander Gray, CTO and Co-Founder, Skytree, and Cindy Maike, General Manager, Insurance at Hortonworks, will be joining Patricia Harman, Editor-in-Chief at Claims Magazine, for a Skytree webinar on “Driving profitability and lowering costs using Machine Learning on Hadoop.”

Register for the Webinar on August 19th at 10am Pacific/1pm Eastern time

In this blog, Alex and Cindy exchange perspectives on what machine learning means for insurers, and where opportunities are for its application.…

Today, we were pleased to announce the selection of Hortonworks to the EMC Select Program. We’re delighted to host this guest blog from Ryan Peterson, the Chief Solutions Strategist at EMC. Follow Ryan @BigDataRyan.

Today, Hortonworks and EMC announced a broadened relationship. Hortonworks announced that EMC has chosen Hortonworks as an EMC Select Partner, which means that EMC will now be able to resell the Hortonworks Data Platform (HDP).…

With new types of data driving growth and competitive advantage as never before, it’s crucial to implement the right solution for your data lake. Power and performance are important, of course, as are security and efficiency, but more fundamentally, you also need to be sure it’s going to work as advertised. That’s where the EMC Business Partner Program for Technology comes in, helping customers find industry-leading technologies that are optimized, validated and certified with EMC technologies.…

Last week, on July 22nd, we announced the general availability of HDP 2.3. In this three-part blog series, the first blog summarized the key innovations in the release (ease of use and enterprise readiness, and how those are helping deliver transformational outcomes), while the second blog focused on data access innovation. In this final part, we explain cloud provisioning, proactive support, and other general improvements across the platform.

Automated Provisioning with Cloudbreak

Since Hortonworks’ acquisition of SequenceIQ, the integrated team has been working hard to complete the deployment automation for public clouds including Microsoft Azure, Amazon EC2, and Google Cloud.…

Along with the Hortonworks Oil and Gas team, I have been working closely with Laurence Sones, senior petrophysicist, to understand how Hadoop-based Data Discovery is enabling Geologic and Geophysical (G&G) teams to improve decision-making across their assets. What follows is a Q&A session with Laurence discussing his perspectives on data discovery.

Kohlleffel: Laurence, you have a wealth of experience in the oil and gas industry. Please discuss your background and some of the roles that you have taken on.…

Communication service providers aim to enhance customer experience and build strong and long-lasting relationships with their customers. This has become increasingly difficult as customer interactions now occur across many channels. Hence, it’s important to understand customer behavior across all channels to create the best experience for each individual. Join us on August 5 for a webinar with Hortonworks and Apigee to learn more.

In today’s guest blog post, Sanjay Kumar, General Manager, Telecommunications at Hortonworks, and Sanjeev Srivastav, Vice President, Data Strategy at Apigee, discuss how service providers can capture and visualize customer behavior as a graph connecting the interaction points such as IVR, chat and call events, and combine it with network data to predict future call or chat patterns.…

On August 4th at 10:00 am PST, Eric Thorsen, General Manager Retail/CP at Hortonworks, and Krishnan Parasuraman, VP Business Development at Splice Machine, will be talking about how Hadoop can be leveraged as a scale-out relational database to be the System of Record and power mission-critical applications.

In this blog, they provide answers to some of the most frequently asked questions they have heard on the topic.

  • Hadoop is primarily known for running batch-based, analytic workloads.