The Hortonworks Blog

Posts categorized by: Innovation from Hortonworks

At Hortonworks, we are always watching emerging trends in the datacenter to find opportunities for deeper ecosystem integration with Apache Hadoop in simple and intuitive ways. We first partnered with OpenShift by Red Hat earlier this year when we made it possible to call out to Hadoop services from OpenShift via cartridges. You can read more about that solution here. As Enterprise Cloud (e.g. PaaS) offerings have matured to support a broad set of workloads, we’ve had a number of our customers ask about how Hadoop-centered Big Data and PaaS initiatives could work together – particularly in light of Apache Hadoop YARN being the multi-workload resource manager for batch, interactive and real-time workloads on Hadoop.…

BlueData™, a new Hortonworks Certified Technology Partner, is a pioneer in Big Data private clouds that help enterprises create a self-service cloud experience on premises. BlueData was recently certified with Hortonworks Data Platform (HDP).

BlueData’s Director of Business Development, Rashmi Gopinath, describes the interworking and advantages of the BlueData EPIC™ solution with HDP.

Last week BlueData announced the launch of EPIC™ Enterprise, a Big Data private cloud solution, and its subsequent certification on Hortonworks Data Platform.…

Hortonworks’ strategy, since our inception, has been extremely consistent: enable a modern data architecture whereby users have the ability to store data in a single location and interact with it in multiple ways – using the right data processing engine at the right time. At the core of that strategy is YARN, which, as a part of Apache Hadoop, allows multiple data processing engines to interact with data stored in a single platform, unlocking an entirely new approach to analytics.…

As more companies turn to Hadoop as a crucial data platform, we are seeing security considerations play a much bigger role. Dataguise DgSecure works in concert with the Hortonworks Data Platform (HDP) to bring enterprise-grade security and insight to Hadoop deployments. Data governance professionals can apply critical security features such as centrally managed authorization and audit, as well as sensitive data discovery, data-centric protection and reporting, to their Hadoop deployments.…

Concurrent Inc. is a Hortonworks Technology Partner and recently announced that Cascading 3.0 now supports Apache Tez as an application runtime. Cascading is a powerful development framework for building enterprise data applications on Hadoop and is one of the most widely deployed technologies for data applications, with more than 175,000 user downloads a month. Used by thousands of businesses including eBay, Etsy, The Climate Corp and Twitter, Cascading is the de facto standard in data application development on Hadoop.…

Internet of Things (IoT) Potential and Process

It may seem obvious (or inevitable), but many companies are embracing the Internet of Things (IoT), and for good reasons, notes Forbes’ Mike Kavis. First, McKinsey Global Institute reports that IoT business will reach $6.2 trillion in revenue by 2025. Second, more and more objects are becoming embedded with sensors that communicate real-time data to data centers’ networks for processing, explain McKinsey’s Chui, Loffler, and Roberts.…

On September 17, the Apache Software Foundation (ASF) voted to graduate Apache Storm to a top-level project (TLP). This is a major step forward for the project and reflects the momentum built by a broad community of developers from not only Hortonworks, but also Yahoo!, Alibaba, Twitter, Microsoft and many other companies.

What is Apache Storm and why is it useful?

Apache Storm is a distributed, fault tolerant, and highly scalable platform for processing streaming data.…

The Apache Tez community is thrilled to announce the release of version 0.5 of the project. We’re referring to this as “the developer release” because it’s all about developers. The community focused on meeting the key needs of developers using Tez to create their applications and engines. Tez 0.5 includes clean and intuitive developer APIs, easy debugging, extensive documentation and deployment with rolling upgrades.

Apache Hadoop YARN paved the way for Apache Tez.…

Summary

This blog covers how recent developments have made it easy to use ORCFile from Cascading or Apache Crunch, and shows that doing so can accelerate data processing by more than 5x. Code samples are provided so that you can start integrating ORCFile into your Cascading or Crunch projects today.

What are Cascading and Apache Crunch?

Cascading and Apache Crunch are high-level frameworks that make it easy to process large amounts of data in distributed clusters.…

Hortonworks is committed to collaborating with ISVs and partners to onboard their applications to YARN and Hadoop. As part of the YARN Webinar Series, we have introduced different methods to help you integrate your applications with YARN: Native YARN integration, Slider and Tez. As part of this series, we now offer the opportunity to learn Scalding, with a guest speaker from Twitter, who will talk about simplifying application development on Apache Hadoop and YARN.…

Novetta is a new Hortonworks Technology Partner and recently achieved HDP 2.1 Certification and YARN Ready status. In this guest blog, Jennifer Reed, director of product management at Novetta, talks about Novetta’s YARN Ready entity resolution and relationship dimension-building application.

The New Era of Analytics

Thomas Davenport, in his keynote at the Hadoop Summit San Jose 2014, said that big data analytics has entered a new phase: from Analytics 2.0 to 3.0.…

Modern retailers collect data from a multitude of consumer engagement channels, including point of sale systems, the web, mobile applications, social media, and more. They hope to use this data to derive greater customer insights, promote increased brand engagement and loyalty, optimize pricing and promotions, streamline the supply chain, and enhance their business models.

Data from the retailer’s transactional systems has historically been stored in an enterprise data warehouse (EDW) or other database, but these traditional data repositories are not well suited for the newer, unstructured data types like log files, social media updates and information from in-store sensors.…

StackIQ, a Hortonworks technology partner, offers a comprehensive software suite that automates the deployment, provisioning, and management of Big Infrastructure. In his second guest blog, Anoop Rajendra (@anoop_r), a Senior Software Developer at StackIQ, gives instructions for using the StackIQ Command Line Interface (CLI) to deploy a Hortonworks Data Platform (HDP) cluster.

In a previous blog post, we discussed how StackIQ’s Cluster Manager automates the installation and configuration of an Apache Ambari server.…

Thanks to all who joined us on our Hortonworks/Voltage webinar, “Securing Hadoop: What are Your Options?” For those who couldn’t attend, we’re sorry we missed you. We’ve included a link to the webinar recording below; please listen in!

On the webinar, Hortonworks’ Vinod Nair presented the recently announced Apache Argus incubator: a central policy administration framework across security requirements for authentication, authorization, auditing and data protection. Sudeep Venkatesh, of Voltage Security, defined data-centric protection technologies that easily integrate with Hive, Sqoop, MapReduce and other Hadoop interfaces.…

Hortonworks and Informatica have teamed up to provide the data systems and tools that make up the foundation of the modern data architecture. Today, Scott Hedrick, Director of Big Data Partnerships at Informatica, tells us more about the brand new Informatica Big Data Edition Trial Sandbox for Hortonworks.

With the help of our friends at Hortonworks, the Informatica Big Data team has preinstalled a 60-day trial version of the Informatica Big Data Edition into the Hortonworks Sandbox.…
