The Hortonworks Blog

Posts categorized by: Apache Hadoop

Last week’s Hortonworks webinar “What’s Possible with a Modern Data Architecture?” featured Greg Girard, program director for omni-channel analytics strategies at IDC Retail Insights, and Mark Ledbetter, vice president for industry solutions at Hortonworks. Greg provides targeted, fact-based guidance to retailers for the application of analytics across the enterprise. Mark has more than twenty-five years of experience in the software industry with a focus on retail and supply chains.

Many of Greg and Mark’s thoughts from the webinar echo topics also covered in the recent Hortonworks white paper “The Retail Sector Boosts Sales with Hadoop.”

Greg discussed the most significant drivers of big data initiatives in the retail industry, including customer acquisition, pricing strategies, and competitive intelligence.…

At Hortonworks, we are always watching emerging trends in the datacenter to find opportunities for deeper ecosystem integration with Apache Hadoop in simple and intuitive ways. We first partnered with OpenShift by Red Hat earlier this year when we made it possible to call out to Hadoop services from OpenShift via cartridges. You can read more about that solution here. As Enterprise Cloud (e.g. PaaS) offerings have matured to support a broad set of workloads, we’ve had a number of our customers ask about how Hadoop-centered Big Data and PaaS initiatives could work together – particularly in light of Apache Hadoop YARN being the multi-workload resource manager for batch, interactive and real-time workloads on Hadoop.…

BlueData™, a new Hortonworks Certified Technology Partner, is a pioneer in Big Data private clouds that help enterprises create a self-service cloud experience on premises. BlueData was recently certified with Hortonworks Data Platform (HDP).

BlueData’s Director of Business Development, Rashmi Gopinath, describes the interworking and advantages of the BlueData EPIC™ solution with HDP.

Last week BlueData announced the launch of EPIC™ Enterprise, a Big Data private cloud solution, and its subsequent certification on Hortonworks Data Platform.…

Hortonworks’ strategy, since our inception, has been extremely consistent: enable a modern data architecture whereby users have the ability to store data in a single location and interact with it in multiple ways, using the right data processing engine at the right time. At the core of that strategy is YARN, which, as a part of Apache Hadoop, allows multiple data processing engines to interact with data stored in a single platform, unlocking an entirely new approach to analytics.…

Concurrent Inc. is a Hortonworks Technology Partner and recently announced that Cascading 3.0 now supports Apache Tez as an application runtime. Cascading is a powerful development framework for building enterprise data applications on Hadoop and is one of the most widely deployed technologies for data applications, with more than 175,000 user downloads a month. Used by thousands of businesses including eBay, Etsy, The Climate Corp and Twitter, Cascading is the de facto standard in data application development on Hadoop.…

Internet of Things (IoT) Potential and Process

It may seem obvious (or inevitable), but many companies are embracing the Internet of Things (IoT), and for good reasons, notes Forbes’ Mike Kavis. First, McKinsey Global Institute reports that IoT business will reach $6.2 trillion in revenue by 2025. Second, more and more objects are becoming embedded with sensors that communicate real-time data to data centers’ networks for processing, explain McKinsey’s Chui, Loffler, and Roberts.…

On September 17, the Apache Software Foundation (ASF) voted to graduate Apache Storm to a top-level project (TLP). This is a major step forward for the project and reflects the momentum built by a broad community of developers from not only Hortonworks, but also Yahoo!, Alibaba, Twitter, Microsoft and many other companies.

What is Apache Storm and why is it useful?

Apache Storm is a distributed, fault-tolerant, and highly scalable platform for processing streaming data.…

ITC Infotech is a Hortonworks consulting and integration partner and provides IT services and solutions to leading global customers. The company addresses a wide range of customer challenges through innovative IT solutions.

Today, guest blogger Aditya Agrawal, head of Advance technology, ZLabs at ITC Infotech, focuses on ITC’s RADAR framework for the retail industry.

Storm and Solr are excellent examples of new Hadoop tools that enable use cases that were pretty hard to implement before.…

The Apache Tez community is thrilled to announce the release of version 0.5 of the project. We’re referring to this as “the developer release” because it’s all about developers. The community focused on meeting the key needs of developers using Tez to create their applications and engines. Tez 0.5 includes clean and intuitive developer APIs, easy debugging, extensive documentation and deployment with rolling upgrades.

Apache Hadoop YARN paved the way for Apache Tez.…

Summary

This blog covers how recent developments have made it easy to use ORCFile from Cascading or Apache Crunch, and how doing so can accelerate data processing by more than 5x. Code samples are provided so that you can start integrating ORCFile into your Cascading or Crunch projects today.

What are Cascading and Apache Crunch?

Cascading and Apache Crunch are high-level frameworks that make it easy to process large amounts of data in distributed clusters.…

Hortonworks is committed to collaborating with ISVs and partners to onboard their applications to YARN and Hadoop. As part of the YARN Webinar Series, we have introduced different methods to help you integrate your applications with YARN: Native YARN integration, Slider and Tez. As part of this series, we now offer the opportunity to learn Scalding with a guest speaker from Twitter, who will talk about simplifying application development on Apache Hadoop and YARN.…

Novetta is a new Hortonworks Technology Partner and recently achieved HDP 2.1 Certification and YARN Ready status. In this guest blog, Jennifer Reed, director of product management at Novetta, talks about Novetta’s YARN Ready entity resolution and relationship dimension-building application.

The New Era of Analytics

Thomas Davenport, in his keynote at the Hadoop Summit San Jose 2014, said that big data analytics has entered a new phase: from Analytics 2.0 to 3.0.…

StackIQ, a Hortonworks technology partner, offers a comprehensive software suite that automates the deployment, provisioning, and management of Big Infrastructure. In his second guest blog, Anoop Rajendra (@anoop_r), a Senior Software Developer at StackIQ, gives instructions for using the StackIQ Command Line Interface (CLI) to deploy a Hortonworks Data Platform (HDP) cluster.

In a previous blog post, we discussed how StackIQ’s Cluster Manager automates the installation and configuration of an Apache Ambari server.…

Speed, Scale, and SQL Semantics

Since its inception and graduation as a Top-Level Project (TLP) from the Apache Software Foundation (ASF) in September 2010, Apache Hive has been steadily improving in speed, scale, and SQL semantics to meet enterprise requirements for both interactive and batch queries at Hadoop scale.

It has become a de facto standard for SQL queries over petabytes of data stored in Hadoop. It is a compliant SQL engine that offers developers a comprehensive and familiar set of SQL semantics for Apache Hadoop.…

In this partner guest blog, Microsoft Principal Software Development Engineer Eric Hanson weighs in on how Stinger.next will benefit HDInsight customers. Coming from someone who worked on Microsoft SQL Server for years and is a committer to Apache Hive, Eric explains that Stinger.next initiatives and capabilities are essential to take Hive to the next level.

Apache Hive is one of the most-used features of Microsoft’s cloud Hadoop service, Azure HDInsight. So our HDInsight customers of course will enjoy new capabilities that make Hive faster.…
