Category Archives: Hortonworks Topics


Teradata Aster & Hortonworks Webinar on Thursday

I wanted to draw your attention to a Webinar taking place this Thursday at 1pm EDT, 10am PDT. “Back to the Future – MapReduce, Hadoop and the Data Scientist” will highlight the benefits of Apache Hadoop and the role that data scientists are playing in big data. The speakers include:

  • Colin White – Founder of BI Research, a leading research, education and consulting firm helping companies understand and benefit from evolving and leading edge technologies in the areas of business intelligence and data management.
  • Tasso Argyros – Co-President of Teradata Aster
  • Ari Zilka – Chief Products Officer for Hortonworks

Among the topics discussed during this free Webinar are:

  • MapReduce for the data scientist: Hadoop/Hive and RDBMS approaches
  • Back to the future: file systems vs. database systems
  • Hadoop and RDBMS coexistence strategies
  • Bridging the gap: new approaches for analyzing data using Hadoop

This promises to be a very interesting and informative presentation, so please register today.

~ Lisa Sensmeier

Introducing Hortonworks Data Platform v1.0

I wanted to take this opportunity to share some important news. Today, Hortonworks announced version 1.0 of the Hortonworks Data Platform, a 100% open source data management platform based on Apache Hadoop. We believe strongly that Apache Hadoop, and therefore Hortonworks Data Platform, will become the foundation for the next generation enterprise data architecture, helping companies to load, store, process, manage and ultimately benefit from the growing volume and variety of data entering and flowing throughout their organizations. The imminent release of Hortonworks Data Platform v1.0 represents a major step forward for achieving this vision.

You can read the full press release here. You can also read what many of our partners have to say about this announcement here. We were extremely pleased that industry leaders such as Attunity, Dataguise, Datameer, Karmasphere, Kognitio, MarkLogic, Microsoft, NetApp, StackIQ, Syncsort, Talend, 10gen, Teradata and VMware all expressed their support and excitement for Hortonworks Data Platform.

Those who have followed Hortonworks since our initial launch already know that we are absolutely committed to open source and the Apache Software Foundation. You will be glad to know that our commitment remains the same today. We don’t hold anything back. No proprietary code is being developed at Hortonworks.

An Advance Look at Hadoop Summit

Hadoop Summit is just around the corner, and by that I mean next week! There is still time to register for the conference, but please do it soon as the conference is filling up quickly. Today is also the last day on which online registration will remain open. After today, you will need to register on-site at the conference itself.

This year’s Hadoop Summit conference, now in its fifth year, promises to be the biggest and best yet. In fact, there are already more people registered for Hadoop Summit 2012 than for any other Hadoop conference ever!

I wanted to take this opportunity to share some of the highlights for next week’s conference:

Geoffrey Moore and Other Compelling Keynote Speakers:

Geoffrey Moore, author of “Crossing the Chasm” and “Escape Velocity”, will share his views on “Digitizing the World, the Driving Force Behind Apache Hadoop’s Adoption Life Cycle”. You will also hear from other industry luminaries, who will share their vision for where Apache Hadoop is going and how it is destined to become the foundation for the next generation enterprise data platform.

Balancing Community Innovation and Enterprise Stability

Having worked at JBoss and Red Hat from 2004 to 2008 and SpringSource and VMware from 2008 to 2011, I’ve been focused on the world of open source software for a long while. I’ve been blessed to be able to serve enterprise customer needs with high quality open source software such as JBoss Application Server, Hibernate, Drools, Apache Web Server, Apache Tomcat, Spring … and now Apache Hadoop.

As specific open source technologies mature and their use becomes mainstream, it becomes increasingly important to understand and communicate the balancing act that needs to happen between community innovation and enterprise stability.

Community innovation needs to have a fast pace, where “ship early and often” is a key tenet. Open source projects need to visibly improve and keep innovating if they are to attract a vibrant following. As an open source project’s community grows, its members will expect big improvements and will be fine with early, buggy releases, etc. After all, that’s part of the process.

Big Data Refinery Fuels Next-Generation Data Architecture

Since joining Hortonworks at the beginning of the year, a question I’ve heard over and over again is “What is Apache Hadoop and what is it used for?”

There’s clearly a lot of hype [and confusion] in this emerging Big Data market, and it feels as if each new technology, as well as each existing one, is pushing the meme of “all your data are belong to us.” It is great to see the wave of innovation occurring across the landscape of SQL, NoSQL, NewSQL, EDW, MPP DBMS, Data Marts, and Apache Hadoop (to name just a few), but enterprises and the market in general can use a healthy dose of clarity on just how to use and interconnect these various technologies in ways that benefit the business.

In my post entitled 7 Key Drivers for the Big Data Market, I asserted that the Big Data movement is not only about the classic world of transactions, but it factors in the new(er) worlds of interactions and observations. This new world brings with it a wide range of multi-structured data sources that are forcing a new way of looking at things.

7 Key Drivers for the Big Data Market

I attended the Goldman Sachs Cloud Conference and participated on a panel focused on “Data: The New Competitive Advantage”. The panel covered a wide range of questions, but kicked off with two basic ones:

“What is Big Data?” and “What are the drivers behind the Big Data market?”

While most definitions of Big Data focus on the new forms of unstructured data flowing through businesses with new levels of “volume, velocity, variety, and complexity”, I tend to answer the question using a simple equation:

Big Data = Transactions + Interactions + Observations

The following graphic illustrates what I mean:

Executive Video Series: Introduction to HCatalog

We just added a video to the Hortonworks Executive Video library that features Alan Gates, Hortonworks co-founder and Apache PMC member. In this video, Alan discusses HCatalog, one of the most compelling projects in the Apache Hadoop ecosystem.

HCatalog is a metadata and table management system that provides a consistent data model and schema for users of tools such as MapReduce, Hive and Pig. When you consider that there are often users accessing Hadoop clusters using different tools that independently don’t agree on schema, data types, how and where data is stored, etc., then you can understand the value of having a tool such as HCatalog.

In this video, Alan does a good job of not only explaining the role of HCatalog, but also laying out the future direction of the project. He talks about improving the integration with HBase, improving information lifecycle management and expanding the HCatalog data model to address the challenges of unstructured data.

Executive Video Series: Apache Hadoop and Next Generation MapReduce

The third installment of the Hortonworks executive video series features Arun C. Murthy, co-founder of Hortonworks and VP of Apache Hadoop for the Apache Software Foundation. In this video, Arun shares his view of the power of Apache Hadoop and provides some insight into the future direction of MapReduce, including the ability to support alternate programming paradigms.

Executive Video Series: Overview of Hortonworks Data Platform

We just released the second video in the Hortonworks Executive Series. This one features Matt Foley, Test and Release Engineering Manager for Hortonworks.

In this video, Matt provides an overview of Hortonworks Data Platform (HDP), including a summary of the Apache Hadoop components included in the distribution and the testing involved in the release process. Matt also provides an overview of Apache Ambari, an open source project that is adding monitoring and management capabilities to Apache Hadoop.

Hortonworks Welcomes Citrix and CloudStack to the Apache Community

We are pleased to support today’s announcement from Citrix that they have contributed CloudStack to the Apache community. For those new to CloudStack, it is open source cloud computing software that helps organizations build and manage cloud infrastructures. It is similar to the Amazon Web Services EC2 environment, except that it enables organizations to build public, private or hybrid cloud environments using their own pooled computing resources.

Citrix announced today that they were reaffirming their commitment to open source by working with the Apache Software Foundation to make CloudStack 3 an Apache project, released under the Apache License 2.0. This is yet further acknowledgement that Apache is the logical home for open source projects that are transforming the enterprise software industry. As a Gold Sponsor of the ASF and major contributor to Apache projects, Hortonworks is pleased that leading vendors such as Citrix are recognizing the value that Apache can provide in terms of accelerating development and innovation and driving adoption as the preferred destination for enterprise-class open source software.

Executive Video Series: The Hortonworks Vision for Apache Hadoop

I’m pleased to announce the first in a series of videos featuring Hortonworks founders and executives sharing their thoughts on how Apache Hadoop is being extended to become the next generation enterprise data platform. Over the coming weeks and months, you will be hearing from folks such as Matt Foley, Arun Murthy, Sanjay Radia and Alan Gates, just to name a few.

The first video features Shaun Connolly, Hortonworks VP of Corporate Strategy, talking about the Hortonworks vision for Apache Hadoop. In this video, Shaun does a nice job of outlining our vision that Apache Hadoop will process or touch half of the world’s data by 2015. How is Hortonworks helping to make this happen? Click on the video image below to find out.

The Importance of the Teradata & Hortonworks Partnership

Hortonworks and Teradata announced a strategic relationship today that includes joint go-to-market and development work to more closely integrate Hortonworks Data Platform with the Teradata Analytical Ecosystem. I wanted to take the opportunity to highlight this partnership and share my thoughts on why it is an important milestone for Hortonworks and the larger Apache Hadoop community.

As somebody who has been heavily involved in the development of Apache Hadoop for six years and counting, it’s personally exciting to see Hadoop entering a new phase of adoption. Hadoop has been heavily used in organizations such as Yahoo!, Facebook, LinkedIn and other large web properties since 2006. Over the past couple of years, we’ve seen a surge in the number of organizations testing Hadoop in proof-of-concept or pilot projects, but it hasn’t yet reached massive adoption in production in the enterprise.

Reaffirming our Commitment to 100% Pure Open Source

I’ve been surprised by a couple of recent articles highlighting our recent leadership change. These articles imply that our business model may be changing. Let me be clear: WE ARE NOT CHANGING OUR BUSINESS MODEL. We are committed to providing training and support for a 100% open source distribution of Apache Hadoop and related projects.

What has changed?

Rob Bearden has agreed to take on the role of CEO. I am moving from CEO to the role of CTO.

Announcing Hortonworks University

One of the common themes that we hear from customers, partners, industry analysts and others in the community is that there is massive need for Apache Hadoop education. The demand for trained and certified Hadoop professionals far exceeds the current supply and this knowledge gap is threatening to slow the rapid adoption of Hadoop. To address this challenge, Hortonworks is pleased to announce Hortonworks University.

Hortonworks University consists of public, private on-site and live online courses for both developers and administrators. Our courses are role-based and consist of both expert content that leverages Hortonworks’ deep domain expertise, and hands-on labs that prepare students for the real-world Hadoop scenarios they will face. The labs in particular set Hortonworks apart from other training offerings in the marketplace. As the creators of much of the Apache Hadoop code, we have an exceptional understanding of the essential Hadoop components. We have spent countless hours applying this expertise to create highly valuable hands-on exercises that enable students to immediately implement what they have learned. We don’t simply “inform” students. We truly “enable” them to be successful.
