The Hortonworks Blog

Posts categorized by : Industry Happenings

The following press release was issued by Hortonworks today.

Hortonworks Announces General Availability of Hortonworks Data Platform

Industry’s First Apache Hadoop-based Platform to Include Management, Monitoring and Comprehensive Data Services, Making Hadoop Easy to Consume and Use in Enterprise Environments

Hadoop Summit is just around the corner and by that, I mean next week! There is still time to register for the conference but please do it soon as the conference is filling up quickly. Today is also the last day in which online registration will remain open. After today, you will need to register on-site at the conference itself.

This year’s Hadoop Summit conference, now in its fifth year, promises to be the biggest and best yet.…

Having worked at JBoss and Red Hat from 2004 to 2008 and SpringSource and VMware from 2008 to 2011, I’ve been focused on the world of open source software for a long while. I’ve been blessed to be able to serve enterprise customer needs with high quality open source software such as JBoss Application Server, Hibernate, Drools, Apache Web Server, Apache Tomcat, Spring … and now Apache Hadoop.

As specific open source technologies mature and their use becomes mainstream, it becomes increasingly important to understand and communicate the balancing act that needs to happen between community innovation and enterprise stability.…

Series Introduction

This is part two of a series of blog posts covering new developments in the Hadoop pantheon that enable productivity throughout the lifecycle of big data.  In a series of posts, we’re going to explore the full lifecycle of data in the enterprise: Introducing new data sources to the Hadoop filesystem via ETL, processing this data in data-flows with Pig and Python to expose new and interesting properties, consuming this data as an analyst in HIVE, and discovering and accessing these resources as analysts and application developers using HCatalog and Templeton.…

Series Introduction

This is part one of a series of blog posts covering new developments in the Hadoop pantheon that enable productivity throughout the lifecycle of big data.  In a series of posts, we’re going to explore the full lifecycle of data in the enterprise: Introducing new data sources to the Hadoop filesystem via ETL, processing this data in data-flows with Pig and Python to expose new and interesting properties, consuming this data as an analyst in HIVE, and discovering and accessing these resources as analysts and application developers using HCatalog and Templeton.…

Since joining Hortonworks at the beginning of the year, a question I’ve heard over and over again is “What is Apache Hadoop and what is it used for?”

There’s clearly a lot of hype [and confusion] in this emerging Big Data market, and it feels as if each new technology, as well as existing technologies, are pushing the meme of “all your data are belong to us”. It is great to see the wave of innovation occurring across the landscape of SQL, NoSQL, NewSQL, EDW, MPP DBMS, Data Marts, and Apache Hadoop (to name just a few), but enterprises and the market in general can use a healthy dose of clarity on just how to use and interconnect these various technologies in ways that benefit the business.…

I attended the Goldman Sachs Cloud Conference and participated on a panel focused on “Data: The New Competitive Advantage”. The panel covered a wide range of questions, but kicked off covering two basic questions:

“What is Big Data?” and “What are the drivers behind the Big Data market?”

While most definitions of Big Data focus on the new forms of unstructured data flowing through businesses with new levels of “volume, velocity, variety, and complexity”, I tend to answer the question using a simple equation:

Big Data = Transactions + Interactions + Observations

The following graphic illustrates what I mean:

In case you didn’t see the news today, Hadoop Summit announced record ecosystem support for this year’s conference. The original and world’s largest Apache Hadoop conference, now in its fifth year, is being sponsored this year by more than 40 traditional and open source software and services companies.

Hortonworks and our co-host Yahoo! would like to thank the following companies for helping to make Hadoop Summit possible:

As part of Big Data Week, Dan Harvey of the London Hadoop User Group organised an afternoon session for the usergroup, which we were glad to sponsor, along with Canonical and Facegroup. I had the pleasure of presenting my view of the current and future status of Apache Hadoop to an audience that ranged from those curious about Hadoop to heavy users.

Every talk of the day was excellent, from the use cases by Datasift, Mendeley and MusicMetric, to the talk by Francine Bennett of MastodonC on the CO2 footprint of different cloud computing infrastructures, including a live dashboard on the current CO2/hour of many cloud infrastructure sites.…

We are pleased to support today’s announcement from Citrix that they have contributed CloudStack to the Apache community. For those new to CloudStack, it is an open source cloud computing software that helps organizations build and manage cloud infrastructures. It is similar to Amazon Web Services EC2 environment except that it enables organizations to build public, private or hybrid cloud environments using their own pooled computing resources.

Citrix announced today that they were reaffirming their commitment to open source by working with the Apache Software Foundation to make CloudStack 3 an Apache project, released under Apache Software License 2.0.…

I’m pleased to announce the first in a series of videos featuring Hortonworks founders and executives sharing their thoughts on how Apache Hadoop is being extended to become the next generation enterprise data platform. Over the coming weeks and months, you will be hearing from folks such as Matt Foley, Arun Murthy, Sanjay Radia and Alan Gates, just to name a few.

The first video features Shaun Connolly, Hortonworks VP of Corporate Strategy, talking about the Hortonworks vision for Apache Hadoop.…

Thank you to the community members that cast over 8,000 votes during the Hadoop Summit Community Choice voting process. The turnout far exceeded our expectations and is further evidence that the momentum behind Apache Hadoop has never been stronger.

As we announced, the sessions with the most votes in each track are automatically accepted into the Hadoop Summit agenda. As such, I am pleased to announce the winners of the Hadoop Summit Community Choice vote and the first confirmed sessions in the Hadoop Summit program:

Future of Apache Hadoop track: Dynamic Namespace Partitioning with Giraffa File System, Konstantin Shvachko (eBay)

Deployment and Operations track: Dynamic Reconfiguration of Apache Zookeeper, Alexander Shraer and Benjamin Reed (Yahoo!)

Enterprise Data Architecture track: iMStor: Hadoop Storage-based Tiering Platform, Vishal Malik (Cognizant Technology Solutions)

Applications and Data Science track: Hadoop & Cloud @Netflix: Taming the Social Data Firehose, Mohammad Sabah (Netflix)

Analytics and Business Intelligence track: Mapping and Reducing Passenger Turbulence using Big Data, Farhan Hussain and Saad Patel (Open Source Architect)

Hadoop in Action track: The Merchant Lookup Service at Intuit, Vrushali Channapattan (Intuit)…

As I first mentioned when we announced Hadoop Summit 2012, we are focused on making Hadoop Summit the preeminent conference for the Apache Hadoop community. Today I’m pleased to tell you about Community Choice, a public online voting system that enables the entire Apache Hadoop community to have a say in the sessions chosen for Hadoop Summit. Anybody can vote and the top vote getters in each track will automatically be included in the Hadoop Summit agenda.…

Today we announced an important strategic partnership with Talend, provider of the world’s most popular open source data integration platform. This is another win for both Hortonworks customers and the larger Apache Hadoop community. There were two key aspects of the announcement that I wanted to highlight:

Talend releases Talend Open Studio for Big Data

Based upon Talend’s very popular open source data integration platform, Talend Open Studio for Big Data adds connectors for HDFS, HBase, Pig, Sqoop and Hive.…

Today we announced  that we were delivering on our earlier promise to help Microsoft bring Apache Hadoop to Windows. I’m pleased to share that Microsoft, with our collaboration and guidance, has now submitted a series of patches to Apache aimed at overcoming the challenges of running Apache Hadoop in Windows Server environments.

These patches, once vetted and approved by the community, will become part of the core Hadoop code base. They will also become available in the two major Apache Hadoop branches: hadoop-1.0 (the current stable branch, which is available as part of Hortonworks Data Platform v1.0) and hadoop-0.23 (the next generation of Apache Hadoop, which will be available as part of Hortonworks Data Platform v2.0).…

Go to page:« First...1011121314

Thank you for subscribing!