Hadoop Ecosystem

Industry news, partner stories, buzz and happenings

There are plenty of server and storage options for the wave of data that is being collected and analyzed.  New platforms such as Apache™ Hadoop® provide the opportunity to make all the new data types being collected useful.  However, like any other platform, performance varies depending on the underlying servers being used.  There is great promise in what Hadoop can deliver in terms of business value, and the ecosystem is continuously growing with companies making strides to make Hadoop easier to deploy and manage.…

This week we’re at the Red Hat Summit along with many others enjoying the great discussions within the community. As part of the summit, we are delighted to announce extended collaboration with Red Hat to continue to advance open source big data community projects.

Some details on the three areas of collaboration forming the announcement:

  • Enhancing Apache Ambari to support the management of Hadoop-compatible file systems, such as GlusterFS. With this integration, users will be able to provision, deploy, monitor and manage alternative file systems with Ambari, further cementing Ambari’s position as the standard for Hadoop management.

Talend Open Studio for Big Data provides an intuitive set of tools that make dealing with data in the Hadoop world (and Hortonworks Data Platform in particular) a lot easier. We often use the tools to speed delivery of a proof of concept or to operationalize movement of data from sources like web logs and machine sensors into HDFS. It is simple to use and typically takes only minutes to perform something that once took hours in a script.…

Hadoop Summit 2013 in San Jose is approaching quickly, and in just a few weeks attendees will have the opportunity to learn about all of the up-and-coming advances in the world of Apache Hadoop and Big Data. You can still register here!

Here are ten great reasons to pencil “Hadoop Summit 2013” into your calendar:

  • Informative and exciting keynotes
    Keynotes will be given by Jer Thorpe, an artist and educator known for exploring the many-folded boundaries between science, data, art and culture, and Merv Adrian, VP of research at Gartner, who follows database, big data, NoSQL and adjacent technologies.…

    The Hadoop goodness just keeps on flowing as we’ve delivered new releases and new content in the past 10 days. Let’s recap.

    HDP 1.3 Release. This milestone release takes advantage of improved performance in Hive 0.11 along with delivery on a series of enterprise requirements including NFS access to HDFS, improved MTTR for HBase, business continuity through HDFS and HBase snapshots, optimized connectors to Oracle and Netezza and the latest release of Ambari for management and operations.…

    HDP 1.3 release delivers on community-driven innovation in Hadoop with SQL-IN-Hadoop, and continued ease of enterprise integration and business continuity features.

    Almost one year ago (50 weeks to be exact) we released Hortonworks Data Platform 1.0, the first 100% open source Hadoop platform into the marketplace.  The past year has been dynamic to say the least!  However, one thing has remained constant: the steady, predictable cadence of HDP releases.  In September 2012 we released 1.1, this February gave us 1.2 and today we’re delighted to release HDP 1.3.…

    One goal of the Hortonworks Sandbox is to showcase end-to-end use cases for Hadoop. In the most current release of Hadoop tutorials, you’ll find 2 specific use cases highlighted, both built around clickstream data. There are 6 new tutorials for you to walk through: Tutorials 6 – 11.

    (Update: if your version of Sandbox does not have “Enable Ambari” on the introductory page, you will need to download the latest version of the Sandbox in order to have access to these tutorials.)

    Clickstream Analysis – Website User Behavior

     

    Tutorials 6-10 are extensive, step-by-step lessons to walk you through the process to connect the Sandbox to Excel 2013 via the Hortonworks ODBC driver to access and analyze semi-structured data (like Omniture logs).…

    We are excited to release the Hortonworks Data Platform 1.1 for Windows as a Generally Available product. In this blog post, I’m going to outline how to get started with HDP 1.1 for Windows.

    With HDP for Windows, you can deploy Apache Hadoop and the HDP stack of components natively on a Windows Server cluster. The HDP for Windows download includes an MSI and remote installation scripts. With these artifacts, you can set up a multi-node Hadoop cluster in either a Workgroup or Active Directory Domain networking configuration.…

    Today we announced a strategic alliance with operational intelligence leader Splunk. We are excited to be strengthening our relationship with Splunk and expanding the Apache Hadoop ecosystem, and we expect this to further drive open source innovation. Additionally, this alliance is further proof of Hadoop’s maturation as a key component of the next generation enterprise architecture.

    One of the key benefits of the partnership is that it enables organizations to take advantage of the massive scale-out storage and processing capabilities of Apache Hadoop with Splunk Enterprise via Splunk Hadoop Connect, which easily and reliably moves data between Splunk Enterprise and Hadoop.…

    Today we are very excited to announce that Hortonworks Data Platform for Windows (HDP for Windows) is now generally available and ready to support the most demanding production workloads.

    We have been blown away by the number and size of organizations that have downloaded the beta bits of this 100% open source, native-to-Windows distribution of Hadoop and engaged Hortonworks and Microsoft around evolving their data architecture to respond to the challenges of enterprise big data.…

    UPDATED 6/12: To include the Falcon meetup.

    UPDATED: To include the Oozie meetup.

    The main Hadoop Summit agenda is looking awesome – go take a look here, and register here - but there’s also a series of meetups planned for the day before the general sessions. If you want to get up close and personal on topics of interest to you with other like-minded folk then take a look at these options.…

    And we are just about done with this week. But not quite – dig into the conversation from the past few days.

    Hadoop Summit. We published the vast majority of sessions (70 so far) for the Hadoop Summit in San Jose, 26-27 June. The sessions stretch across 7 tracks from Architecture to Economics and we hope you can join us for THE Hadoop community event of the year. You can register here, and the schedule is here.…

    Some news from the UK as Yahoo! Hack Europe welcomed Hortonworks this past weekend in central London. This two-day event sponsored by Yahoo! was focused on celebrating collaboration, learning and innovation using the world’s leading technologies. Chris Harris, our local EMEA Solution Engineer, was on hand to add to the discussions. Partnering with Microsoft, we were able to showcase our HDP on the Azure platform. This was a fantastic opportunity for the 350 delegates to be exposed to both Azure and enterprise ready Hadoop provided as HDInsight Service.…

    Now is the time to get registered for the Hadoop Summit in San Jose, 26-27 June, 2013 – we’d love to see you there. A few weeks ago, we revealed the selectees from the community choice voting, and we’re now delighted to announce the full schedule of sessions is available here.

    Session Schedule

    Our thanks to the track selection committees and track chairs for the work on building a great schedule for an awesome event.…

    There is a lot to consider when you deploy, configure, manage and scale Hadoop clusters in a way that optimizes performance and resource utilization. Here are 6 key things to think about as part of your planning:

  • Operating system: Using a 64-bit operating system helps to avoid constraining the amount of memory that can be used on worker nodes. For example, 64-bit Red Hat Enterprise Linux 6.1 or greater is often preferred, due to better ecosystem support and more comprehensive functionality for components such as RAID controllers.…