The Hortonworks Blog

This guest post is from Sofia Parfenovich, Data Scientist at Altoros Systems, a big data specialist and a Hortonworks System Integrator partner. Sofia explains how she optimized a customer’s trading solution by using Hadoop (Hortonworks Data Platform) and by clustering stock data.

Automated trading solutions are widely used by investors, banks, funds, and other stock market players. These systems are based on complex mathematical algorithms and can take into account hundreds of factors.…

If you’re a Microsoft developer stepping into Hadoop for the first time with HDP for Windows, we’d like to highlight a fantastic resource from Rob Kerr, Chris Campbell and Garrett Edmondson: the MSBIAcademy.

They’ve produced a high-quality, practical series of videos covering everything from essential MapReduce concepts, to using .NET (in this case C#) to submit MapReduce jobs to HDInsight, to using Apache Pig for Web Log Analysis.…

In this blog we’ll set up NFS for HDFS access with the Hortonworks Sandbox 1.3. This allows files to be read from and written to Hadoop using methods familiar to desktop users. Sandbox is a great way to understand this particular type of access.

If you don’t have it already, then download the sandbox here. Got the download? Then let’s get started.

Start the Sandbox. Get to this screen.

We will now enable Ambari so that we can edit the configuration to enable NFS.…

Extracting insight from your machines, from customer sentiment data, or from any number of other big data scenarios demands integrating Hadoop into your data architecture so that it can efficiently handle those new opportunities alongside existing workloads. Over the next few months, we’re hosting a new webinar series with partners to get to grips with what it means to integrate Hadoop into your data architecture.

The first three webinars in the series are listed below and ready for registration.…

By now, your Hadoop skills are becoming honed thanks to the effort you’ve put in, and we hope the Hadoop tutorials in the Hortonworks Sandbox have been helping you along the way. Today, we’re taking the next step in our quest to help you learn more about Hadoop: introducing the Hortonworks Sandbox Partner Tutorials.

The gallery extends the Sandbox, and in there you’ll find tutorials, demos and information on how to use and experiment with tools and applications from our partners – all part of real-world use of Hadoop.…

Today we released the Hortonworks Data Platform 1.3 for Windows, supporting Windows Server 2008 R2 and 2012. This is an exciting major update to the only Enterprise Hadoop distribution on Windows. In this blog post, I will discuss what’s new and how to get started.

Enabling new data applications

This release brings component parity to the HDP Stack across all operating systems by adding the following components:

  • Apache HBase (0.94.6.1) is a non-relational (NoSQL) database that runs on top of the Hadoop® Distributed File System (HDFS).

Today we are delighted to announce the release of Hortonworks Data Platform v1.3 for Windows. With this release, our HDP distributions for Hadoop have reached parity, enabling seamless application portability across Linux and Windows platforms.

Hadoop represents the future of the enterprise data platform, and we have made it our mission to deliver Hadoop as far and wide as possible: from Linux to Windows, from the Enterprise Data Center to the cloud. We’re very proud of this latest product release as we deliver on that mission.…

Thanks to all who joined us for last week’s webinar on Apache Hadoop YARN: Enabling Next Generation Data Applications. You can listen to the full webinar replay here, and the slides are embedded below.

If you’re already diving into YARN, we will be hosting the first ‘Office Hours’ sessions at Hortonworks HQ. Join us on August 15th for a Deep Dive on Hoya (HBase on YARN).

Office hours will give you a chance to talk with the Hortonworks developers deeply involved with the YARN and Hoya projects, as well as with your peers just launching their YARN projects.…

Implementing and integrating Hadoop to complement existing EDW, RDBMS and Discovery Systems is all part of realizing a Modern Data Architecture, one that unlocks the opportunities big data provides for new insight and competitive edge.

That is why we were excited to take part in Cisco and NetApp’s joint announcement of their FlexPod Portfolio because it brings new engineered offerings to the market for enterprises looking to take advantage of Hadoop.…

The Hortonworks Sandbox is a great tool not only for learning Hadoop, but also for experimentation and application development. Deployment in a type 2 hypervisor such as Oracle VirtualBox or VMware Workstation is straightforward and serves the need for a single user. Sandbox can also be deployed to IaaS environments, and in this case, we walk through the steps of deploying Hortonworks Sandbox on OpenStack. For the purposes of this article, the author has used the OpenStack Grizzly release running QEMU-KVM as the underlying hypervisor.…

In the last Hoya article, we talked about its Application Architecture. Now let’s talk persistence. A key use case for Hoya is supporting long-lived clusters that can be started and stopped on demand. This lets a user start and stop an HBase cluster when they want, only using CPU and memory resources when they actually need them. For example, a specific MR job could use a private HBase instance as part of its join operations, or for an intermediate store of results in a workflow.…

At Hadoop Summit in June, we introduced a little project we’re working on: Hoya (HBase on YARN). Since then the code has been reworked and is now up on GitHub. It’s still very raw, and requires some local builds of bits of Hadoop and HBase, but it is there for the interested.

In this article we’re going to look at the architecture, and a bit of the implementation.

We’re not going to look at YARN in this article; for that we have a dedicated section of the Hortonworks site, including sample chapters of Arun Murthy’s forthcoming book.…

If you’re considering the WHY, the HOW and the WHAT of Hadoop and Big Data in your business, then this collection of papers and ebooks is your friend.

  • WHY does Hadoop matter? Our eBook “Disruptive Possibilities of Big Data” paints a picture of the future of the data-driven business and how it changes everything.
  • HOW does Hadoop work in my data architecture? As part of a modern data architecture, Hadoop sits alongside existing infrastructure and augments its capabilities through Refining and Exploring big datasets and ultimately enriching the application and customer experiences for your business.

I’d like to share some thoughts on the recent news that Eric Baldeschwieler has decided to leave Hortonworks. I’d like to start off first by thanking Eric for his contributions to the Hadoop community since its inception over 7 years ago, and I’d like to express my personal appreciation for his help in getting Hortonworks off the ground.

It’s hard to believe it’s been over two years since Hortonworks was founded by over 20 engineers from the original Yahoo!…

After the break in the glorious hot weather we want to banish the rain and thunderstorms and bring back a lazy sunny London, so a few of us decided that it was time to hold the first “Big Data Lunch in the Park” summertime meet-up.

Register here at http://bigdatalunchinthepark.eventbrite.com 

Grab your lunch, divert your phone to your mobile and join us on the 8th August at noon at Green Park and hang out with some of your fellow Big Data enthusiasts.  …
