The Hortonworks Blog

Posts categorized by: Industry Happenings

Nothing happens in a vacuum anymore. Cities now have the ability to use information collected from a massive variety of sources in order to help solve common city problems. The information can arise from anywhere – tweets, blog posts, and meter readings can all serve to inform public officials (and citizens as a whole) about how to better interact in a data-drenched world.

Most famously, IBM’s Smart Cities initiative looks at how city governments meet the needs of their expanding populations by using available resources more efficiently. …

Earlier, in the “Big Data in Genomics and Cancer Treatment” blog post, I explored how the extensive amount of information in DNA analysis mostly comes from the vast array of characteristics associated with people’s DNA makeup and with different cancer variations. The case with today’s healthcare is very similar. Each patient is unique and has thorough medical history records that allow doctors to make evaluations and recommendations for future treatments.…

Small companies, big data.

Big data is sometimes at odds with the business-savvy entrepreneur who wants to exploit its full potential. In essence, the business potential of big data is the massive (but promising) elephant in the room that remains invisible because the talent necessary to take full advantage of the technology is difficult to obtain.

Inventing new technology for the platform is critical, but so too is making it easier to use.…

Last week was an important milestone for Hortonworks: our one-year anniversary. Given all of the activity around Apache Hadoop and Hortonworks, it’s hard to believe it’s only been one year. In honor of our birthday, I thought I would look back to contrast our original intentions with what we delivered over the past year.

Hortonworks was officially announced at Hadoop Summit 2011. At that time, I published a blog on the Hortonworks Manifesto.…

The following is Part 2 of 2 on data in education. The first article introduces the concept and application of data in education. The second article looks at recent movements by the Department of Education in data mining, modeling and learning systems.

Big data analytics are coming to public education. In 2012, the US Department of Education (DOE) was one of several agencies sharing in a $200 million initiative to begin applying big data analytics to their respective functions.…

Big Data Shopping Bag

With big data basking in the limelight, it is no surprise that large retailers have been closely watching its development… and more power to them! By learning to use big data effectively, retailers can shape the market to their advantage, becoming more competitive and more likely to come out on top. Now that there are open source analytical platforms like Hadoop, which allow unstructured data to be transformed and organized, large retailers are able to make smart business decisions using the information they collect about customers’ habits, preferences, and needs.…

We’re heading to our very first OSCON, the biggest gathering for the entire open source community, in Portland, Oregon, to talk all things Apache Hadoop, and we would love to meet you there!

Meet our founders, Arun Murthy and Mahadev Konar, along with others from the Hortonworks team at this year’s conference.

There are many ways to meet the Hortonworks team and we would love to chat with you about how you are considering using Hadoop.…

Working code examples for this post (for both Pig 0.10 and ElasticSearch 0.18.6) are available here.

ElasticSearch makes search simple. ElasticSearch is built over Lucene and provides a simple but rich JSON over HTTP query interface to search clusters of one or one hundred machines. You can get started with ElasticSearch in five minutes, and it can scale to support heavy loads in the enterprise. ElasticSearch has a Whirr Recipe, and there is even a Platform-as-a-Service provider, Bonsai.io.…

The following is Part 1 of 2 on data in education.  The first article introduces the concepts of how data is used in education.  The second article looks at recent movements by the Department of Education in data mining, modeling and learning systems.

Learning to Learn

The education industry is transforming into a 21st-century data-driven enterprise. Metrics-based assessment has been a powerful force that has swept the national education community in response to widespread policy reform. …

What lessons might the anime (Japanese animation) “Ghost in the Shell” teach us about the future of big data? The show, originally a graphic novel from creator Masamune Shirow, explores the consequences of a “hyper”-connected society so advanced one is able to download one’s consciousness temporarily into human-like android shells (hence the work’s title). If this sounds familiar, it’s because Ghost in the Shell was a major point of inspiration for the Wachowski brothers, the creators of the Matrix Trilogy.…

I wanted to take this opportunity to say thanks to the more than 2,200 attendees, speakers and sponsors who helped to make Hadoop Summit 2012 a great success. There was tremendous buzz throughout the conference, exceeding the excitement levels of all past Hadoop conferences. It’s a great indicator for the future of Apache Hadoop and the broader big data ecosystem.

The content from this conference was outstanding, from the opening keynotes to the last round of breakout sessions.…

What’s possible with all this data?

Data Integration is a key component of the Hadoop solution architecture. It is the first obstacle encountered once your cluster is up and running. Ok, I have a cluster… now what? Do I write a script to move the data? What is the language? Isn’t this just ETL with HDFS as another target? Well, yes…

Sure you can write custom scripts to perform a load, but that is hardly repeatable and not viable in the long term.…

Weather Hurts

Catastrophic weather events like the historic 2011 floods in Pakistan or prolonged droughts in the Horn of Africa make living conditions unspeakably harsh for tens of millions of families living in these affected areas. In the US, the winter storms of 2009-2010 and 2010-2011 brought record-setting snowfall, forcing mighty metropolises into an icy standstill. Extreme weather can profoundly affect humankind.

The effects of extreme weather can send terrible ripples throughout an entire community. …

Big data. These are two words the world has been hearing a lot lately, in relation to a wide array of use cases in social media, government regulation, auto insurance, retail targeting, etc. The list goes on. However, a very important topic that should receive the same (if not more) recognition is the presence of big data in human genome research.

Three billion base pairs make up the DNA present in humans.…

By any measure, last week’s Hadoop Summit was a tremendous success. It brought together more than 2,200 people from throughout the Apache Hadoop ecosystem to share Hadoop knowledge, ideas, best practices, and interesting use cases. It was also a great chance for big data vendors to make announcements and demonstrate new and exciting solutions.

For those of you who missed the conference, or missed a particularly interesting presentation, we have some good news.…
