If there’s one thing my interactions with our customers have taught me, it’s that Apache Hadoop didn’t disrupt the datacenter; the data did. The explosion of new types of data in recent years has put tremendous pressure on the datacenter, both technically and financially, and an architectural shift is underway in which Enterprise Hadoop plays a key role in the resulting modern data architecture.
The Hortonworks Blog
Given the flurry of Apache Software Foundation projects that have emerged in recent years in and around the Apache Hadoop project, a common question I get from mainstream enterprises is: What is the definition of Hadoop?
This question goes beyond the Apache Hadoop project itself, since most folks know that it’s an open source technology born out of the experience of web-scale consumer companies such as Yahoo!, Facebook and others that were confronted with the need to store and process massive quantities of data.…
We love to hear examples from the ecosystem of how organizations are benefiting from Hadoop. Today, Hortonworks partner Microsoft posted a detailed case study on how one of its partners, Ascribe, is using Microsoft’s HDInsight Service, its cloud-based 100% Apache Hadoop service, to transform healthcare in the UK.
Ascribe is a UK-based company focused on solutions for the healthcare industry and was an early adopter of HDInsight, which is built using the Hortonworks Data Platform.…
We are delighted to host this guest blog from John Schitka at SAP.
Join us on March 12 to learn how SAP HANA and Hortonworks Data Platform combine to help you achieve Instant Insight and Infinite Scale — Register Here
Big Data is changing our world – enabling previously impossible insights and transforming the way we do business, work with others, and live our lives. To be competitive, you need to leverage Big Data and the business value it brings.…
Elasticsearch’s engine integrates with Hortonworks Data Platform 2.0 and YARN to provide real-time search and access to information in Hadoop.
See it in action: register for the Hortonworks and Elasticsearch webinar on March 5th 2014 at 10 am PST / 1 pm EST to see the demo and an outline of best practices for integrating Elasticsearch and HDP 2.0 to extract maximum insight from your data. Click here to register for this exciting and informative webinar!…
This is the fifth in our series on modern data architectures across industry verticals. Others in the series are:
- Modern Healthcare Architectures Built with Hadoop
- Modern Manufacturing Architectures Built with Hadoop
- Modern Telecom Architectures Built with Hadoop
- Modern Retail Architectures Built with Hadoop
Consumers have never generated so much data on how they research, discuss and buy products. This new data is valuable for shaping and promoting a brand or product, but it doesn’t line up neatly to fit in pre-defined, tabular formats.…
Hadoop can be a great complement to existing data warehouse platforms, such as Teradata, as it naturally helps to address two key storage challenges:
- Managing large volumes of historical or archival data.
- Handling data from non-standard or unstructured sources.
The purpose of this article is to detail some of the key integration points and to show how data can be easily exchanged for enrichment between the two platforms.
As a data integrator who is familiar with RDBMS systems and is new to the Hadoop platform, I was looking for a simple way (i.e.…
Ever since I was a kid, I’ve used memorable movie quotes to help people understand a key point in a way that lightens the mood and generates some laughs. If you’re going to work hard, you gotta have fun, right???
“Don’t make me angry… you wouldn’t like me when I’m angry”
The big data market is rife with aspirational marketing misinformation, which among other things causes customer confusion, slows the path to value, and frankly, makes me a little angry.…
With the growing number of large-scale enterprise deployments of big data, certain limitations have become more apparent, bringing to light some weaknesses in this first phase of analytics infrastructures. Hadoop, clearly a very valuable tool for the collection of unstructured data, poses some challenges that need to be overcome for widespread, successful enterprise adoption.
In our upcoming webinar on Tuesday Feb 19 at 10 am PT, we will address these issues and highlight how to solve them using Hortonworks Data Platform and our partner Actian.…
We cannot wait to see you at the Santa Clara Convention for the next few days! Hortonworks will be one of the sponsors at the conference and will be presenting in various sessions. If you’re going to be around, attend one (or all) of our sessions and remember to stop by Booth #811. We have a nice schedule lined up for you and we hope you can join us!
Attend our sessions
At this year’s Strata Santa Clara, Hortonworks will also participate in a number of presentations on all things data – don’t miss any of them!…
Microsoft and Hortonworks have been working together for over two years now with the goal of bringing the power of Big Data to a billion people. As a result of that work, today we announced the General Availability of HDP 2.0 for Windows with the full power of YARN.
There are already over half a billion Excel users on this planet.
So, we have put together a short tutorial on the Hortonworks Sandbox that walks through the end-to-end data pipeline using HDP and Microsoft Excel, from the perspective of a data analyst at a financial services firm who:
- Cleans and aggregates 10 years of raw stock tick data from NYSE
- Enriches the data model by looking up additional attributes from Wikipedia
- Creates an interactive visualization on the model
Encryption is applied to electronic information in order to ensure its privacy and confidentiality. Typically, we think of protecting data at rest or in motion. Wire Encryption protects the latter as data moves through Hadoop over RPC, HTTP, Data Transfer Protocol (DTP), and JDBC.
Let’s cover the configuration required to encrypt each of these protocols. For step-by-step instructions, please see the HDP 2.0 documentation.
RPC Encryption
The most common way for a client to interact with a Hadoop cluster is through RPC. …
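As a rough illustration of what such configuration can look like, here is a minimal Python sketch that renders two Hadoop properties as site XML: `hadoop.rpc.protection` set to `privacy` (normally placed in core-site.xml) for RPC, and `dfs.encrypt.data.transfer` set to `true` (normally placed in hdfs-site.xml) for DTP. The helper name `build_site_xml` is hypothetical and exists only for this example, and the sketch assumes the rest of the cluster security setup is already in place. The HDP 2.0 documentation remains the authoritative source for the full steps.

```python
import xml.etree.ElementTree as ET


def build_site_xml(properties):
    """Render a dict of Hadoop properties as a *-site.xml style document.

    Hypothetical helper for illustration; not part of Hadoop or HDP.
    """
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# RPC: "privacy" requests authentication, integrity checks, and encryption.
core_site = build_site_xml({"hadoop.rpc.protection": "privacy"})

# DTP: encrypt block data moving between clients and DataNodes.
hdfs_site = build_site_xml({"dfs.encrypt.data.transfer": "true"})

print(core_site)
print(hdfs_site)
```

Generating the fragments from a dictionary keeps the property names in one place, which may help when the same settings need to be rolled out consistently across nodes.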
Apache Sqoop is a tool that transfers data between the Hadoop ecosystem and enterprise data stores. Sqoop does this by providing methods to transfer data to HDFS or Hive (using HCatalog). Oracle Database is one of the databases supported by Apache Sqoop. With Oracle Database, the database connection credentials are stored in Oracle Wallet. Oracle Wallet can act as the store of keys and secrets such as authentication credentials. This post describes how Oracle Wallet adds a secure authentication layer for Sqoop jobs.…
Security is a top agenda item and a critical requirement for Hadoop projects. Over the years, Hadoop has evolved to address key concerns regarding authentication, authorization, accounting, and data protection natively within a cluster, and there are many secure Hadoop clusters in production. Hadoop is being used securely and successfully today in sensitive financial services applications, private healthcare initiatives and in a range of other security-sensitive environments. As enterprise adoption of Hadoop grows, so do the security concerns, and a roadmap to embrace and incorporate these enterprise security features has emerged.…
In just a few years, interest in Hadoop has enjoyed a meteoric rise. It is everywhere… and it should be available everywhere.
Here at Hortonworks we have worked to provide the widest range of deployment options for Hadoop… from on-premises to the cloud, from Linux to Windows, and from commodity server clusters to high-end appliances. Deployment options are a key factor in the adoption of Hadoop.
Today, we add Ubuntu to the list of options we support for HDP 2.0.…