Hadoop Ecosystem

Industry news, partner stories, buzz and happenings

Do you like looking for a needle in a field of haystacks? Do I have a job for you: security operations center (SOC) analyst. You will spend your days looking at hundreds of thousands of alerts – created by rules engines – of which only a few each week actually matter. Your job is to manually review all of them, filtering out the noise to find the few that matter. Yes, it will take hours to review each one and there won’t be enough time in the day to review them all, but what can you do?…

Hello everyone and welcome to the start of my blogging adventure. I’m Mike Schiebel, Cybersecurity Strategist at Hortonworks, where I focus on bringing enterprise-level security features to the Hadoop ecosystem and providing input to the Apache Metron open source project. I figured introductions are in order to explain the where and why behind my blog series.

Who am I?

I’ve taken a long and twisting road before ending up at Hortonworks.…

In September, Hortonworks partnered with ManTech and B23 to foster a vibrant open community to accelerate the development of OpenSOC. In December, we also partnered with Rackspace Managed Security and submitted OpenSOC to the Apache Incubator as a podling under the name Apache Metron. The project was renamed to reflect its new direction and new community. The process of graduating Metron to a top-level project (TLP) has now begun.…

An interesting and atypical thing is happening in healthcare. Leading data-driven organizations are not simply sharing their Hadoop experiences, successes, use cases, and best practices; more than ever before, they are embracing the opportunity to share them outside their own organizations, in a style that resembles the open source community on which Hadoop was built.

It all started at Hadoop Summit on June 10, 2015, when a simple breakfast meeting was organized to showcase the experiences of a couple of healthcare’s earliest adopters of Hadoop.…

It’s our pleasure to host Ryan Peterson, Chief Solution Strategist at EMC, as a guest blogger to expand upon another great step in our partnership to deliver compelling customer solutions through joint engineering efforts.  Follow Ryan @BigDataRyan.

Object storage isn’t a new concept, and EMC has been innovating around it since the beginning. Take our Centera and Atmos products as key examples. The first Centera was built around the idea that objects could store far more data than a file system in a single store; the other aspect of Centera was a rich set of security and compliance features that file systems had not been able to achieve.…

Deploying a lock-in-free data platform is critical for an enterprise. By this, we mean using non-proprietary code and implementing interoperability to eliminate the risk of being dependent on a single vendor for your current or future needs.

Over two-thirds of respondents to our survey agreed that maintaining freedom of choice was a key criterion when selecting the Hortonworks Data Platform. (Source: TechValidate TVID 4A8-731-250.)

They didn’t want to be limited to what one vendor could offer – they wanted platform portability, industry-wide standards and a choice of third-party application support from a broader ecosystem.…

Customers who are dealing with massive growth of large unstructured data sources need to evaluate their storage architecture for scalability, flexibility and ease of management. In a recent study performed by the Enterprise Strategy Group, IT decision makers were asked to identify their biggest storage challenges. In addition to rapid data growth, they pointed to hardware costs, data protection and staffing costs. Object-based storage solutions aim to address a number of these challenges, and EMC’s answer is Elastic Cloud Storage (ECS).…

“It’s all about Hortonworks company vision, 100% open source and enterprise support.” Source: TechValidate TVID 8A7-EFF-21C

Hortonworks’ customer experience survey shows that our community innovation strategy is validated by our customers: more than two-thirds of those who responded said they value community innovation.

Hortonworks has been dedicated to 100% open community development since the very beginning because this strategy maximizes the value we can bring to our customers.…

The modern enterprise requires a comprehensive end-to-end data management solution capable of leveraging advanced machine learning to identify and manage risk, as well as a repository capable of capturing and processing the data necessary to support this solution.

Now more than ever, organizations are subject to privacy and data security laws, and complying with these regulations is exceptionally challenging given the complexity of the data that enterprises now have to manage. Yet one only needs to pick up a newspaper to read about the dire consequences when companies fail to put proper safeguards in place to comply with those laws.…

Our guest blogger today is Nyla Beth Gawel, Manager of Booz Allen Hamilton’s Internet of Things Practice. Booz Allen Hamilton, one of our strategic System Integrators, describes how they help customers with their IoT analytics using Hortonworks DataFlow (HDF), powered by Apache NiFi.

Consumer Internet of Things (IoT) has taken off in forms ranging from wearable technology to smart home devices to remote patient care. On the other hand, Enterprise IoT is still struggling to realize the promise of the connected workplace.…

Our guest blogger is Bob Taylor, Alliances Director at Concurrent, a Hortonworks Technology Partner. In this blog, Bob describes three factors that contributed to HomeAway’s success in its big data initiative and that apply to all projects. HomeAway is a customer of Hortonworks and Concurrent.

HomeAway is a great example of an organization that has found value in its Big Data investment because of three factors. One HomeAway initiative gathers customer preference data from dozens of websites and uses it to refine their marketing and, in turn, increase bookings.…

Earning the prestigious Teradata EPIC award is no easy feat. Partners who would like to have a shot at winning the top recognition need to demonstrate how their solution provides a unified, high-performance big data analytics system for an enterprise and show measurable return on investment. After receiving Teradata’s EPIC award recognition for Big Data Intelligence in 2013 and 2014, Hortonworks, yet again, has been recognized as the leader by winning this award for the third year in a row.…

Apache Spark’s momentum continues to grow and throughout 2015 we saw customers across all industries get real value from using it with the Hortonworks Data Platform (HDP). Examples include:

Insurance: Optimize their claims reimbursement process by using Spark’s machine learning capabilities to process and analyze all claims.
Healthcare: Build a Patient Care System using Spark Core, Streaming and SQL.
Retail: Use Spark to analyze point-of-sale data and coupon usage.
Internet: Use Spark’s ML capability to identify fake profiles and enhance the product matches that they show their customers.…

Is a Lake Big Enough to House Your Ocean of Data?

Contrary to popular belief, Hadoop was not the elephant-in-the-china-shop that marauded and disrupted the data center. The real culprit is data and how it has exploded in volume. The past two or three years have seen a rise in the number of successful Hadoop projects in enterprises to tackle this explosion of big data. These large volumes of data, the emergence of Hadoop technology and the need to store all the siloed data in one place have given rise to the phenomenon called the Data Lake among enterprises.…

Our guest blogger today, Rob Rosen, Senior Director Partner Solutions at Platfora, describes how to help customers achieve strategic advantage through data discovery.

While many people have heard the notion of “known unknowns” and “unknown unknowns,” it may surprise you to discover that the concept was first popularized by a NASA scientist. In a presentation given at TEDx GeorgeMasonU, Dr. Kirk Borne described how he used the concept of “known unknowns” (things that we knew might exist, but hadn’t seen evidence of) and “unknown unknowns” (things that we could discover and knew nothing about, but would truly surprise us), and how they relate to the concept of Big Data.…