Oracle and Hortonworks continue to work on bringing the latest ELT and real-time transactional data streaming capabilities to the Hortonworks Data Platform (HDP). Recently Oracle completed certification testing for HDP 2.2 for both Oracle Data Integrator and Oracle GoldenGate for Big Data, both integral parts of the Oracle Data Integration product portfolio. These releases certified on HDP 2.2 are the latest in the series of advanced Big Data updates and features that Oracle Data Integration is rolling out for customers to help take their Hadoop projects to the next level of enterprise integration.…
The Hortonworks Blog
As businesses continue to create data at an ever-increasing pace, data architectures are strained under the loads placed upon them. Data volumes continue to grow considerably, low-value workloads like ETL consume more and more processing resources, and new types of data can’t easily be captured and put to use. Organizations struggle with escalating costs, increasing complexity, and the challenge of expansion.
This coming Wednesday, Big Data experts will look at how Hadoop is enabling a broad range of organizations to address these challenges.…
The components in a modern data architecture vary from one enterprise to the next and the mix changes over time. Many of our Hortonworks subscribers need support ensuring that their Hortonworks Data Platform (HDP) clusters are optimally configured. This means that they need proactive, intelligent cluster analysis.
Onboarding new workloads to the platform taxes the resources of Hadoop operators. Our customers have therefore asked Hortonworks for guidance and best practices to reduce their operational risk and staff their Hadoop operations efficiently.…
Apache Hadoop has emerged as a critical data platform to deliver business insights hidden in big data. Because Hadoop is a relatively new technology, system administrators hold it to higher security standards. There are several reasons for this scrutiny:
- The external ecosystem of data repositories and operational systems that feed Hadoop deployments is highly dynamic and can introduce new security threats on a regular basis.
- A Hadoop deployment contains a large volume of diverse data stored over longer periods of time.
Hadoop isn’t optional for today’s enterprises—that much is clear. But as companies race to get control over the significantly growing volumes of unstructured data in their organizations, they’ve been less certain about the right way to put Hadoop to work in their environment.
We’ve already seen a variety of wrong approaches with proprietary extensions that limit innovation, fragment architectures and trade openness for vendor lock-in. Now a new consensus is forming around an emerging category that drives truly transformational outcomes: Open Enterprise Hadoop.…
Over the past two quarters, Hortonworks has been able to attract over 200 new customers. We are attempting to feed the hunger our customers have shown for Hadoop over the past two years. We are seeing truly transformational business outcomes delivered through the use of Hadoop across all industries. The most prominent use cases are focused on:
- Data Architecture Optimization – keeping 100% of the data at up to 1/100th of the cost while enriching traditional data warehouse analytics
- A Single View of customers, products, and supply chains
- Predictive Analytics – delivering behavioral insight, preventative maintenance, and resource optimization
- Data Discovery – exploring datasets, uncovering new findings, and operationalizing insights
What we have consistently heard from our customers and partners, as they adopt Hadoop, is that they would like Hortonworks to focus our engineering activities on three key themes: Ease of Use, Enterprise Readiness, and Simplification.…
Sumeet Kumar Agrawal, principal product manager for the Big Data Edition product at Informatica, is our guest blogger. In this blog, Sumeet explains how Informatica’s Big Data Edition integrates with Tez and allows for significant performance gains.
Informatica Big Data Edition’s codeless visual development environment accelerates the ability of enterprises to take advantage of amazing innovations in big data to solve new challenges with skill sets that already exist within many organizations. Informatica natively integrates with big data platforms like Hadoop and NoSQL to enable next-generation big data solutions, including data warehouse optimization and 360 customer analytics.…
Our guest blogger today is Sean Anderson, Manager of Data Service at Rackspace, the managed cloud company. Sean will share with us all the work Rackspace is doing with Hortonworks Data Platform (HDP) for an enterprise-ready Hadoop solution.
Rackspace is excited to be joining the open source data platform community for Hadoop Summit 2015 hosted by Hortonworks and Yahoo. We partnered with Hortonworks in 2013 to build two platforms—one that delivers enterprise-ready Hadoop on-demand in the cloud, and another that delivers customizable and secure dedicated servers backed by fanatical support and expertise.…
Last week, the Apache Slider community released Apache Slider 0.80.0. Although there are many new features in Slider 0.80.0, a few innovations are particularly notable:
- Containerized application onboarding
- Seamless zero-downtime application upgrade
- Adding co-processors to app packages without reinstallation
- Simplified application onboarding without any packaging requirement
Below are some details about these important features. For the complete list of features, improvements, and bug fixes, see the release notes.
Notable Changes: Containerized application onboarding
This release of Apache Slider provides a way to deploy containerized applications on YARN and leverage YARN’s resource management capabilities.…
This is a guest blog post from Jerry Megaro, Merck’s Director of Innovation and Manufacturing Analytics. Jerry established the practice of Data Excellence and Data Sciences within the Merck Manufacturing Division and now leads initiatives to transform Merck Manufacturing into a data-driven organization that enhances the company’s performance across the supply chain.
Hortonworks’ experience working with top pharma manufacturers indicates an exciting opportunity to improve manufacturing performance by proactively managing process variability.…
As we approach Hadoop Summit in San Jose next week, the debate continues over where Hadoop really is on its adoption curve. George Leopold from Datanami was one of the first to beat the hornet’s nest with his article entitled Gartner: Hadoop Adoption ‘Fairly Anemic’. Matt Asay from TechRepublic and Virginia Backaitis from CMSWire volleyed back with Hadoop Numbers Suggest the Best is Yet to Come and Gartner’s Dismal Predictions for Hadoop Could Be Wrong, respectively.…
Today I am excited to announce that we have significantly expanded our operations in Australia in response to growing demand for open enterprise Hadoop there and around the APAC region.
Focused on Sydney but with the ability to execute across Australia, this year we have hired several senior sales and technical staff drawn from industry-leading technology vendors. With this additional experience, we are better able to help customers regionally with their big data needs.…
Not a day passes without someone tweeting or re-tweeting a blog on the virtues of Apache Spark.
At a Memorial Day BBQ, an old friend proclaimed: “Spark is the new rub, just as Java was two decades ago. It’s a developers’ delight.”
Spark, as a distributed data processing and computing platform, offers much of what developers desire and delight in, and much more. To the ETL application developer, Spark offers expressive APIs for transforming data; to data scientists, it offers machine learning libraries through its MLlib component; and to data analysts, it offers SQL capabilities for inquiry.…
Hortonworks and HARMAN are partnering to transform the automotive enterprise by enabling the connected car ecosystem with real-time, Internet of Things (IoT) data, insights and prognostics solutions.
The widespread adoption of connected devices is accelerating. Gartner Research expects 25 billion installed devices by 2020. Together, Hortonworks and HARMAN will offer solutions to help automotive manufacturers gain valuable insights by analyzing real-time information based on data streaming from connected cars.…
Data scientists use data exploration and visualization to help frame the question and fine tune the learning. Apache Zeppelin helps with this.
Based on the concept of an interpreter that can be bound to any language or data processing backend, Zeppelin is a web-based notebook server.…