An Enterprise Data Warehouse (EDW) is an organization’s central data repository, built to support business decisions. An EDW contains data related to the areas that the company wants to analyze. For a manufacturer, that might be customer, product or bill of material data. An EDW is built by extracting data from a number of operational systems. As the data is fed into the EDW, it is converted, reformatted and summarized to present a single corporate view. Data is added to the warehouse over time in the form of snapshots, and an enterprise data warehouse normally contains data spanning 5 to 10 years. A Hadoop data warehouse architecture enables deeper analytics and advanced reporting from these diverse sets of data.
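The convert, reformat and summarize steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two source systems, their field names and the sample rows are hypothetical and do not come from any particular EDW product.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical extracts from two operational systems with different formats.
ORDERS_SYSTEM_A = [
    {"cust": "C-001", "date": "2017-03-14", "amount_usd": 120.0},
    {"cust": "C-002", "date": "2017-03-20", "amount_usd": 80.0},
]
ORDERS_SYSTEM_B = [
    {"customer_id": "c001", "order_date": "03/28/2017", "amount_cents": 4550},
]

def normalize_customer(raw_id):
    """Convert ids such as 'c001' or 'C-001' to one canonical form, 'C-001'."""
    digits = "".join(ch for ch in raw_id if ch.isdigit())
    return f"C-{int(digits):03d}"

def transform_a(row):
    """Reformat a system A row to (customer, month, amount in cents)."""
    month = datetime.strptime(row["date"], "%Y-%m-%d").strftime("%Y-%m")
    return normalize_customer(row["cust"]), month, round(row["amount_usd"] * 100)

def transform_b(row):
    """Reformat a system B row to (customer, month, amount in cents)."""
    month = datetime.strptime(row["order_date"], "%m/%d/%Y").strftime("%Y-%m")
    return normalize_customer(row["customer_id"]), month, row["amount_cents"]

def build_monthly_view(rows_a, rows_b):
    """Summarize both sources into one (customer, month) -> total cents view."""
    view = defaultdict(int)
    for customer, month, cents in map(transform_a, rows_a):
        view[(customer, month)] += cents
    for customer, month, cents in map(transform_b, rows_b):
        view[(customer, month)] += cents
    return dict(view)

monthly_view = build_monthly_view(ORDERS_SYSTEM_A, ORDERS_SYSTEM_B)
```

Here the same customer appears under two id formats and two date formats, and the summarized view holds a single monthly total per customer. A production ETL flow would add validation, error handling and incremental snapshot loading.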
Problems with a typical EDW
The Enterprise Data Warehouse has become a standard component of corporate data architecture. However, the complexity and volume of data have posed some interesting challenges to the efficiency of existing EDW solutions.
Realizing the transformative potential of Big Data depends on a corporation’s ability to manage complexity while leveraging data sources of all types, such as social, web and IoT data. Integrating new data sources into the existing EDW system gives corporations access to more and deeper analytics and insights. More importantly, EDW optimization using Hadoop provides a highly cost-efficient environment with optimal performance, scalability and flexibility.
Hortonworks Data Platform
Powerful open Hadoop data warehouse architecture with capabilities for data governance and integration, data management, data access, security and operations, designed for deep integration with your existing data center technology.
EDW offload to Hadoop - High-performance ETL software to access and easily onboard traditional enterprise data to HDP.
High-performance analytical engine for interactive BI on Hadoop Big Data.
Expert guidance and support to quickly prove the value of your new architecture and maximize the value of the fully tested and validated Hortonworks data architecture optimization solution.
EDW optimization with Apache Hadoop ®
Data can be loaded in HDP without having a data model in place
Data model can be applied based on the questions being asked of data (schema-on-read)
HDP is designed to answer questions as they occur to the user
100% of the data is available at granular level for analysis
HDP can store and analyze both structured and unstructured data
Data can be analyzed in different ways to support diverse use cases
HDP (Hortonworks Data Platform) is 100% open - there is no licensing fee for software
HDP runs on commodity hardware
New data can be landed in HDP and used in days or even hours
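The schema-on-read idea in the list above can be illustrated with a minimal Python sketch. It uses only the standard library rather than any HDP component, and the sample records, field names and helper functions are hypothetical. The raw lines are stored untouched, and a schema is chosen only when a question is asked.

```python
import json

# Raw events landed as-is, with no upfront data model (hypothetical sample data).
RAW_LINES = [
    '{"customer": "acme", "product": "widget", "qty": "3", "channel": "web"}',
    '{"customer": "acme", "product": "gear", "qty": "2"}',
    '{"customer": "globex", "product": "widget", "qty": "5", "channel": "social"}',
]

def read_with_schema(lines, schema):
    """Apply a schema (field name -> converter) while reading, not while loading."""
    for line in lines:
        record = json.loads(line)
        # Missing fields become None instead of failing the load.
        yield {
            field: (convert(record[field]) if field in record else None)
            for field, convert in schema.items()
        }

def units_by(lines, key):
    """Sum quantities grouped by one field chosen at query time."""
    totals = {}
    for row in read_with_schema(lines, {key: str, "qty": int}):
        totals[row[key]] = totals.get(row[key], 0) + row["qty"]
    return totals

# Two different questions asked of the same raw data.
by_product = units_by(RAW_LINES, "product")
by_channel = units_by(RAW_LINES, "channel")
```

The second record has no channel field, yet it still loads and is grouped under None. With a schema enforced at write time, that record would typically have been rejected or would have forced a model change before loading.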
Use-Cases on EDW Optimization
Onboard ETL processes to Hadoop
A typical EDW spends between 45 and 65 percent of its CPU cycles on ETL processing. These lower-value ETL jobs compete for resources with more business-critical workloads and can cause SLA misses. Hadoop can offload these ETL jobs from the EDW with minimal porting effort and at substantially lower cost, saving money and freeing up EDW capacity for higher-value analytical workloads. Hortonworks makes it easy by providing high-performance ETL tools, a powerful SQL engine and integration with all major BI vendors.
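As a rough back-of-the-envelope sketch, the capacity freed by offloading can be estimated from the 45 to 65 percent range quoted above. The daily budget of 1,000 CPU hours and the function name are hypothetical, and the calculation assumes that offloaded work no longer consumes any EDW cycles.

```python
def freed_cpu_hours(total_cpu_hours, etl_percent, offloaded_percent=100):
    """CPU hours returned to the EDW when a share of its ETL work moves to Hadoop."""
    if not 0 <= etl_percent <= 100 or not 0 <= offloaded_percent <= 100:
        raise ValueError("percentages must be between 0 and 100")
    etl_hours = total_cpu_hours * etl_percent / 100
    return etl_hours * offloaded_percent / 100

# Hypothetical warehouse with 1,000 CPU hours available per day.
low = freed_cpu_hours(1000, 45)    # lower bound of the quoted ETL share
high = freed_cpu_hours(1000, 65)   # upper bound of the quoted ETL share
partial = freed_cpu_hours(1000, 45, offloaded_percent=50)  # only half the ETL moved
```

Under these assumptions, a full offload would return between 450 and 650 of the 1,000 daily CPU hours to analytical workloads, and moving half of the ETL at the lower bound would return 225.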
Increasing data volumes and cost pressures force many companies to archive old data to tape where it can’t be analyzed or must be retrieved at great expense.
A Hadoop data warehouse architecture offers cost per terabyte on par with tape backup solutions. Because of the appealing cost, you can store years of data rather than months. All of your enterprise data remains available for retrieval, query and deep analytics with the same tools you use on existing EDW systems.
Proprietary EDW systems were adopted for fast BI and deep slice-and-dice analytics, but EDW prices are unsustainably high and these systems have not adapted to modern big data challenges like unstructured data and large-scale analytics.
Hortonworks and JethroData make fast BI on Hadoop a reality by combining a fast in-memory SQL engine for creating data marts with a high-performance analytical engine for interactive BI that lets you support thousands of concurrent users and query huge datasets in seconds. JethroData dynamically creates indexes and cubes, lowering the upfront setup cost and time. Standard access to all your Hadoop data is supported with your existing BI tools, such as Tableau, Qlik or MicroStrategy.