April 13, 2016

Hadoop, Data Lakes and Building a Next Gen Big Data Architecture

by Tendu Yogurtcu, PhD – General Manager of Big Data, Syncsort

This week, Hortonworks announced an exciting expansion of our long-standing partnership. Hortonworks will now resell Syncsort’s leading Hadoop data integration software, DMX-h, for onboarding ETL processing in Hadoop. DMX-h will enable our joint customers to easily access and collect data from a diverse set of enterprise data sources, including RDBMSs, mainframe and emerging streaming data sources, and to bring all of that data to the Hortonworks Data Platform (HDP™) for bigger insights and greater business agility.

Our expanded partnership with Hortonworks primarily targets the challenges of big data integration and of putting an organization’s existing skills to work on Hadoop quickly.

This is the first time that Hortonworks, a leading innovator in open and connected platforms, has chosen to resell commercial partner software. We are excited that Hortonworks has chosen DMX-h, and by the added value it can bring to Hortonworks customers.

Why did Hortonworks choose Syncsort’s data integration solution? Because of Syncsort’s unique value proposition: enabling organizations to access and integrate enterprise-wide data and bring all of it to the Hadoop data lake. The key differentiators from other data integration solutions include:

  • Easy and lightweight ETL deployment on premises and in the cloud, and quick onboarding with Hadoop
  • Simple, secure and efficient approach to building the Hadoop data lake by accessing all enterprise data sources, including mainframe
  • Strong commitment to, and track record of, contributing to the Apache Hadoop and Apache Spark open source projects, yielding scalability, interoperability and future-proofing benefits through DMX-h’s native integration with Hadoop
  • Trustworthy global presence in 87% of enterprise Fortune 500 companies

Due to Syncsort’s ongoing contributions to Apache Hadoop projects, DMX-h is natively integrated into the Hadoop data flow, providing interoperability and scalability. DMX-h is deployed via Apache Ambari, is integrated out of the box with security frameworks including Kerberos, Apache Ranger and Knox, and is integrated with HCatalog, providing data governance across platforms. Syncsort DMX-h’s ‘design once, deploy anywhere’ native architecture has helped many organizations run their applications seamlessly when migrating from MapReduce v1 to YARN, and guarantees the same future proofing for emerging compute platforms, such as Apache Spark.

As Scott Gnau’s recent blog post outlined, future proofing is critical to dealing with rapid change in the technology stack, and a platform that can connect existing enterprise data, whether from legacy mainframes or RDBMSs, with new streaming data systems will be key to the success of big data initiatives. To ensure solid ROI, organizations must include all critical sources of enterprise data, including those traditionally handled in a silo such as mainframe, while leveraging existing skill sets and a cost-effective, scalable platform such as HDP.

Syncsort and Hortonworks are committed to helping enterprises overcome the challenges of transitioning to a next-generation data architecture. Our customers are already taking advantage of our joint offering. One of our Fortune 500 customers is using DMX-h on HDP for customer churn analytics, improving customer service and driving operational efficiencies. They populate HDP with data from RDBMSs including Oracle, SQL Server, and DB2 on the mainframe. Syncsort automates metadata mapping from these data sources to Hive tables, ensuring secure access to data and populating the data lake. DMX-h is used both to synchronize the data on the cluster and to prepare it for advanced analytics.

This single software environment provides a data pipeline that can be used for both batch and streaming data, insulating applications from the underlying compute framework, whether Hadoop MapReduce or Spark, on premises or in the cloud. The benefits to our customers are rapid application development using existing skills, agility in integrating additional data sources such as streaming data, and portability for future cloud deployment. All of these advantages are delivered securely and in keeping with regulatory compliance requirements, enabling organizations to bring new services and products to market quickly with increased return on investment.

Together, HDP and Syncsort DMX-h offer organizations a trusted solution for integrating ETL workflows with connected data platforms. Right now, at the Hadoop Summit in Dublin, Hortonworks and Syncsort are together at the Hortonworks booth to provide attendees with more information on the joint solution.



