Modernizing Data Archiving with Hadoop
Historical data is now an essential tool for businesses as they struggle to meet increasingly stringent regulatory requirements, manage risk and perform predictive analytics that help improve business decisions. And while recent data may be available from an enterprise data warehouse, the traditional practice of archiving old data offsite on tape makes business analytics challenging, if not impossible, because the historical information needed is simply unavailable.
Fortunately, the modern approach to data storage and business analytics uses technologies such as virtualization and Hadoop clusters to enable partitioned access to historical data. For example, Hortonworks partnered with Composite Software, now part of Cisco, to create a storage architecture built on this approach. It gave a global investment bank access to critical business data that was previously unavailable for analysis, and enabled the bank to manage risk associated with its credit business. Here’s an interview with David Besemer, Data Virtualization CTO at Cisco, explaining the challenges and solutions.
Q. What were the business needs and challenges of your customer, a bank focusing on global investments?
A. The bank was looking to reduce risk in the credit business while making more profitable credit and bond derivative trading decisions. Additionally, the bank had to meet specific regulatory requirements, and those mandates were driving many of the decisions being made.
Specifically, they wanted to identify risk trends within five years of trading data. For example, they wanted to measure the risk exposure in the bond portfolio by industry, region, credit rating and other parameters. They were also looking to analyze data across industries and over the years to see how those risks changed, and to use that information to build a solid foundation for smart trading decisions.
Q. What were the main data integration challenges?
A. We needed to integrate massive amounts of data from three kinds of sources: market data stored in the data warehouse, online data stored in the cloud, and Hadoop data that would be loaded from tapes. The online data covered 30 days or less but totaled about 400 million records, and the Hadoop data spanned the past five years of trading and market information.
Q. How did Cisco address those challenges?
A. Cisco, which recently acquired Composite Software, has a platform that uses data virtualization to integrate all three types of data for analysis, providing agility in both the storage and reporting layers. The platform virtualizes data from all sources, including Hortonworks Data Platform, and makes it readily accessible to standard reporting tools. This architecture allows the bank to easily and rapidly perform predictive analytics on both recent and historical data and uncover valuable business insights.
Cisco’s modern approach uses virtualization to make data from all sources—data warehouses, the cloud and Hadoop clusters—readily available for fast analysis.
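To make the idea of a virtualized view concrete, here is a minimal Python sketch. It is not Cisco's or Hortonworks' implementation and uses none of their APIs. The Position record, the three fetch functions, the VirtualView class, and all sample values are hypothetical stand-ins invented for illustration. The sketch only shows the general pattern: one query interface over a warehouse, a cloud store and a Hadoop archive, used to total exposure by a chosen parameter.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator


@dataclass(frozen=True)
class Position:
    """Hypothetical bond position record; the field names are assumptions."""
    trade_date: str  # ISO date string, so plain string comparison orders correctly
    industry: str
    region: str
    rating: str
    exposure: float


# Hypothetical stand-ins for the three source systems. Real sources would be
# database, cloud, or Hadoop queries; the rows here are made-up sample data.
def warehouse_rows() -> Iterable[Position]:
    return [
        Position("2014-03-01", "Energy", "EMEA", "BBB", 100.0),
        Position("2014-03-02", "Banking", "APAC", "A", 50.0),
    ]


def cloud_rows() -> Iterable[Position]:
    return [Position("2014-03-20", "Energy", "AMER", "BB", 25.0)]


def hadoop_rows() -> Iterable[Position]:
    return [
        Position("2010-06-15", "Energy", "EMEA", "A", 200.0),
        Position("2011-01-10", "Banking", "EMEA", "AA", 75.0),
    ]


class VirtualView:
    """Presents several sources as one scannable data set without copying them."""

    def __init__(self, sources: Dict[str, Callable[[], Iterable[Position]]]):
        self._sources = sources

    def scan(self, start: str, end: str) -> Iterator[Position]:
        # Pull rows from every source lazily and keep those in the date range.
        for fetch in self._sources.values():
            for row in fetch():
                if start <= row.trade_date <= end:
                    yield row


def exposure_by(view: VirtualView, key: str, start: str, end: str) -> Dict[str, float]:
    """Total exposure grouped by one Position attribute, e.g. 'industry'."""
    totals: Dict[str, float] = defaultdict(float)
    for row in view.scan(start, end):
        totals[getattr(row, key)] += row.exposure
    return dict(totals)


view = VirtualView({
    "warehouse": warehouse_rows,
    "cloud": cloud_rows,
    "hadoop": hadoop_rows,
})
five_year = exposure_by(view, "industry", "2010-01-01", "2014-12-31")
recent = exposure_by(view, "industry", "2014-01-01", "2014-12-31")
```

In this sketch the reporting code calls exposure_by without knowing which system holds a given row, which is the property the interview attributes to the virtualization layer. Comparing five_year with recent is a toy version of looking at how exposure changes over time.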
Q. What results and technical benefits did you realize?
A. Using this approach, the bank was able to reduce risk much sooner than expected, adding $1 million of value back into the business. Immediately, there was a one-time benefit from faster time-to-market for key risk management initiatives. With access to more than 50 data sources, they now have better insight with greater visibility into trading activities. Additionally, they’ve seen development productivity increase substantially, for a savings of nearly $1.7 million to date, and ongoing recurring benefits associated with a greater number of deployed projects and data re-use.
Q. Can you share any feedback from the customer?
A. According to the head of data analytics and visualization for credit, the implementation was tremendously successful, especially with the limited resources they had to work with. Now, the bank can perform detailed analytics to support critical business decisions, which was impossible before implementing the platform.
Thank you, Composite Software and Cisco, for this use case.