Freddie Mac makes home possible for millions of families and individuals by providing mortgage capital to lenders. Since our creation in 1970, we’ve made housing more accessible and affordable for homebuyers and renters in communities nationwide. We are building a better housing finance system for homebuyers, renters, lenders and taxpayers.
KPMG, headquartered in the Netherlands, is a global network of professional firms providing audit, tax, and advisory services.
Technology leaders at both Freddie Mac and KPMG have developed a framework to accelerate the “data wrangling” process so our businesses can draw insights and provide feedback to customers and internal stakeholders as soon as a product is launched.
Next month, at the San Jose DataWorks Summit (June 13-15), a colleague from KPMG and I will present our Big Data achievements and results.
A Freddie Mac and KPMG Case Study: PySpark for Advanced Analytics and Insights over Semi-Structured Data
Freddie Mac and KPMG have developed a common, generic, self-learning data engineering framework to handle all data integration challenges from multiple sources with one solution. The reusable, extensible program executes against multi-dimensional, semi-structured XML data sets and leverages Jupyter Notebook as well as core Apache components of Hortonworks Data Platform (such as Spark, Hive, Oozie and Zeppelin). By leveraging PySpark and other tools, the modular architecture provides faster, easier data processing with lower development cost.
The resulting analytics allow Freddie Mac to extract knowledge and insights to roll out new product capabilities, risk monitoring, quality control sampling and fraud analytics. The application runs the processes in a highly distributed and memory-intensive framework to reduce processing time. This high-level overview of the Freddie Mac Big Data Solution will share best practices for generically processing semi-structured data while retaining the complex structures needed by data scientists and teams focused on advanced analytics.
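To make the idea of generically processing semi-structured data while retaining its complex structure more concrete, here is a minimal sketch in plain Python. It is not the Freddie Mac and KPMG framework. The function names (`element_to_record`, `parse_document`), the `@` and `#text` key conventions, and the sample loan document are all invented for illustration. The sketch converts any XML document into nested dictionaries and lists without a hard-coded schema, so repeated and nested elements survive intact.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def element_to_record(elem):
    """Recursively convert an XML element into nested dicts and lists."""
    children = list(elem)
    text = (elem.text or "").strip()

    if not children and not elem.attrib:
        # Leaf element: keep only its text (None when empty).
        return text or None

    # Attributes are stored under "@name" keys (an illustrative convention).
    record = {f"@{name}": value for name, value in elem.attrib.items()}

    # Group children by tag so repeated elements are kept together.
    grouped = defaultdict(list)
    for child in children:
        grouped[child.tag].append(element_to_record(child))

    for tag, values in grouped.items():
        # Repeated tags stay as lists; a single child collapses to one value.
        record[tag] = values if len(values) > 1 else values[0]

    if text:
        record["#text"] = text
    return record


def parse_document(xml_string):
    """Parse one XML document into a nested record keyed by its root tag."""
    root = ET.fromstring(xml_string)
    return {root.tag: element_to_record(root)}


# Made-up sample data, for illustration only.
SAMPLE = """
<loan id="L-001">
  <borrower><name>Jane Doe</name></borrower>
  <borrower><name>John Doe</name></borrower>
  <property><state>VA</state></property>
</loan>
"""

record = parse_document(SAMPLE)
print(record)
```

In a PySpark job, a function like `parse_document` could be applied with `RDD.map` to an RDD of XML strings, so each document is parsed in parallel. One design caveat: collapsing single children to a plain value keeps records compact, but it means the same tag can be a value in one document and a list in another, so a production pipeline would likely normalize this against an inferred or declared schema.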