As the amount of data in our world continues to grow at an exponential rate, big data technologies and practices are evolving rapidly. Initially, companies focused on finding a cost-effective and scalable way to store and manage this data. By implementing technologies such as Apache Hadoop®, they were able to store structured and unstructured data in a single data lake while reducing the operational costs of data warehouses and data marts. Now, as they become aware of the strategic potential of their information resources, companies are striving to analyze this data in new ways, moving beyond simple cost reduction to big data initiatives designed to drive competitive advantage, increase revenue and improve profitability.
In this white paper, we outline the Hortonworks approach to data science initiatives within an organization. In our view, businesses can greatly improve the success of these initiatives and extract maximum value from them by following four recommendations:
• Provide a secure and rich data exploration environment for enterprise assets in the data lake
• Leverage a flexible platform that can support existing and emerging techniques
• Accelerate data science initiatives with self-service and collaboration
• Operationalize data science initiatives with full life cycle model management