In baseball, Hadoop could provide insights traditional stats can’t

The "Moneyball" phenomenon in baseball is one of the most frequently cited, high-profile examples of an industry using data-driven insights to improve performance. But the way professional baseball teams handle statistics is changing as they try to account for less tangible variables. Experts have predicted that teams may soon begin using Hadoop clusters to exploit the unstructured data that accompanies traditional stats.

In a recent interview with CNBC, Paul DePodesta, vice president of player development and scouting for the New York Mets, explained that baseball's data-driven focus is only increasing and becoming more complex. As interest in baseball data grows, so does the volume of information teams must process to make use of it.

Spotting substantive trends that accurately forecast player behavior is a major challenge, particularly as teams try to separate skill from luck in their analysis. According to TechCrunch contributor and venture capitalist Barry Eggers, that challenge has led at least one major league team to consider building a small Hadoop cluster. A future in which teams have locker room data scientists running in-game queries in HBase may not be far off.

"So why would a baseball organization need a Hadoop cluster?" Eggers wrote. "Because unstructured data may unlock insights that are not apparent from the structured event data that is available to every team."
