Open Data Spotlight: The Ultimate European Soccer Database | Hugo Mathien
Whether you call it soccer or football, this sport is the world's favorite to watch and play. Thanks to Hugo Mathien, who compiled, cleaned, and shared a dataset of European professional football stats on Kaggle, it can become a data scientist's favorite playground, too. Among other data points, the database includes 25,000 matches from 2008 to 2016, 10,000 players from 11 countries, and betting odds from up to 10 providers. This impressive collection of data allows Kagglers to test their machine learning techniques by building models that predict match outcomes (can you beat the bookies?) and to find insights through data visualization and storytelling. In this interview, Hugo explains how he pulled data from a number of sources using Python's Scrapy and overcame data integrity issues with manual effort to build this incredible dataset for Kagglers to enjoy.
Aug-22-2016, 20:15:35 GMT
- Country:
  - Europe
    - France (0.05)
    - Ireland (0.05)
    - Portugal > Aveiro
      - Aveiro (0.05)
    - United Kingdom > England
      - Greater London > London (0.05)
      - Leicestershire > Leicester (0.05)
  - North America > United States (0.05)
- Industry:
  - Leisure & Entertainment > Sports > Soccer (1.00)
- Technology: