machine learning mania 2016
March Machine Learning Mania 2016, Winner's Interview: 1st Place, Miguel Alomar
The annual March Machine Learning Mania competition sponsored by SAP challenged Kagglers to predict the outcomes of every possible match-up in the 2016 men's NCAA basketball tournament. Nearly 600 teams competed, but only the first place forecasts were robust enough against upsets to top this year's bracket. In this blog post, Miguel Alomar describes how calculating the offensive and defensive efficiency played into his winning strategy. I earned a Master's Degree in Computer Science from UIB in Mallorca, Spain. For nearly 20 years, I have been involved in software development, business intelligence and data warehousing.
February 2016: Scripts of the Week
February's batch of Scripts of the Week highlights some of the month's best content produced by Kagglers on our public datasets. It also includes a great getting-started script predicting outcomes of the 2016 NCAA basketball tournaments for March Machine Learning Mania 2016. Actually, I'm quite new to Kaggle, and before entering the jungle of competitions, I wanted to train on datasets. I believe that training on datasets is a good way to start on Kaggle: there is no deadline, no competition. I chose this dataset because it clearly called for sentiment analysis, which is a classic text-mining exercise.
Understanding log_loss - March Machine Learning Mania 2016
I read somewhere on this forum that last year's winner had only 18 misclassifications. That made me wonder why my submitted model is only in 126th place on the leaderboard (my current result is 42 correct predictions and 14 incorrect ones). It seems that if I had made my predictions more aggressive, something like: if pred > 0.5: pred = 0.95 else: pred = 0.05, my log_loss would be 0.34 instead of 0.53. I'm curious: is it common to manually edit your probabilities like this, or do other classifiers such as neural networks (which I have never tried) simply predict with higher confidence?
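The trade-off in the question above can be checked with a few lines of Python. This is a minimal sketch, not the competition's scoring code: `log_loss` and `sharpen` are illustrative helpers, the raw probabilities (0.7 and 0.4) are hypothetical, and only the 42/14 record and the 0.95/0.05 rule come from the post.

```python
import math

def log_loss(outcomes, probs, eps=1e-15):
    """Mean binary log loss; outcomes are 0/1, probs are P(outcome == 1)."""
    total = 0.0
    for y, p in zip(outcomes, probs):
        p = min(max(p, eps), 1 - eps)  # clip so log(0) can never occur
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(outcomes)

def sharpen(p, hi=0.95, lo=0.05):
    """Thresholding rule from the post: push every pick to hi or lo."""
    return hi if p > 0.5 else lo

# Hypothetical record: 56 games in which the first-listed team always won.
# The model leaned the right way in 42 of them and the wrong way in 14.
outcomes = [1] * 56
raw = [0.7] * 42 + [0.4] * 14          # assumed raw probabilities
sharpened = [sharpen(p) for p in raw]  # becomes 0.95 or 0.05

raw_loss = log_loss(outcomes, raw)
sharpened_loss = log_loss(outcomes, sharpened)
print(f"raw: {raw_loss:.3f}  sharpened: {sharpened_loss:.3f}")
```

Under these assumptions the sharpened loss comes out at roughly 0.79, because each wrong pick at 0.05 costs about 3.0 while each right pick at 0.95 saves only a little. Reproducing the 0.34 quoted in the post would require the actual per-game probabilities, which the post does not give.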
Coulda, Woulda, Shoulda - March Machine Learning Mania 2016
The March Madness competitions have the perfect trifecta for regret: repeated leaderboard feedback, ability to tweak probabilities based on personal biases, and a wealth of different data points. So what would you do differently for next year? The only tweak I did was changing the round 1 probabilities for seed 1s to 100% on one submission. I should have gambled a lot more because my two submissions are so similar. I didn't notice this because I just saw that the predicted champions were different, without realizing that all the probabilities were within 2-3% of each other.
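One way to catch near-duplicate submissions before the deadline is to compare them match-up by match-up. The sketch below is only an illustration: `submission_gap` is a hypothetical helper, and the match-up ids and probabilities are made up.

```python
def submission_gap(sub_a, sub_b):
    """Return (max, mean) absolute difference over shared match-up ids."""
    shared = sorted(set(sub_a) & set(sub_b))
    if not shared:
        raise ValueError("no shared match-up ids")
    diffs = [abs(sub_a[k] - sub_b[k]) for k in shared]
    return max(diffs), sum(diffs) / len(diffs)

# Hypothetical ids and probabilities, for illustration only.
first = {"2016_1112_1114": 0.62, "2016_1112_1122": 0.81, "2016_1114_1122": 0.45}
second = {"2016_1112_1114": 0.60, "2016_1112_1122": 0.83, "2016_1114_1122": 0.47}

max_gap, mean_gap = submission_gap(first, second)
print(f"max gap {max_gap:.3f}, mean gap {mean_gap:.3f}")
```

A maximum gap of only a few percentage points suggests the two entries are effectively the same bet, even if they name different champions.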