 Education


In South Korea, dreams of fame at the School of Go

The Japan Times

SEOUL – For the past two years, 12-year-old Cho Sung-bin has spent nearly all his waking hours focused on a wooden board covered with black and white stones, honing the skills he hopes to translate into a lucrative career as a professional go player. "I never get tired," said Cho, one of dozens of preteens sitting at rows of desks topped with playing boards at the Lee Sedol School of go in central Seoul. Many spend 12 hours a day practicing match play with each other in the small, largely windowless rooms of the school, which is named after the grandmaster they all hope to emulate. Already well known in East Asia, Lee achieved global recognition in March when he took on Google's artificial intelligence AlphaGo program in a five-match showdown. The 33-year-old lost the series, but the battle gave an unprecedented boost to the ancient board game's international profile. Go originated in China 3,000 years ago and has been played for centuries mostly in China, Japan and South Korea, with more than 40 million fans worldwide. Two players take turns placing black or white stones on a square board with a 19-by-19 grid. But the strategies needed to secure victory are complex, with reportedly more possible move configurations than atoms in the universe. "Go is not just an entertainment.


My data science journey

@machinelearnbot

I describe here the projects that I worked on, as well as my career progress, starting 25 years ago as a PhD student in statistics, until today, and the transformation from statistician to data scientist that occurred slowly and started more than 20 years ago. This also illustrates many applications of data science, most of which are still active. My interest in mathematics started when I was 7 or 8. I remember being fascinated by the powers of 2 in primary school, and later purchasing cheap Russian math books (Mir publisher) translated into French, for my entertainment. In high school, I participated in the mathematical olympiads, and did my own math research during math classes, rather than listening to the very boring lessons. When I attended college, I stopped showing up in the classroom altogether - after all, you could just read the syllabus, memorize the material before the exam and regurgitate it at the exam.


The Offset Tree for Learning with Partial Labels

arXiv.org Artificial Intelligence

We present an algorithm, called the Offset Tree, for learning to make decisions in situations where the payoff of only one choice is observed, rather than all choices. The algorithm reduces this setting to binary classification, allowing one to reuse any existing, fully supervised binary classification algorithm in this partial information setting. We show that the Offset Tree is an optimal reduction to binary classification. In particular, it has regret at most $(k-1)$ times the regret of the binary classifier it uses (where $k$ is the number of choices), and no reduction to binary classification can do better. This reduction is also computationally optimal, both at training and test time, requiring just $O(\log_2 k)$ work to train on an example or make a prediction. Experiments with the Offset Tree show that it generally performs better than several alternative approaches.
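
To make the reduction idea concrete, here is a minimal sketch of the two-choice case only, not the full tree. It assumes rewards lie in [0, 1], that the probability of the logged choice is known, and that the offset is 1/2. The names `PartialExample` and `binary_offset` are hypothetical and chosen for illustration; the abstract above does not specify any of these details.

```python
from dataclasses import dataclass
from typing import Hashable, List, Tuple


@dataclass(frozen=True)
class PartialExample:
    """One logged decision: only the reward of the chosen action is known."""
    x: Hashable     # features, in whatever form the binary learner accepts
    action: int     # the choice that was taken, 0 or 1
    reward: float   # observed reward of that choice (assumed in [0, 1])
    prob: float     # probability with which that choice was taken (assumed known)


def binary_offset(
    examples: List[PartialExample], offset: float = 0.5
) -> List[Tuple[Hashable, int, float]]:
    """Turn two-choice partial-label data into (x, label, weight) triples.

    A reward above the offset is evidence for the chosen action; a reward
    below it is evidence for the other action. The weight is the distance
    from the offset, divided by the probability of the logged choice.
    """
    weighted = []
    for ex in examples:
        if ex.action not in (0, 1):
            raise ValueError("this sketch handles exactly two choices")
        if not 0.0 < ex.prob <= 1.0:
            raise ValueError("prob must be in (0, 1]")
        gap = ex.reward - offset
        label = ex.action if gap >= 0 else 1 - ex.action
        weighted.append((ex.x, label, abs(gap) / ex.prob))
    return weighted


if __name__ == "__main__":
    log = [
        PartialExample(x="user-a", action=1, reward=0.75, prob=0.5),
        PartialExample(x="user-b", action=1, reward=0.25, prob=0.25),
    ]
    for triple in binary_offset(log):
        print(triple)
```

The resulting triples can be passed to any binary learner that accepts per-example weights. Extending this to $k$ choices by arranging such binary decisions in a tree is what the abstract describes, and is not attempted here.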


How to Set Up Distributed XGBoost on MapR-FS

#artificialintelligence

XGBoost is a library that is designed for boosted (tree) algorithms. It has become a popular machine learning framework among data science practitioners, especially on Kaggle, which is a platform for data prediction competitions where researchers post their data and statisticians and data miners compete to produce the best models. For structured learning problems on Kaggle, it can be difficult to get into the top 10 without including XGBoost. Typically, data scientists use multi-threaded single machines to train XGBoost models. Very few people have deployed XGBoost in a distributed environment and achieved good performance.


Intro to Machine Learning Udacity

#artificialintelligence

You'll learn how to start with a question and/or a dataset, and use machine learning to turn them into insights. Naive Bayes: We jump in headfirst, learning perhaps the world's greatest algorithm for classifying text. The ability to generate new features independently and on the fly. Behind any great machine learning project is a great dataset that the algorithm can learn from. We were inspired by a treasure trove of email and financial data from the Enron corporation, which would normally be strictly confidential but became public when the company went bankrupt in a blizzard of fraud.


How to Get Started with Machine Learning in R

#artificialintelligence

R has been the gold standard in applied machine learning for a long time. Surveys show that it is the most popular platform used by professional data scientists. It is also preferred by the best data scientists in the world on the competitive machine learning site Kaggle.com. In this mega Ebook written in the friendly Machine Learning Mastery style that you're used to, learn how to get started, practice and apply machine learning using the R platform. As a developer you know how to pick up a new programming language quickly.


7 Business Schools Exploring EdTech -- From Artificial Intelligence To Oculus Rift

#artificialintelligence

When Moocs burst onto the scene five years ago, many predicted business schools' demise. Wharton professors Christian Terwiesch and Karl Ulrich wrote that Moocs are a "Trojan Horse" with the potential to "destroy" the full-time MBA. But rather than killing the campus, they have become an example of the whizzy digital innovations being embraced by even the oldest Ivy League institutions. "You can expect us to take engaged learning to another level where we implement technology. We're already moving in that direction," says Alison Davis-Blake, dean of the University of Michigan's Ross School of Business. "Online education is one part of it," says Soumitra Dutta, dean of Cornell University's Johnson School of Management.


Software News: Paul Allen doubles down on artificial intelligence research in Seattle

#artificialintelligence

AI is getting bigger and smarter in Seattle. The Allen Institute for Artificial Intelligence, a sister company to Paul Allen's Institute for Brain Science, plans to hire 25 people in the next year as it prepares to take its Aristo technology to the eighth grade, moving on from teaching it fourth-grade science.


Patent Law at the AI Crossroads

#artificialintelligence

Smart robots seem to be everywhere. Whether they're performing surgery, trouncing Go champions or generating dreamy artwork, computers programmed to learn on their own are growing more intelligent by the day. Southwestern Law School professor Ryan Abbott believes that computers are even generating patentable subject matter. We just don't know about it, he says, because disclosing it on an application might render the invention unpatentable. "Now that very large companies like IBM, Pfizer and Google are investing heavily in creative computing, it's going to play a much greater role in innovation in the future," he says.


Avoiding Complexity of Machine Learning Problems

#artificialintelligence

Today, more and more products and engineering teams rely on machine learning (referred to as ML throughout this blog post). The abundance of open source tools and libraries also makes it much easier to learn, develop, and build ML models even for people with little prior knowledge or experience. ML is a powerful tool for many problems, but it comes with costs -- it can introduce complexity to systems, which builds up over time and evolves into large technical debt. A recent publication by Google argues that it is remarkably easy to incur massive ongoing maintenance costs at the system level when applying ML (see Reference 1). At Quora, we've been using ML to tackle many interesting problems such as ranking, search, recommendation, and spam detection (see References 2, 3, and 4).