Transfer Learning - Machine Learning's Next Frontier

#artificialintelligence

In recent years, we have become increasingly good at training deep neural networks to learn a very accurate mapping from inputs to outputs, whether they are images, sentences, label predictions, etc., from large amounts of labeled data. What our models still frightfully lack is the ability to generalize to conditions that are different from the ones encountered during training. This matters every time you apply your model not to a carefully constructed dataset but to the real world. The real world is messy and contains an infinite number of novel scenarios, many of which your model has not encountered during training and for which it is in turn ill-prepared to make predictions. The ability to transfer knowledge to new conditions is generally known as transfer learning and is what we will discuss in the rest of this post. Over the course of this blog post, I will first contrast transfer learning with machine learning's most pervasive and successful paradigm, supervised learning. I will then outline reasons why transfer learning warrants our attention. Subsequently, I will give a more technical definition and detail different transfer learning scenarios. I will then provide examples of applications of transfer learning before delving into practical methods that can be used to transfer knowledge.


The AI Used To Sell You More Stuff Can Now Read Better Than A Human

#artificialintelligence

For the first time ever, two AI systems built to process and respond to human speech (created, respectively, by Microsoft and Chinese commerce giant Alibaba) outscored humans in a reading comprehension test designed by Stanford researchers. The Stanford Question Answering Dataset, SQuAD, is composed of a staggering 100,000 questions following brief reading passages. Created in 2016, SQuAD is used as a benchmark to measure AI's progress in natural language processing. After reading excerpts from Wikipedia, the systems answer questions such as "What is the Latin name for Black Death?" and "How many actors have played Doctor Who?" Alibaba's AI score was 82.44, and Microsoft's was 82.650, with humans trailing behind them both at 82.304. Alibaba's system may have finished second, but it's more than qualified to handle its day job: Working in sales. The company's AI team reportedly works closely with the developers of Ali Xiaomi, a chat bot that answers customer questions about products.


Learning Path: Data Science With Apache Spark 2

@machinelearnbot

The real power and value proposition of Apache Spark lie in its speed and in its platform for executing data processing and data science tasks. Let's see how easy it is! Packt's Video Learning Paths are a series of individual video products put together in a logical and stepwise manner such that each video builds on the skills learned in the video before it. Spark is one of the most widely used large-scale data processing engines and runs extremely fast. It is a framework whose tools are equally useful for application developers and data scientists.


AI models beat humans at reading comprehension, but they've still got a ways to go

@machinelearnbot

When computer models designed by tech giants Alibaba and Microsoft this month surpassed humans for the first time in a reading-comprehension test, both companies celebrated the success as a historic milestone. Luo Si, the chief scientist for natural-language processing at Alibaba's AI research unit, struck a poetic note, saying, "Objective questions such as 'what causes rain' can now be answered with high accuracy by machines." Teaching a computer to read has for decades been one of artificial intelligence's holiest grails, and the feat seemed to signal a coming future in which AI could understand words and process meaning with the same fluidity humans take for granted every day. But computers aren't there yet -- and aren't even really that close, said AI experts who reviewed the test results. Instead, the accomplishment highlights not just how far the technology has progressed, but also how far it still has to go. "It's a large step" for the companies' marketing "but a small step for humankind," said Oren Etzioni, chief executive of the Allen Institute for Artificial Intelligence, an AI research group funded by Microsoft co-founder Paul Allen.


Efficient Online Bandit Multiclass Learning with $\tilde{O}(\sqrt{T})$ Regret

arXiv.org Machine Learning

We present an efficient second-order algorithm with $\tilde{O}(\frac{1}{\eta}\sqrt{T})$ regret for the bandit online multiclass problem. The regret bound holds simultaneously with respect to a family of loss functions parameterized by $\eta$, for a range of $\eta$ restricted by the norm of the competitor. The family of loss functions ranges from hinge loss ($\eta=0$) to squared hinge loss ($\eta=1$). This provides a solution to the open problem of (J. Abernethy and A. Rakhlin. An efficient bandit algorithm for $\sqrt{T}$-regret in online multiclass prediction? In COLT, 2009). We test our algorithm experimentally, showing that it also performs favorably against earlier algorithms.


MORF: A Framework for MOOC Predictive Modeling and Replication At Scale

arXiv.org Machine Learning

The MOOC Replication Framework (MORF) is a novel software system for feature extraction, model training/testing, and evaluation of predictive dropout models in Massive Open Online Courses (MOOCs). MORF makes large-scale replication of complex machine-learned models tractable and accessible for researchers, and enables public research on privacy-protected data. It does so by focusing on the high-level operations of an extract-train-test-evaluate workflow, and enables researchers to encapsulate their implementations in portable, fully reproducible software containers which are executed on data with a known schema. MORF's workflow allows researchers to use data in analysis without providing them access to the underlying data directly, preserving privacy and data security. During execution, containers are sandboxed for security and to prevent data leakage, and parallelized for efficiency, allowing researchers to create and test new models rapidly, on large-scale multi-institutional datasets that were previously inaccessible to most researchers. MORF is provided both as a Python API (the MORF Software, for institutions to use on their own MOOC data) and in a platform-as-a-service (PaaS) model with a web API and a high-performance computing environment (the MORF Platform).


Deep Learning: An Introduction for Applied Mathematicians

arXiv.org Machine Learning

Multilayered artificial neural networks are becoming a pervasive tool in a host of application fields. At the heart of this deep learning revolution are familiar concepts from applied and computational mathematics; notably, in calculus, approximation theory, optimization and linear algebra. This article provides a very brief introduction to the basic ideas that underlie deep learning from an applied mathematics perspective. Our target audience includes postgraduate and final year undergraduate students in mathematics who are keen to learn about the area. The article may also be useful for instructors in mathematics who wish to enliven their classes with references to the application of deep learning techniques. We focus on three fundamental questions: what is a deep neural network? how is a network trained? what is the stochastic gradient method? We illustrate the ideas with a short MATLAB code that sets up and trains a network. We also show the use of state-of-the-art software on a large-scale image classification problem. We finish with references to the current literature.
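To make the third question concrete, here is a minimal sketch of the stochastic gradient method in Python. It is not the article's MATLAB code: it assumes a toy one-dimensional linear model rather than a deep network, and the function name `sgd_linear_fit`, the learning rate, and the step count are all illustrative choices. The idea it shows is the general one: at each step, pick a single training example at random and move the parameters a small distance against the gradient of that example's loss.

```python
import random


def sgd_linear_fit(xs, ys, learning_rate=0.05, steps=5000, seed=0):
    """Fit y ~ w * x + b by stochastic gradient descent on squared error.

    Hypothetical helper for illustration only; all defaults are assumptions.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    w, b = 0.0, 0.0
    for _ in range(steps):
        i = rng.randrange(len(xs))        # sample one training example
        error = (w * xs[i] + b) - ys[i]   # prediction minus target
        # Gradient of 0.5 * error**2 with respect to w and b.
        w -= learning_rate * error * xs[i]
        b -= learning_rate * error
    return w, b


if __name__ == "__main__":
    # Noise-free data generated from y = 2x + 1.
    w, b = sgd_linear_fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
    print(w, b)
```

On this noise-free data the estimates should settle close to w = 2 and b = 1. Training a real network follows the same loop, with the gradient computed by backpropagation instead of by hand.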


Ranking Data with Continuous Labels through Oriented Recursive Partitions

arXiv.org Machine Learning

We formulate a supervised learning problem, referred to as continuous ranking, where a continuous real-valued label Y is assigned to an observable r.v. X taking its values in a feature space $\mathcal{X}$ and the goal is to order all possible observations x in $\mathcal{X}$ by means of a scoring function $s:\mathcal{X}\rightarrow \mathbb{R}$ so that s(X) and Y tend to increase or decrease together with highest probability. This problem generalizes bi/multi-partite ranking to a certain extent and the task of finding optimal scoring functions s(x) can be naturally cast as optimization of a dedicated functional criterion, called the IROC curve here, or as maximization of the Kendall ${\tau}$ related to the pair (s(X), Y). From the theoretical side, we describe the optimal elements of this problem and provide statistical guarantees for empirical Kendall ${\tau}$ maximization under appropriate conditions for the class of scoring function candidates. We also propose a recursive statistical learning algorithm tailored to empirical IROC curve optimization and producing a piecewise constant scoring function that is fully described by an oriented binary tree. Preliminary numerical experiments highlight the difference in nature between regression and continuous ranking and provide strong empirical evidence of the performance of empirical optimizers of the criteria proposed.
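As a rough illustration of the Kendall ${\tau}$ criterion mentioned above, the following Python sketch computes an empirical Kendall ${\tau}$ between scores s(x_i) and labels y_i by counting concordant and discordant pairs. It assumes the simple tau-a variant, in which tied pairs count as neither; the paper's exact estimator may differ, and the function name `kendall_tau` is an illustrative choice, not taken from the paper.

```python
from itertools import combinations


def kendall_tau(scores, labels):
    """Empirical Kendall tau (tau-a variant) between scores and labels.

    A pair (i, j) is concordant when the scores and the labels order
    the two observations the same way, and discordant when they disagree.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have equal length")
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two observations")
    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        product = (scores[i] - scores[j]) * (labels[i] - labels[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        # product == 0 is a tie and contributes to neither count
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


if __name__ == "__main__":
    # A scoring function that orders observations exactly as the labels do.
    print(kendall_tau([0.1, 0.4, 0.9], [10.0, 20.0, 30.0]))
```

A scoring function that orders every pair the same way as the labels attains the maximum value of 1, and one that reverses every pair attains -1, which is why maximizing this quantity over candidate scoring functions is a natural ranking objective.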


Vehicle Detection and Tracking – Towards Data Science

@machinelearnbot

This is the final project for the first term of Udacity's Self-Driving Car Engineer Nanodegree Program. The goal is to write a software pipeline to identify vehicles in a video from a front-facing camera on a car. In my implementation, I used a Deep Learning approach to image recognition. Specifically, I leveraged the extraordinary power of Convolutional Neural Networks (CNNs) to recognize images. However, the task at hand is not just to detect a vehicle's presence, but rather to point to its location. It turns out CNNs are suitable for this type of problem as well.


Lane Detection with Deep Learning (Part 2) – Towards Data Science

@machinelearnbot

This is part two of my deep learning solution for lane detection, which covers the actual models I created in finding my final approach to the problem, as well as some potential improvements. Be sure to read Part One for the limitations of my previous approaches as well as the preliminary data used prior to the changes I made below. The code and data mentioned here and in the earlier post can be found in my GitHub repo. With a decent dataset created, I was ready to make my first model for using deep learning to detect lane lines. You may be asking, "Wait, I thought you were trying to get rid of perspective transformation?"