Overview
A Survey of Current Practice and Teaching of AI
Wollowski, Michael (Rose-Hulman Institute of Technology) | Selkowitz, Robert (Canisius College) | Brown, Laura E. (Michigan Technological Institute) | Goel, Ashok (Georgia Institute of Technology) | Luger, George (University of New Mexico) | Marshall, Jim (Sarah Lawrence College) | Neel, Andrew (Discover Cards) | Neller, Todd (Gettysburg College) | Norvig, Peter (Google)
The field of AI has changed significantly in the past couple of years and will likely continue to do so. Driven by a desire to expose our students to relevant and modern materials, we conducted two surveys, one of AI instructors and one of AI practitioners. The surveys were aimed at gathering information about the current state of the art of introducing AI as well as gathering input from practitioners in the field on techniques used in practice. In this paper, we present and briefly discuss the responses to those two surveys.
Topic Models to Infer Socio-Economic Maps
Hong, Lingzi (University of Maryland) | Frias-Martinez, Enrique (Telefonica Research) | Frias-Martinez, Vanessa (University of Maryland)
Socio-economic maps contain important information regarding the population of a country. Computing these maps is critical given that policy makers often make important decisions based upon such information. However, the compilation of socio-economic maps requires extensive resources and becomes highly expensive. On the other hand, the ubiquitous presence of cell phones is generating large amounts of spatio-temporal data that can reveal human behavioral traits related to specific socio-economic characteristics. Traditional inference approaches have taken advantage of these datasets to infer regional socio-economic characteristics. In this paper, we propose a novel approach whereby topic models are used to infer socio-economic levels from large-scale spatio-temporal data. Instead of using a pre-determined set of features, we use Latent Dirichlet Allocation (LDA) to extract latent recurring patterns of co-occurring behaviors across regions, which are then used in the prediction of socio-economic levels. We show that our approach improves state-of-the-art prediction results by 9%.
Large Scale Similarity Learning Using Similar Pairs for Person Verification
Yang, Yang (Institute of Automation, Chinese Academy of Sciences) | Liao, Shengcai (Institute of Automation, Chinese Academy of Sciences) | Lei, Zhen (Institute of Automation, Chinese Academy of Sciences) | Li, Stan Z. (Institute of Automation, Chinese Academy of Sciences)
In this paper, we propose a novel similarity measure and then introduce an efficient strategy to learn it by using only similar pairs for person verification. Unlike existing metric learning methods, we consider both the difference and commonness of an image pair to increase its discriminativeness. Under a pair-constrained Gaussian assumption, we show how to obtain the Gaussian priors (i.e., corresponding covariance matrices) of dissimilar pairs from those of similar pairs. The application of a log likelihood ratio makes the learning process simple and fast and thus scalable to large datasets. Additionally, our method is able to handle heterogeneous data well. Results on the challenging datasets of face verification (LFW and PubFig) and person re-identification (VIPeR) show that our algorithm outperforms the state-of-the-art methods.
Tweet Timeline Generation with Determinantal Point Processes
Yao, Jin-ge (Peking University) | Fan, Feifan (Peking University) | Zhao, Wayne Xin (Renmin University of China) | Wan, Xiaojun (Peking University) | Chang, Edward (HTC Research) | Xiao, Jianguo (Peking University)
The task of tweet timeline generation (TTG) aims at selecting a small set of representative tweets to generate a meaningful timeline and providing enough coverage for a given topical query. This paper presents an approach based on determinantal point processes (DPPs) by jointly modeling the topical relevance of each selected tweet and overall selectional diversity. Aiming at better treatment for balancing relevance and diversity, we introduce two novel strategies, namely spectral rescaling and topical prior. Extensive experiments on the public TREC 2014 dataset demonstrate that our proposed DPP model along with the two strategies can achieve fairly competitive results against the state-of-the-art TTG systems.
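The relevance/diversity trade-off described above can be illustrated with a small sketch. It assumes the common quality-diversity form of a DPP kernel, `L[i][j] = q_i * S[i][j] * q_j`, and uses greedy selection as a simple approximation; the abstract does not state the inference procedure, and the paper's spectral rescaling and topical prior are not reproduced here. All function names, relevance scores, and similarity values are hypothetical.

```python
def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def build_kernel(relevance, similarity):
    """Assumed quality-diversity kernel: L[i][j] = q_i * S[i][j] * q_j."""
    n = len(relevance)
    return [[relevance[i] * similarity[i][j] * relevance[j] for j in range(n)]
            for i in range(n)]


def greedy_select(kernel, k):
    """Greedily add the item that maximizes det(L_Y) of the selected set Y."""
    selected = []
    for _ in range(k):
        best_item, best_det = None, 0.0
        for i in range(len(kernel)):
            if i in selected:
                continue
            subset = selected + [i]
            sub = [[kernel[r][c] for c in subset] for r in subset]
            d = determinant(sub)
            if d > best_det:
                best_item, best_det = i, d
        if best_item is None:  # no remaining item adds positive volume
            break
        selected.append(best_item)
    return selected


# Hypothetical data: tweets 0 and 1 are near-duplicates, tweet 2 is distinct.
relevance = [1.0, 0.9, 0.6]
similarity = [
    [1.0, 0.95, 0.1],
    [0.95, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
timeline = greedy_select(build_kernel(relevance, similarity), 2)
print(timeline)
```

In this toy setup the second pick is the less relevant but dissimilar tweet 2 rather than the near-duplicate tweet 1, because the determinant shrinks when two selected items are highly similar.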
Representing Verbs as Argument Concepts
Gong, Yu (Shanghai Jiao Tong University) | Zhao, Kaiqi (Shanghai Jiao Tong University) | Zhu, Kenny Qili (Shanghai Jiao Tong University)
Verbs play an important role in the understanding of natural language text. This paper studies the problem of abstracting the subject and object arguments of a verb into a set of noun concepts, known as the "argument concepts". This set of concepts, whose size is parameterized, represents the fine-grained semantics of a verb. For example, the object of "enjoy" can be abstracted into time, hobby and event, etc. We present a novel framework to automatically infer human readable and machine computable action concepts with high accuracy.
Learning Step Size Controllers for Robust Neural Network Training
Daniel, Christian (TU Darmstadt) | Taylor, Jonathan (Microsoft Research) | Nowozin, Sebastian (Microsoft Research)
This paper investigates algorithms to automatically adapt the learning rate of neural networks (NNs). Starting with stochastic gradient descent, a large variety of learning methods has been proposed for the NN setting. However, these methods are usually sensitive to the initial learning rate, which has to be chosen by the experimenter. We investigate several features and show how an adaptive controller can adjust the learning rate without prior knowledge of the learning problem at hand.
Relaxed Majorization-Minimization for Non-Smooth and Non-Convex Optimization
Xu, Chen (Peking University) | Lin, Zhouchen (Peking University) | Zhao, Zhenyu (National University of Defense Technology) | Zha, Hongbin (Peking University)
We propose a new majorization-minimization (MM) method for non-smooth and non-convex programs, which is general enough to include the existing MM methods. Besides the local majorization condition, we only require that the difference between the directional derivatives of the objective function and its surrogate function vanishes when the number of iterations approaches infinity, which is a very weak condition. So our method can use a surrogate function that directly approximates the non-smooth objective function. In comparison, all the existing MM methods construct the surrogate function by approximating the smooth component of the objective function. We apply our relaxed MM methods to the robust matrix factorization (RMF) problem with different regularizations, where our locally majorant algorithm shows advantages over the state-of-the-art approaches for RMF. This is the first algorithm for RMF ensuring, without extra assumptions, that any limit point of the iterates is a stationary point.
Face Behind Makeup
Wang, Shuyang (Northeastern University) | Fu, Yun (Northeastern University)
In this work, we propose a novel automatic makeup detector and remover framework. For the makeup detector, a locality-constrained low-rank dictionary learning algorithm is used to determine and locate the usage of cosmetics. For the challenging task of makeup removal, a locality-constrained coupled dictionary learning (LC-CDL) framework is proposed to synthesize the non-makeup face, so that the makeup can be erased according to its style. Moreover, we build a stepwise makeup dataset (SMU), which to the best of our knowledge is the first dataset with procedures of makeup. This technology has many practical applications, e.g., product recommendation for consumers, user-specified makeup tutorials, and security applications in makeup face verification. Finally, our system is evaluated on three existing makeup datasets (VMU, MIW, YMU) and one that we collected ourselves. Experimental results demonstrate the effectiveness of the DL-based method on makeup detection. The proposed LC-CDL shows very promising performance on makeup removal in terms of structural similarity. In addition, a comparison of face verification accuracy with and without makeup is presented, which illustrates an application of our automatic makeup remover system in the context of face verification with facial makeup.
Scaling-up Empirical Risk Minimization: Optimization of Incomplete U-statistics
Clémençon, Stéphan | Bellet, Aurélien | Colin, Igor
In a wide range of statistical learning problems such as ranking, clustering or metric learning among others, the risk is accurately estimated by $U$-statistics of degree $d\geq 1$, i.e. functionals of the training data with low variance that take the form of averages over $d$-tuples. From a computational perspective, the calculation of such statistics is highly expensive even for a moderate sample size $n$, as it requires averaging $O(n^d)$ terms. This makes learning procedures relying on the optimization of such data functionals hardly feasible in practice. It is the major goal of this paper to show that, strikingly, such empirical risks can be replaced by drastically computationally simpler Monte-Carlo estimates based on $O(n)$ terms only, usually referred to as incomplete $U$-statistics, without damaging the $O_{\mathbb{P}}(1/\sqrt{n})$ learning rate of Empirical Risk Minimization (ERM) procedures. For this purpose, we establish uniform deviation results describing the error made when approximating a $U$-process by its incomplete version under appropriate complexity assumptions. Extensions to model selection, fast rate situations and various sampling techniques are also considered, as well as an application to stochastic gradient descent for ERM. Finally, numerical examples are displayed in order to provide strong empirical evidence that the approach we promote largely surpasses more naive subsampling techniques.
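To make the complete/incomplete distinction concrete, here is a minimal sketch for degree $d = 2$. It is illustrative only: the kernel $h(x, y) = (x - y)^2 / 2$ (whose expectation is the variance), the sampling of pairs with replacement, and all function names are assumptions chosen for the example, not details taken from the paper.

```python
import itertools
import random


def kernel(x, y):
    """Degree-2 kernel whose expectation is the variance of the data."""
    return 0.5 * (x - y) ** 2


def complete_u_statistic(sample):
    """Average the kernel over all n(n-1)/2 unordered pairs: O(n^2) terms."""
    pairs = list(itertools.combinations(sample, 2))
    return sum(kernel(x, y) for x, y in pairs) / len(pairs)


def incomplete_u_statistic(sample, num_terms, rng):
    """Monte-Carlo estimate: average the kernel over num_terms random pairs."""
    n = len(sample)
    total = 0.0
    for _ in range(num_terms):
        i, j = rng.sample(range(n), 2)  # two distinct indices, drawn uniformly
        total += kernel(sample[i], sample[j])
    return total / num_terms


rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(500)]

full = complete_u_statistic(data)                # 124,750 kernel evaluations
cheap = incomplete_u_statistic(data, 500, rng)   # 500 kernel evaluations
print(full, cheap)
```

With $n = 500$, the complete statistic averages $n(n-1)/2 = 124{,}750$ terms while the incomplete one averages only $n = 500$, which is the $O(n^d)$ versus $O(n)$ gap the abstract refers to.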
Here's Facebook's vision for the future of AI
Facebook CEO Mark Zuckerberg's 2016 New Year's resolution is to create a virtual assistant for his home. Facebook is investing heavily in what many in the tech industry believe to be the next frontier of innovation, artificial intelligence. The largest social network on earth has a division of AI experts it calls FAIR. There's also a separate team called Applied Machine Learning, which focuses on "giving people communication superpowers through AI." Facebook clearly believes that AI is important to the company's future. Its employees are running 50x more AI experiments per day compared to last year.