AITopics | Education

Collaborating Authors

Education

Online learning in MDPs with side information

arXiv.org Machine LearningJun-26-2014

We study online learning of finite Markov decision process (MDP) problems when a side information vector is available. The problem is motivated by applications such as clinical trials, recommendation systems, etc. Such applications have an episodic structure, where each episode corresponds to a patient/customer. Our objective is to compete with the optimal dynamic policy that can take side information into account. We propose a computationally efficient algorithm and show that its regret is at most $O(\sqrt{T})$, where $T$ is the number of rounds. To best of our knowledge, this is the first regret bound for this setting.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Machine Learning

1406.6812

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.34)

Industry:

Education > Educational Setting > Online (0.62)
Health & Medicine (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.36)

Add feedback

Reinforcement and Imitation Learning via Interactive No-Regret Learning

Ross, Stephane, Bagnell, J. Andrew

arXiv.org Machine LearningJun-23-2014

Recent work has demonstrated that problems-- particularly imitation learning and structured prediction-- where a learner's predictions influence the input-distribution it is tested on can be naturally addressed by an interactive approach and analyzed using no-regret online learning. These approaches to imitation learning, however, neither require nor benefit from information about the cost of actions. We extend existing results in two directions: first, we develop an interactive imitation learning approach that leverages cost information; second, we extend the technique to address reinforcement learning. The results provide theoretical support to the commonly observed successes of online approximate policy iteration. Our approach suggests a broad new family of algorithms and provides a unifying view of existing techniques for imitation and reinforcement learning.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Machine Learning

1406.5979

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Industry: Education (0.36)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.46)

Add feedback

Structured Generative Models of Natural Source Code

Maddison, Chris J., Tarlow, Daniel

arXiv.org Machine LearningJun-20-2014

We study the problem of building generative models of natural source code (NSC); that is, source code written and understood by humans. Our primary contribution is to describe a family of generative models for NSC that have three key properties: First, they incorporate both sequential and hierarchical structure. Second, we learn a distributed representation of source code elements. Finally, they integrate closely with a compiler, which allows leveraging compiler logic and abstractions when building structure into the model. We also develop an extension that includes more complex structure, refining how the model generates identifier tokens based on what variables are currently in scope. Our models can be learned efficiently, and we show empirically that including appropriate structure greatly improves the models, measured by the probability of generating test programs.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Machine Learning

1401.0514

Country:

North America (0.46)
Asia (0.28)

Genre: Research Report (0.40)

Industry: Education > Educational Setting > Online (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.93)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.83)
(2 more...)

Add feedback

HC-Search: A Learning Framework for Search-based Structured Prediction

Doppa, J.R., Fern, A., Tadepalli, P.

Journal of Artificial Intelligence ResearchJun-19-2014

Structured prediction is the problem of learning a function that maps structured inputs to structured outputs. Prototypical examples of structured prediction include part-of-speech tagging and semantic segmentation of images. Inspired by the recent successes of search-based structured prediction, we introduce a new framework for structured prediction called HC-Search. Given a structured input, the framework uses a search procedure guided by a learned heuristic H to uncover high quality candidate outputs and then employs a separate learned cost function C to select a final prediction among those outputs. The overall loss of this prediction architecture decomposes into the loss due to H not leading to high quality outputs, and the loss due to C not selecting the best among the generated outputs. Guided by this decomposition, we minimize the overall loss in a greedy stage-wise manner by first training H to quickly uncover high quality outputs via imitation learning, and then training C to correctly rank the outputs generated via H according to their true losses. Importantly, this training procedure is sensitive to the particular loss function of interest and the time-bound allowed for predictions. Experiments on several benchmark domains show that our approach significantly outperforms several state-of-the-art methods.

cost function, hc-search, search space, (16 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.4212

AI Access Foundation

10887

Journal of Artificial Intelligence Research

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > Oregon > Benton County > Corvallis (0.04)

Genre: Research Report > New Finding (0.93)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Freeze-Thaw Bayesian Optimization

Swersky, Kevin, Snoek, Jasper, Adams, Ryan Prescott

arXiv.org Machine LearningJun-15-2014

In this paper we develop a dynamic form of Bayesian optimization for machine learning models with the goal of rapidly finding good hyperparameter settings. Our method uses the partial information gained during the training of a machine learning model in order to decide whether to pause training and start a new model, or resume the training of a previously-considered model. We specifically tailor our method to machine learning problems by developing a novel positive-definite covariance kernel to capture a variety of training curves. Furthermore, we develop a Gaussian process prior that scales gracefully with additional temporal observations. Finally, we provide an information-theoretic framework to automate the decision process. Experiments on several common machine learning models show that our approach is extremely effective in practice.

artificial intelligence, machine learning, optimization, (16 more...)

arXiv.org Machine Learning

1406.3896

Country:

North America > Canada > Alberta (0.28)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (1.00)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Add feedback

Simultaneous Model Selection and Optimization through Parameter-free Stochastic Learning

Orabona, Francesco

arXiv.org Machine LearningJun-15-2014

Stochastic gradient descent algorithms for training linear and kernel predictors are gaining more and more importance, thanks to their scalability. While various methods have been proposed to speed up their convergence, the model selection phase is often ignored. In fact, in theoretical works most of the time assumptions are made, for example, on the prior knowledge of the norm of the optimal solution, while in the practical world validation methods remain the only viable approach. In this paper, we propose a new kernel-based stochastic gradient descent algorithm that performs model selection while training, with no parameters to tune, nor any form of cross-validation. The algorithm builds on recent advancement in online learning theory for unconstrained settings, to estimate over time the right regularization in a data-dependent way. Optimal rates of convergence are proved under standard smoothness assumptions on the target function, using the range space of the fractional integral operator associated with the kernel.

algorithm, artificial intelligence, machine learning, (19 more...)

arXiv.org Machine Learning

1406.3816

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Industry: Education > Educational Setting > Online (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.95)

Add feedback

Layered Logic Classifiers: Exploring the `And' and `Or' Relations

Tu, Zhuowen, Dollar, Piotr, Wu, Yingnian

arXiv.org Machine LearningMay-27-2014

Designing effective and efficient classifier for pattern analysis is a key problem in machine learning and computer vision. Many the solutions to the problem require to perform logic operations such as `and', `or', and `not'. Classification and regression tree (CART) include these operations explicitly. Other methods such as neural networks, SVM, and boosting learn/compute a weighted sum on features (weak classifiers), which weakly perform the 'and' and 'or' operations. However, it is hard for these classifiers to deal with the 'xor' pattern directly. In this paper, we propose layered logic classifiers for patterns of complicated distributions by combining the `and', `or', and `not' operations. The proposed algorithm is very general and easy to implement. We test the classifiers on several typical datasets from the Irvine repository and two challenging vision applications, object segmentation and pedestrian detection. We observe significant improvements on all the datasets over the widely used decision stump based AdaBoost algorithm. The resulting classifiers have much less training complexity than decision tree based AdaBoost, and can be applied in a wide range of domains.

artificial intelligence, classifier, machine learning, (16 more...)

arXiv.org Machine Learning

1405.6804

Genre: Research Report (0.50)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback

Lectures on Jacques Herbrand as a Logician

Wirth, Claus-Peter, Siekmann, Joerg, Benzmueller, Christoph, Autexier, Serge

arXiv.org Artificial IntelligenceMay-27-2014

We give some lectures on the work on formal logic of Jacques Herbrand, and sketch his life and his influence on automated theorem proving. The intended audience ranges from students interested in logic over historians to logicians. Besides the well-known correction of Herbrand's False Lemma by Goedel and Dreben, we also present the hardly known unpublished correction of Heijenoort and its consequences on Herbrand's Modus Ponens Elimination. Besides Herbrand's Fundamental Theorem and its relation to the Loewenheim-Skolem-Theorem, we carefully investigate Herbrand's notion of intuitionism in connection with his notion of falsehood in an infinite domain. We sketch Herbrand's two proofs of the consistency of arithmetic and his notion of a recursive function, and last but not least, present the correct original text of his unification algorithm with a new translation.

artificial intelligence, herbrand, logic & formal reasoning, (18 more...)

arXiv.org Artificial Intelligence

0902.4682

Country:

Europe > Germany (1.00)
North America > United States > California (0.27)
North America > United States > Massachusetts (0.27)

Genre:

Research Report (0.81)
Summary/Review (0.67)
Instructional Material > Course Syllabus & Notes (0.46)

Industry: Education (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)

Add feedback

Fast and Robust Archetypal Analysis for Representation Learning

Chen, Yuansi, Mairal, Julien, Harchaoui, Zaid

arXiv.org Machine LearningMay-26-2014

We revisit a pioneer unsupervised learning technique called archetypal analysis, which is related to successful data analysis methods such as sparse coding and non-negative matrix factorization. Since it was proposed, archetypal analysis did not gain a lot of popularity even though it produces more interpretable models than other alternatives. Because no efficient implementation has ever been made publicly available, its application to important scientific problems may have been severely limited. Our goal is to bring back into favour archetypal analysis. We propose a fast optimization scheme using an active-set strategy, and provide an efficient open-source implementation interfaced with Matlab, R, and Python. Then, we demonstrate the usefulness of archetypal analysis for computer vision tasks, such as codebook learning, signal classification, and large image collection visualization.

archetypal analysis, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

1405.6472

Country: North America > United States (0.47)

Genre: Research Report (0.65)

Industry:

Health & Medicine (0.46)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

New Algorithms for Learning Incoherent and Overcomplete Dictionaries

Arora, Sanjeev, Ge, Rong, Moitra, Ankur

arXiv.org Machine LearningMay-26-2014

In sparse recovery we are given a matrix $A$ (the dictionary) and a vector of the form $A X$ where $X$ is sparse, and the goal is to recover $X$. This is a central notion in signal processing, statistics and machine learning. But in applications such as sparse coding, edge detection, compression and super resolution, the dictionary $A$ is unknown and has to be learned from random examples of the form $Y = AX$ where $X$ is drawn from an appropriate distribution --- this is the dictionary learning problem. In most settings, $A$ is overcomplete: it has more columns than rows. This paper presents a polynomial-time algorithm for learning overcomplete dictionaries; the only previously known algorithm with provable guarantees is the recent work of Spielman, Wang and Wright who gave an algorithm for the full-rank case, which is rarely the case in applications. Our algorithm applies to incoherent dictionaries which have been a central object of study since they were introduced in seminal work of Donoho and Huo. In particular, a dictionary is $\mu$-incoherent if each pair of columns has inner product at most $\mu / \sqrt{n}$. The algorithm makes natural stochastic assumptions about the unknown sparse vector $X$, which can contain $k \leq c \min(\sqrt{n}/\mu \log n, m^{1/2 -\eta})$ non-zero entries (for any $\eta > 0$). This is close to the best $k$ allowable by the best sparse recovery algorithms even if one knows the dictionary $A$ exactly. Moreover, both the running time and sample complexity depend on $\log 1/\epsilon$, where $\epsilon$ is the target accuracy, and so our algorithms converge very quickly to the true dictionary. Our algorithm can also tolerate substantial amounts of noise provided it is incoherent with respect to the dictionary (e.g., Gaussian). In the noisy setting, our running time and sample complexity depend polynomially on $1/\epsilon$, and this is necessary.

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Machine Learning

1308.6273

Genre:

Research Report (0.50)
Workflow (0.46)

Industry: Education (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback