Goto

Collaborating Authors

 Markov Models


Feature-Level Domain Adaptation

arXiv.org Machine Learning

Domain adaptation is the supervised learning setting in which the training and test data are sampled from different distributions: training data is sampled from a source domain, whilst test data is sampled from a target domain. This paper proposes and studies an approach, called feature-level domain adaptation (flda), that models the dependence between the two domains by means of a feature-level transfer model that is trained to describe the transfer from source to target domain. Subsequently, we train a domain-adapted classifier by minimizing the expected loss under the resulting transfer model. For linear classifiers and a large family of loss functions and transfer models, this expected loss can be comp uted or approximated analytically, and minimized efficiently. Our empirical evaluation of flda focuses on problems comprising binary and count data in which the transfer can be naturally modeled via a dropout distribution, which allows the classifier to adapt to differences in the marginal probability of features in the source and the target domain. Our experiments on several real-world problems show that flda performs on par with state-of-the-art domain-adaptation techniques. Keywords: Domain adaptation, transfer learning, sample selection bias, covariate shift, empirical risk minimization, dropout.


Machine Learning is dead โ€“ Long live machine learning!

#artificialintelligence

You may be thinking that this title makes no sense at all. ML, AI, ANN and Deep learning have made it into the everyday lexicon and here I am, proclaiming that ML is dead. The open sourcing of entire ML frameworks marks the end of a phase of rapid development of tools, and thus marks the death of ML as we have known it so far. The next phase will be marked with ubiquitous application of these tools into software applications. And that is how ML will live forever, because it will seamlessly and inextricably integrate into our lives. There has been a rapid democratization of data and tools in the past year.


Dual Formulations for Optimizing Dec-POMDP Controllers

AAAI Conferences

Decentralized POMDP is an expressive model for multi-agent planning. Finite-state controllers (FSCs)---often used to represent policies for infinite-horizon problems---offer a compact, simple-to-execute policy representation. We exploit novel connections between optimizing decentralized FSCs and the dual linear program for MDPs. Consequently, we describe a dual mixed integer linear program (MIP) for optimizing deterministic FSCs. We exploit the Dec-POMDP structure to devise a compact MIP and formulate constraints that result in policies executable in partially-observable decentralized settings. We show analytically that the dual formulation can also be exploited within the expectation maximization (EM) framework to optimize stochastic FSCs. The resulting EM algorithm can be implemented by solving a sequence of linear programs, without requiring expensive message-passing over the Dec-POMDP DBN. We also present an efficient technique for policy improvement based on a weighted entropy measure. Compared with state-of-the-art FSC methods, our approach offers over an order-of-magnitude speedup, while producing similar or better solutions.


Indefinite-Horizon Reachability in Goal-DEC-POMDPs

AAAI Conferences

DEC-POMDPs extend POMDPs to a multi-agent setting, where several agents operate in an uncertain environment independently to achieve a joint objective. DEC-POMDPs have been studied with finite-horizon and infinite-horizon discounted-sum objectives, and there exist solvers both for exact and approximate solutions. In this work we consider Goal-DEC-POMDPs, where given a set of target states, the objective is to ensure that the target set is reached with minimal cost.We consider the indefinite-horizon (infinite-horizon with either discounted-sum, or undiscounted-sum, where absorbing goal states have zero-cost) problem. We present a new and novel method to solve the problem that extends methods for finite-horizon DEC-POMDPs and the RTDP-Bel approach for POMDPs. We present experimental results on several examples, and show that our approach presents promising results.


A PAC RL Algorithm for Episodic POMDPs

arXiv.org Machine Learning

Many interesting real world domains involve reinforcement learning (RL) in partially observable environments. Efficient learning in such domains is important, but existing sample complexity bounds for partially observable RL are at least exponential in the episode length. We give, to our knowledge, the first partially observable RL algorithm with a polynomial bound on the number of episodes on which the algorithm may not achieve near-optimal performance. Our algorithm is suitable for an important class of episodic POMDPs. Our approach builds on recent advances in method of moments for latent variable model estimation.


Quantifying the probable approximation error of probabilistic inference programs

arXiv.org Machine Learning

This paper introduces a new technique for quantifying the approximation error of a broad class of probabilistic inference programs, including ones based on both variational and Monte Carlo approaches. The key idea is to derive a subjective bound on the symmetrized KL divergence between the distribution achieved by an approximate inference program and its true target distribution. The bound's validity (and subjectivity) rests on the accuracy of two auxiliary probabilistic programs: (i) a "reference" inference program that defines a gold standard of accuracy and (ii) a "meta-inference" program that answers the question "what internal random choices did the original approximate inference program probably make given that it produced a particular result?" The paper includes empirical results on inference problems drawn from linear regression, Dirichlet process mixture modeling, HMMs, and Bayesian networks. The experiments show that the technique is robust to the quality of the reference inference program and that it can detect implementation bugs that are not apparent from predictive performance.


Interacting with Machine Learning โ€“ Here is Why You Should Care

#artificialintelligence

For common readers or for experts, the topic of machine learning is one that more often than not brings up lengthy heated discussions, with eyes turning and heads shaking in disagreement. No wonder why... Mounds of private information are being collected by giant corporations, stored in private data silos, and exposed to us only through creepy and yet insightful automated recommendations and suggestions. Like it or not, machine learning has entered our lives boldly and is here to stay. In the voice of Siri, in our search engines, in systems that protect us from frauds and intrusions, in applications that understand our emotions, and the list goes on and onโ€ฆ These days, my phone auto completes almost all information about my new contacts and meetings. I can almost feel a growing discomfort with that thought and I know I'm not alone.


Budgeted Optimization with Constrained Experiments

Journal of Artificial Intelligence Research

Motivated by a real-world problem, we study a novel budgeted optimization problem where the goal is to optimize an unknown function f(.) given a budget by requesting a sequence of samples from the function. In our setting, however, evaluating the function at precisely specified points is not practically possible due to prohibitive costs. Instead, we can only request constrained experiments. A constrained experiment, denoted by Q, specifies a subset of the input space for the experimenter to sample the function from. The outcome of Q includes a sampled experiment x, and its function output f(x). Importantly, as the constraints of Q become looser, the cost of fulfilling the request decreases, but the uncertainty about the location x increases. Our goal is to manage this trade-off by selecting a set of constrained experiments that best optimize f(.) within the budget. We study this problem in two different settings, the non-sequential (or batch) setting where a set of constrained experiments is selected at once, and the sequential setting where experiments are selected one at a time. We evaluate our proposed methods for both settings using synthetic and real functions. The experimental results demonstrate the efficacy of the proposed methods.


A Neural Autoregressive Approach to Collaborative Filtering

arXiv.org Machine Learning

This paper proposes CF-NADE, a neural autoregressive architecture for collaborative filtering (CF) tasks, which is inspired by the Restricted Boltzmann Machine (RBM) based CF model and the Neural Autoregressive Distribution Estimator (NADE). We first describe the basic CF-NADE model for CF tasks. Then we propose to improve the model by sharing parameters between different ratings. A factored version of CF-NADE is also proposed for better scalability. Furthermore, we take the ordinal nature of the preferences into consideration and propose an ordinal cost to optimize CF-NADE, which shows superior performance. Finally, CF-NADE can be extended to a deep model, with only moderately increased computational complexity. Experimental results show that CF-NADE with a single hidden layer beats all previous state-of-the-art methods on MovieLens 1M, MovieLens 10M, and Netflix datasets, and adding more hidden layers can further improve the performance.


Reinforcement Learning of POMDPs using Spectral Methods

arXiv.org Artificial Intelligence

We propose a new reinforcement learning algorithm for partially observable Markov decision processes (POMDP) based on spectral decomposition methods. While spectral methods have been previously employed for consistent learning of (passive) latent variable models such as hidden Markov models, POMDPs are more challenging since the learner interacts with the environment and possibly changes the future observations in the process. We devise a learning algorithm running through episodes, in each episode we employ spectral techniques to learn the POMDP parameters from a trajectory generated by a fixed policy. At the end of the episode, an optimization oracle returns the optimal memoryless planning policy which maximizes the expected reward based on the estimated POMDP model. We prove an order-optimal regret bound with respect to the optimal memoryless policy and efficient scaling with respect to the dimensionality of observation and action spaces.