AITopics

Bayesian inference is an appealing approach for leveraging prior knowledge in reinforcement learning (RL). In this paper we describe an algorithm for discovering different classes of roles for agents via Bayesian inference. In particular, we develop a Bayesian policy search approach for Multi-Agent RL (MARL), which is model-free and allows for priors on policy parameters. We present a novel optimization algorithm based on hybrid MCMC, which leverages both the prior and gradient information estimated from trajectories. Our experiments in a complex real-time strategy game demonstrate the effective discovery of roles from supervised trajectories, the use of discovered roles for successful transfer to similar tasks, and the discovery of roles through reinforcement learning.

artificial intelligence, bayesian inference, machine learning, (16 more...)

Twenty-Fourth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Oregon (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Leisure & Entertainment > Games (0.86)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Bayesian Matrix Factorization with Side Information and Dirichlet Process Mixtures

Porteous, Ian (University of California Irvine) | Asuncion, Arthur (University of California Irvine) | Welling, Max (University of California Irvine)

Matrix factorization is a fundamental technique in machine learning that is applicable to collaborative filtering, information retrieval and many other areas. In collaborative filtering and many other tasks, the objective is to fill in missing elements of a sparse data matrix. One of the biggest challenges in this case is filling in a column or row of the matrix with very few observations. In this paper we introduce a Bayesian matrix factorization model that performs regression against side information known about the data in addition to the observations. The side information helps by adding observed entries to the factored matrices. We also introduce a nonparametric mixture model for the prior of the rows and columns of the factored matrices that gives a different regularization for each latent class. Besides providing a richer prior, the posterior distribution of mixture assignments reveals the latent classes. Using Gibbs sampling for inference, we apply our model to the Netflix Prize problem of predicting movie ratings given an incomplete user-movie ratings matrix. Incorporating rating information with gathered metadata information, our Bayesian approach outperforms other matrix factorization techniques even when using fewer dimensions.

artificial intelligence, information, machine learning, (16 more...)

Twenty-Fourth AAAI Conference on Artificial Intelligence

Country:

North America > United States > California > Orange County > Irvine (0.14)
South America > Paraguay > Asunción > Asunción (0.05)
North America > United States > New York > New York County > New York City (0.05)
(2 more...)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)
Media > Television (0.91)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Cost-Sensitive Semi-Supervised Support Vector Machine

Li, Yu-Feng (Nanjing University, China) | Kwok, James T. (Hong Kong University of Science and Technology) | Zhou, Zhi-Hua (Nanjing University, China)

In this paper, we study cost-sensitive semi-supervised learning where many of the training examples are unlabeled and different misclassification errors are associated with unequal costs. This scenario occurs in many real-world applications. For example, in some disease diagnosis, the cost of erroneously diagnosing a patient as healthy is much higher than that of diagnosing a healthy person as a patient. Also, the acquisition of labeled data requires medical diagnosis which is expensive, while the collection of unlabeled data such as basic health information is much cheaper. We propose the CS4VM (Cost-Sensitive Semi-Supervised Support Vector Machine) to address this problem. We show that the CS4VM, when given the label means of the unlabeled data, closely approximates the supervised cost-sensitive SVM that has access to the ground-truth labels of all the unlabeled data. This observation leads to an efficient algorithm which first estimates the label means and then trains the CS4VM with the plug-in label means by an efficient SVM solver. Experiments on a broad range of data sets show that the proposed method is capable of reducing the total cost and is computationally efficient.

artificial intelligence, machine learning, unlabeled data, (16 more...)

Twenty-Fourth AAAI Conference on Artificial Intelligence

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Industry: Health & Medicine > Diagnostic Medicine (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Lahiri, Mayank (University of Illinois at Chicago) | Cebrian, Manuel (Massachusetts Institute of Technology)

The Genetic Algorithm as a General Diffusion Model for Social Networks

Diffusion processes taking place in social networks are used to model a number of phenomena, such as the spread of human or computer viruses, and the adoption of products in viral marketing campaigns. It is generally difficult to obtain accurate information about how such spreads actually occur, so a variety of stochastic diffusion models are used to simulate spreading processes in networks instead. We show that a canonical genetic algorithm with a spatially distributed population, when paired with specific forms of Holland's synthetic hyperplane-defined objective functions, can simulate a large and rich class of diffusion models for social networks. These include standard diffusion models, such as the Independent Cascade and Competing Processes models. In addition, our Genetic Algorithm Diffusion Model (GADM) can also model complex phenomena such as information diffusion. We demonstrate an application of the GADM to modeling information flow in a large, dynamic social network derived from e-mail headers.

evolutionary algorithm, machine learning, vertex, (17 more...)

Twenty-Fourth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > Spain > Galicia > Madrid (0.04)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.92)

Reinforcement Learning Via Practice and Critique Advice

Judah, Kshitij (Oregon State University) | Roy, Saikat (Oregon State University) | Fern, Alan (Oregon State University) | Dietterich, Thomas G. (Oregon State University)

We consider the problem of incorporating end-user advice into reinforcement learning (RL). In our setting, the learner alternates between practicing, where learning is based on actual world experience, and end-user critique sessions where advice is gathered. During each critique session the end-user is allowed to analyze a trajectory of the current policy and then label an arbitrary subset of the available actions as good or bad. Our main contribution is an approach for integrating all of the information gathered during practice and critiques in order to effectively optimize a parametric policy. The approach optimizes a loss function that linearly combines losses measured against the world experience and the critique data. We evaluate our approach using a prototype system for teaching tactical battle behavior in a real-time strategy game engine. Results are given for a significant evaluation involving ten end-users showing the promise of this approach and also highlighting challenges involved in inserting end-users into the RL loop.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Twenty-Fourth AAAI Conference on Artificial Intelligence

Country: North America > United States > Oregon > Benton County > Corvallis (0.04)

Genre: Research Report > New Finding (0.94)

Industry: Leisure & Entertainment > Games > Computer Games (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Facial Age Estimation by Learning from Label Distributions

Geng, Xin (Monash University) | Smith-Miles, Kate (Monash University) | Zhou, Zhi-Hua (Nanjing University)

One of the main difficulties in facial age estimation is the lack of sufficient training data for many ages. Fortunately, the faces at close ages look similar since aging is a slow and smooth process. Inspired by this observation, in this paper, instead of considering each face image as an example with one label (age), we regard each face image as an example associated with a label distribution. The label distribution covers a number of class labels, representing the degree that each label describes the example. Through this way, in addition to the real age, one face image can also contribute to the learning of its adjacent ages. We propose an algorithm named IIS-LLD for learning from the label distributions, which is an iterative optimization process based on the maximum entropy model. Experimental results show the advantages of IIS-LLD over the traditional learning methods based on single-labeled data.

artificial intelligence, label distribution, machine learning, (16 more...)

Twenty-Fourth AAAI Conference on Artificial Intelligence

Country:

Asia > China > Jiangsu Province > Nanjing (0.05)
Oceania > Australia (0.04)
North America > United States > Massachusetts > Middlesex County > Malden (0.04)

Genre: Research Report (0.34)

Industry: Education (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Face Recognition (0.76)

Myopic Policies for Budgeted Optimization with Constrained Experiments

Motivated by a real-world problem, we study a novel budgeted optimization problem where the goal is to optimize an unknown function f ( x ) given a budget. In our setting, it is not practical to request samples of f ( x ) at precise input values due to the formidable cost of precise experimental setup. Rather, we may request a constrained experiment, which is a subset r of the input space for which the experimenter returns x in r and f ( x ). Importantly, as the constraints become looser, the experimental cost decreases, but the uncertainty about the location x of the next observation increases. Our goal is to manage this trade-off by selecting a sequence of constrained experiments to best optimize f within the budget. We introduce cost-sensitive policies for selecting constrained experiments using both model-free and model-based approaches, inspired by policies for unconstrained settings. Experiments on synthetic functions and functions derived from real-world experimental data indicate that our policies outperform random selection, that the model-based policies are superior to model-free ones, and give insights into which policies are preferable overall.

artificial intelligence, experiment, machine learning, (16 more...)

Twenty-Fourth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Oregon (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.93)

Industry: Energy > Renewable > Hydrogen (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)

Latent Variable Model for Learning in Pairwise Markov Networks

Amizadeh, Saeed (University of Pittsburgh) | Hauskrecht, Milos (University of Pittsburgh)

Pairwise Markov Networks (PMN) are an important class of Markov networks which, due to their simplicity, are widely used in many applications such as image analysis, bioinformatics, sensor networks, etc. However, learning of Markov networks from data is a challenging task; there are many possible structures one must consider and each of these structures comes with its own parameters making it easy to overfit the model with limited data. To deal with the problem, recent learning methods build upon the L1 regularization to express the bias towards sparse network structures. In this paper, we propose a new and more flexible framework that let us bias the structure, that can, for example, encode the preference to networks with certain local substructures which as a whole exhibit some special global structure. We experiment with and show the benefit of our framework on two types of problems: learning of modular networks and learning of traffic networks models.

artificial intelligence, energy function, machine learning, (14 more...)

Twenty-Fourth AAAI Conference on Artificial Intelligence

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Industry: Transportation (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Collaborative Filtering Meets Mobile Recommendation: A User-Centered Approach

Zheng, Vincent W. (Hong Kong University of Science and Technology) | Cao, Bin (Hong Kong University of Science and Technology) | Zheng, Yu (Microsoft Research Asia) | Xie, Xing (Microsoft Research Asia) | Yang, Qiang (Hong Kong University of Science and Technology)

With the increasing popularity of location tracking services such as GPS, more and more mobile data are being accumulated. Based on such data, a potentially useful service is to make timely and targeted recommendations for users on places where they might be interested to go and activities that they are likely to conduct. For example, a user arriving in Beijing might wonder where to visit and what she can do around the Forbidden City. A key challenge for such recommendation problems is that the data we have on each individual user might be very limited, while to make useful and accurate recommendations, we need extensive annotated location and activity information from user trace data. In this paper, we present a new approach, known as user-centered collaborative location and activity filtering (UCLAF), to pull many users’ data together and apply collaborative filtering to find like-minded users and like-patterned activities at different locations. We model the userlocation- activity relations with a tensor representation, and propose a regularized tensor and matrix decomposition solution which can better address the sparse data problem in mobile information retrieval. We empirically evaluate UCLAF using a real-world GPS dataset collected from 164 users over 2.5 years, and showed that our system can outperform several state-of-the-art solutions to the problem.

artificial intelligence, machine learning, recommendation, (17 more...)

Twenty-Fourth AAAI Conference on Artificial Intelligence

Country:

Asia > China > Beijing > Beijing (0.25)
North America > United States > New York > New York County > New York City (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.48)

Industry: Information Technology > Security & Privacy (0.48)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Coalition Structure Generation based on Distributed Constraint Optimization

Forming effective coalitions is a major research challenge in AI and multi-agent systems (MAS). Coalition Structure generation (CSG) involves partitioning a set of agents into coalitions so that social surplus (the sum of the rewards of all coalitions) is maximized. A partition is called a Coalition Structure (CS). In traditional works, the value of a coalition is given by a black box function called a characteristic function. In this paper, we propose a novel formalization of CSG, i.e., we assume the value of a characteristic function is given by an optimal solution of a distributed constraint optimization problem (DCOP) among the agents of a coalition. A DCOP is a popular approach for modeling cooperative agents, since it is quite general and can formalize various application problems in MAS. At first glance, one might assume that the computational costs required in this approach would be too expensive, since we need to solve an NP-hard problem just to obtain the value of a single coalition. To optimally solve a CSG, we might need to solve n-th power of 2 DCOP problem instances, where n is the number of agents. However, quite surprisingly, we show that an approximation algorithm, whose computational cost is about the same as solving just one DCOP, can find a CS with quality guarantees. More specifically, we develop an algorithm with parameter k that can find a CS whose social surplus is at least max(k/(w*+1), 2k/n) of the optimal CS, where w* is the tree width of a constraint graph. When k=1, the complexity of this algorithm is about the same as solving just one DCOP. These results illustrate that the locality of interactions among agents, which is explicitly modeled in the DCOP formalization, is quite useful in developing an efficient CSG algorithm with quality guarantees.

algorithm, artificial intelligence, coalition, (16 more...)

Twenty-Fourth AAAI Conference on Artificial Intelligence

Country: Asia > Japan > Kyūshū & Okinawa > Kyūshū (0.04)

Industry: Transportation (0.48)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)