AITopics

We propose a probabilistic model for learning future classifiers for tasks in which the decision boundary changes nonlinearly over time. In certain applications, such as spam-mail classification, the decision boundary changes dynamically over time. Accordingly, classifier performance deteriorates quickly unless the classifiers are updated using additional data. However, collecting such data can be expensive or impossible. The proposed model alleviates this deterioration without additional data by modeling the nonlinear dynamics of the decision boundary using Gaussian processes. We also develop a learning algorithm for the model, based on empirical variational Bayesian inference, that incorporates the uncertainty of the dynamics into future classification. The effectiveness of the proposed method was demonstrated through experiments using synthetic and real-world data sets.

artificial intelligence, classifier, machine learning, (20 more...)

Thirty-First AAAI Conference on Artificial Intelligence

Country: Asia > Japan > Honshū (0.28)

Genre: Research Report > New Finding (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)

Continuous Conditional Dependency Network for Structured Regression

Han, Chao (Temple University) | Ghalwash, Mohamed (IBM T.J. Watson and Temple University) | Obradovic, Zoran (Temple University)

Structured regression on graphs aims to predict response variables from multiple nodes by discovering and exploiting the dependency structure among response variables. This problem is challenging since dependencies among response variables are always unknown, and the associated prior knowledge is non-symmetric. In previous studies, various promising solutions were proposed to improve structured regression by utilizing symmetric prior knowledge, learning sparse dependency structure among response variables, or learning representations of attributes of multiple nodes. However, none of them is capable of efficiently learning dependency structure while incorporating non-symmetric prior knowledge. To achieve these objectives, we propose the Continuous Conditional Dependency Network (CCDN) for structured regression. The intuitive idea behind this model is that each response variable depends not only on attributes from the same node, but also on response variables from all other nodes. This results in a joint modeling of local conditional probabilities. The parameter learning is formulated as a convex optimization problem, and an effective sampling algorithm is proposed for inference. CCDN is flexible in absorbing non-symmetric prior knowledge. The performance of CCDN on multiple datasets provides evidence of its structure recovery ability and its superior effectiveness and efficiency compared to state-of-the-art alternatives.

artificial intelligence, knowledge, machine learning, (17 more...)

Thirty-First AAAI Conference on Artificial Intelligence

Country: North America > United States (1.00)

Genre: Research Report > Promising Solution (0.34)

Industry:

Health & Medicine (0.93)
Energy (0.68)
Government > Military (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.89)

A Nearly-Black-Box Online Algorithm for Joint Parameter and State Estimation in Temporal Models

Erol, Yusuf Bugra (University of California, Berkeley) | Wu, Yi (University of California, Berkeley) | Li, Lei (Toutiao Lab) | Russell, Stuart (University of California, Berkeley)

Online joint parameter and state estimation is a core problem for temporal models. Most existing methods are either restricted to a particular class of models (e.g., the Storvik filter) or computationally expensive (e.g., particle MCMC). We propose a novel nearly-black-box algorithm, the Assumed Parameter Filter (APF), a hybrid of particle filtering for state variables and assumed density filtering for parameter variables. It has the following advantages: (a) it is online and computationally efficient; (b) it is applicable to both discrete and continuous parameter spaces with arbitrary transition dynamics. On a variety of toy and real models, APF generates more accurate results within a fixed computation budget compared to several standard algorithms from the literature.
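For readers unfamiliar with the particle-filtering half of this hybrid, here is a minimal sketch of a standard bootstrap particle filter for state estimation only. It is not the APF from the abstract: the model parameters are assumed known, and the 1-D random-walk model, the noise levels, the initial particle spread, and the function name `bootstrap_particle_filter` are all illustrative assumptions.

```python
import math
import random


def bootstrap_particle_filter(observations, n_particles=500,
                              process_std=0.5, obs_std=1.0, seed=0):
    """Filtered state means for x_t = x_{t-1} + noise, y_t = x_t + noise.

    All model choices here are assumptions made for illustration.
    """
    rng = random.Random(seed)
    # Initial particles drawn from a broad Gaussian prior (assumed).
    particles = [rng.gauss(0.0, 3.0) for _ in range(n_particles)]
    estimates = []
    for y in observations:
        # 1. Propagate each particle through the transition model.
        particles = [x + rng.gauss(0.0, process_std) for x in particles]
        # 2. Weight each particle by the Gaussian observation likelihood.
        weights = [math.exp(-0.5 * ((y - x) / obs_std) ** 2) for x in particles]
        total = sum(weights)
        if total == 0.0:
            # All weights underflowed: fall back to uniform weights.
            weights = [1.0] * n_particles
            total = float(n_particles)
        weights = [w / total for w in weights]
        # 3. Record the weighted mean as the filtered state estimate.
        estimates.append(sum(w * x for w, x in zip(weights, particles)))
        # 4. Multinomial resampling to counter weight degeneracy.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


# Usage: a constant observation sequence should pull the estimate toward 5.
estimates = bootstrap_particle_filter([5.0] * 30)
```

Joint estimation, as described in the abstract, would additionally maintain an approximate posterior over the parameters; that part is deliberately left out of this sketch.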

artificial intelligence, bayesian inference, machine learning, (19 more...)

Thirty-First AAAI Conference on Artificial Intelligence

Country: North America > United States (0.28)

Industry:

Transportation > Air (0.62)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Chen, Gang (State University of New York at Buffalo)

Latent Discriminant Analysis with Representative Feature Discovery

Linear Discriminant Analysis (LDA) is a well-known method for dimension reduction and classification with a focus on discriminative feature selection. However, how to discover features that are both discriminative and representative in the LDA model has not been explored. In this paper, we propose a latent Fisher discriminant model with representative feature discovery in a semi-supervised manner. Specifically, our model leverages the advantages of both discriminative and generative models by generalizing LDA with a data-driven prior over the latent variables. Thus, our method combines multi-class classification, latent variables, and dimension reduction in a unified Bayesian framework. We test our method on the MUSK and Corel datasets and obtain competitive results compared to baselines. We also demonstrate its capacity on the challenging TRECVID MED11 dataset for semantic keyframe extraction and conduct a human-factors ranking-based experimental evaluation, which demonstrates that our proposed method consistently extracts more semantically meaningful keyframes than challenging baselines.

artificial intelligence, bayesian inference, machine learning, (18 more...)

Thirty-First AAAI Conference on Artificial Intelligence

Country: North America > United States (0.68)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Learning Visual Sentiment Distributions via Augmented Conditional Probability Neural Network

Yang, Jufeng (Nankai University) | Sun, Ming (Nankai University) | Sun, Xiaoxiao (Nankai University)

Visual sentiment analysis is attracting increasing attention with the growing tendency to express emotions through images. While most existing works assign a single dominant emotion to each image, we address sentiment ambiguity through label distribution learning (LDL), motivated by the fact that an image usually evokes multiple emotions. Two new algorithms are developed based on the conditional probability neural network (CPNN). First, we propose BCPNN, which encodes the image label into a binary representation to replace the signless integers used in CPNN and employs it as part of the input to the neural network. Then, we train our ACPNN model by adding noise to the ground-truth labels and augmenting the affective distributions. Since current datasets are mostly annotated for single-label learning, we build two new datasets, one relabeled from the popular Flickr dataset and the other collected from Twitter. These datasets contain 20,745 images with multiple affective labels, over ten times more than existing ones. Experimental results show that the proposed methods outperform state-of-the-art works on our large-scale datasets and on other publicly available benchmarks.

artificial intelligence, bayesian inference, machine learning, (20 more...)

Thirty-First AAAI Conference on Artificial Intelligence

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology > Services (0.51)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.61)

Germain, Pascal, Bach, Francis, Lacoste, Alexandre, Lacoste-Julien, Simon

PAC-Bayesian Theory Meets Bayesian Inference

arXiv.org Machine Learning Feb-13-2017

We exhibit a strong link between frequentist PAC-Bayesian risk bounds and the Bayesian marginal likelihood. That is, for the negative log-likelihood loss function, we show that the minimization of PAC-Bayesian generalization risk bounds maximizes the Bayesian marginal likelihood. This provides an alternative explanation to the Bayesian Occam's razor criteria, under the assumption that the data is generated by an i.i.d. distribution. Moreover, as the negative log-likelihood is an unbounded loss function, we motivate and propose a PAC-Bayesian theorem tailored for the sub-gamma loss family, and we show that our approach is sound on classical Bayesian linear regression tasks.
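The link stated in this abstract can be illustrated with a standard variational identity. The notation below is assumed for illustration and is not taken from the paper: $\pi$ is a prior over parameters $\theta$, $\rho$ is any posterior, $p(x_i \mid \theta)$ is the likelihood, and $\hat{L}(\theta)$ is the empirical negative log-likelihood loss on $n$ samples.

```latex
\hat{L}(\theta) = -\frac{1}{n}\sum_{i=1}^{n} \ln p(x_i \mid \theta),
\qquad
Z = \int \pi(\theta) \prod_{i=1}^{n} p(x_i \mid \theta)\, d\theta

% Gibbs variational principle: for every posterior \rho,
n\,\mathbb{E}_{\theta\sim\rho}\,\hat{L}(\theta) + \mathrm{KL}(\rho \,\|\, \pi)
\;\ge\;
-\ln \mathbb{E}_{\theta\sim\pi}\, e^{-n\hat{L}(\theta)}
\;=\; -\ln Z,

% with equality exactly when \rho is the Bayesian posterior:
\rho^{*}(\theta) = \frac{\pi(\theta)\prod_{i=1}^{n} p(x_i \mid \theta)}{Z}.
```

The left-hand side is the kind of "empirical loss plus KL divergence" trade-off that appears in PAC-Bayesian bounds, so minimizing it over $\rho$ yields the Bayesian posterior, and the attained minimum is the negative log marginal likelihood. The exact form of the bounds, and their extension to unbounded losses, is given in the paper itself.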

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

1605.08636

Country: Europe (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Ganzfried, Sam, Sun, Qingyun

Bayesian Opponent Exploitation in Imperfect-Information Games

arXiv.org Artificial Intelligence Feb-13-2017

Two fundamental problems in computational game theory are computing a Nash equilibrium and learning to exploit opponents given observations of their play (opponent exploitation). The latter is perhaps even more important than the former: Nash equilibrium does not have a compelling theoretical justification in game classes other than two-player zero-sum, and for all games one can potentially do better by exploiting perceived weaknesses of the opponent than by following a static equilibrium strategy throughout the match. The natural setting for opponent exploitation is the Bayesian setting, where we have a prior model that is integrated with observations to create a posterior opponent model that we respond to. The most natural, and a well-studied, prior distribution is the Dirichlet distribution. An exact polynomial-time algorithm is known for best-responding to the posterior distribution for an opponent assuming a Dirichlet prior with multinomial sampling in normal-form games; however, for imperfect-information games the best known algorithm is based on approximating an infinite integral without theoretical guarantees. We present the first exact algorithm for a natural class of imperfect-information games. We demonstrate that our algorithm runs quickly in practice and outperforms the best prior approaches. We also present an algorithm for the uniform prior setting.
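The normal-form case mentioned above can be sketched in a few lines. This is only the simple known case, not the paper's algorithm for imperfect-information games. With a Dirichlet prior and multinomial observations, the posterior is again Dirichlet with the observed counts added to the prior parameters. Because expected payoff is linear in the opponent's mixed strategy, best-responding to the posterior mean maximizes posterior expected payoff. The rock-paper-scissors payoffs, the counts, and the function names are illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def dirichlet_posterior_mean(alpha: Sequence[float],
                             counts: Sequence[int]) -> List[float]:
    """Mean of Dirichlet(alpha + counts), the posterior over opponent strategies."""
    posterior = [a + c for a, c in zip(alpha, counts)]
    total = sum(posterior)
    return [p / total for p in posterior]


def best_response(payoff: Sequence[Sequence[float]],
                  opponent_strategy: Sequence[float]) -> Tuple[int, float]:
    """Return (action index, expected payoff) of a pure best response.

    payoff[i][j] is our payoff when we play i and the opponent plays j.
    """
    expected = [sum(u * q for u, q in zip(row, opponent_strategy))
                for row in payoff]
    best = max(range(len(expected)), key=lambda i: expected[i])
    return best, expected[best]


# Rock-paper-scissors from our point of view (rows: our R, P, S).
RPS_PAYOFF = [[0, -1, 1],
              [1, 0, -1],
              [-1, 1, 0]]

prior = [1.0, 1.0, 1.0]   # uniform Dirichlet prior (assumed)
observed = [7, 2, 1]      # hypothetical counts of opponent R, P, S

mean_strategy = dirichlet_posterior_mean(prior, observed)
action, value = best_response(RPS_PAYOFF, mean_strategy)
print(action, round(value, 4))  # prints "1 0.4615": play paper
```

Here the posterior mean is (8/13, 3/13, 2/13), so paper earns 6/13 in expectation against an opponent who has mostly played rock.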

artificial intelligence, machine learning, opponent, (19 more...)

arXiv.org Artificial Intelligence

1603.03491

Country: North America (0.28)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

#artificialintelligence Feb-12-2017, 22:20:25 GMT

[Event Series] Machine Learning Course

Our learning task: Given a training set S = {(x1, y1), (x2, y2), . . . The simplest case, y ∈ {-1, 1}, is called a binary classification problem. If y is a real number, it becomes a regression problem. In the more general case, y can be a vector in which each element is drawn from a finite set.

artificial intelligence, machine learning, optimization support vector machine evaluation, (13 more...)

#artificialintelligence

Country:

North America > United States (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Taiwan > Taiwan Province > Taipei (0.04)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.96)
(3 more...)

#artificialintelligence Feb-10-2017, 12:05:16 GMT

Introduction to Machine Learning

The goal of machine learning is to program computers to use example data or past experience to solve a given problem. Many successful applications of machine learning exist already, including systems that analyze past sales data to predict customer behavior, optimize robot behavior so that a task can be completed using minimum resources, and extract knowledge from bioinformatics data. Introduction to Machine Learning is a comprehensive textbook on the subject, covering a broad array of topics not usually included in introductory machine learning texts. Subjects include supervised learning; Bayesian decision theory; parametric, semi-parametric, and nonparametric methods; multivariate analysis; hidden Markov models; reinforcement learning; kernel machines; graphical models; Bayesian estimation; and statistical testing. Machine learning is rapidly becoming a skill that computer science students must master before graduation.

artificial intelligence, bayesian inference, machine learning, (5 more...)

#artificialintelligence

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.40)

Industry: Education (0.42)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.99)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.79)

arXiv.org Artificial Intelligence Feb-10-2017

Multigrid with rough coefficients and Multiresolution operator decomposition from Hierarchical Information Games

Owhadi, Houman

We introduce a near-linear complexity (geometric and meshless/algebraic) multigrid/multiresolution method for PDEs with rough ($L^\infty$) coefficients with rigorous a-priori accuracy and performance estimates. The method is discovered through a decision/game theory formulation of the problems of (1) identifying restriction and interpolation operators, (2) recovering a signal from incomplete measurements based on norm constraints on its image under a linear operator, and (3) gambling on the value of the solution of the PDE based on a hierarchy of nested measurements of its solution or source term. The resulting elementary gambles form a hierarchy of (deterministic) basis functions of $H^1_0(\Omega)$ (gamblets) that (1) are orthogonal across subscales/subbands with respect to the scalar product induced by the energy norm of the PDE, (2) enable sparse compression of the solution space in $H^1_0(\Omega)$, and (3) induce an orthogonal multiresolution operator decomposition. The operating diagram of the multigrid method is that of an inverted pyramid in which gamblets are computed locally (by virtue of their exponential decay) and hierarchically (from fine to coarse scales), and the PDE is decomposed into a hierarchy of independent linear systems with uniformly bounded condition numbers. The resulting algorithm is parallelizable both in space (via localization) and in bandwidth/subscale (subscales can be computed independently from each other). Although the method is deterministic, it has a natural Bayesian interpretation under the measure of probability emerging (as a mixed strategy) from the information game formulation, and multiresolution approximations form a martingale with respect to the filtration induced by the hierarchy of nested measurements.

game theory, matrix, upstream oil & gas, (22 more...)

arXiv.org Artificial Intelligence

1503.03467

Country:

Europe > Netherlands (0.27)
North America > United States > California (0.27)
North America > United States > Pennsylvania (0.14)
(3 more...)

Genre: Research Report (0.40)

Industry:

Energy > Oil & Gas > Upstream (1.00)
Government (0.67)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Mathematics of Computing (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.92)
(2 more...)