AITopics

Inspired by the Bayesian brain hypothesis and deep learning, we develop a Bayesian autoencoder, a method of constructing recognition systems using a Bayesian network. We construct hierarchical Bayesian networks based on feature extraction and implement pooling to achieve invariance within a Bayesian network framework. The constructed networks propagate information bidirectionally between layers. We expect they will be able to achieve brain-like recognition using local features and global information such as their environments.

artificial intelligence, bae net, machine learning, (15 more...)

Thirty-Second AAAI Conference on Artificial Intelligence

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.16)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Semi-Supervised Bayesian Attribute Learning for Person Re-Identification

Liu, Wenhe (University of Technology, Sydney) | Chang, Xiaojun (Carnegie Mellon University) | Chen, Ling (University of Technology, Sydney) | Yang, Yi (University of Technology, Sydney)

Person re-identification (re-ID) tasks aim to identify the same person in multiple images captured from non-overlapping camera views. Most previous re-ID studies have attempted to solve this problem through either representation learning or metric learning, or by combining both techniques. Representation learning relies on the latent factors or attributes of the data. In most of these works, the dimensionality of the factors/attributes has to be manually determined for each new dataset. Thus, this approach is not robust. Metric learning optimizes a metric across the dataset to measure similarity according to distance. However, choosing the optimal method for computing these distances is data dependent, and learning the appropriate metric relies on a sufficient number of pair-wise labels. To overcome these limitations, we propose a novel algorithm for person re-ID, called semi-supervised Bayesian attribute learning. We introduce an Indian Buffet Process to identify the priors of the latent attributes. The dimensionality of attributes factors is then automatically determined by nonparametric Bayesian learning. Meanwhile, unlike traditional distance metric learning, we propose a re-identification probability distribution to describe how likely it is that a pair of images contains the same person. This technique relies solely on the latent attributes of both images. Moreover, pair-wise labels that are not known can be estimated from pair-wise labels that are known, making this a robust approach for semi-supervised learning. Extensive experiments demonstrate the superior performance of our algorithm over several state-of-the-art algorithms on small-scale datasets and comparable performance on large-scale re-ID datasets.

artificial intelligence, bayesian inference, machine learning, (19 more...)

Thirty-Second AAAI Conference on Artificial Intelligence

Country: North America > United States (0.28)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Asymmetric Joint Learning for Heterogeneous Face Recognition

Cao, Bing (Xidian University) | Wang, Nannan (Xidian University) | Gao, Xinbo (Xidian University) | Li, Jie (Xidian University)

Heterogeneous face recognition (HFR) refers to matching a probe face image taken from one modality to face images acquired from another modality. It plays an important role in security scenarios. However, HFR is still a challenging problem due to great discrepancies between cross-modality images. This paper proposes an asymmetric joint learning (AJL) approach to handle this issue. The proposed method transforms the cross-modality differences mutually by incorporating the synthesized images into the learning process which provides more discriminative information. Although the aggregated data would augment the scale of intra-classes, it also reduces the diversity (i.e. discriminative information) for inter-classes. Then, we develop the AJL model to balance this dilemma. Finally, we could obtain the similarity score between two heterogeneous face images through the log-likelihood ratio. Extensive experiments on viewed sketch database, forensic sketch database and near infrared image database illustrate that the proposed AJL-HFR method achieve superior performance in comparison to state-of-the-art methods.

artificial intelligence, face image, machine learning, (16 more...)

Thirty-Second AAAI Conference on Artificial Intelligence

Country: Asia > China (0.29)

Industry: Information Technology > Security & Privacy (0.36)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Towards Training Probabilistic Topic Models on Neuromorphic Multi-Chip Systems

Xiao, Zihao (Tsinghua University) | Chen, Jianfei (Tsinghua University) | Zhu, Jun (Tsinghua University)

Probabilistic topic models are popular unsupervised learning methods, including probabilistic latent semantic indexing (pLSI) and latent Dirichlet allocation (LDA). By now, their training is implemented on general purpose computers (GPCs), which are flexible in programming but energy-consuming. Towards low-energy implementations, this paper investigates their training on an emerging hardware technology called the neuromorphic multi-chip systems (NMSs). NMSs are very effective for a family of algorithms called spiking neural networks (SNNs). We present three SNNs to train topic models.The first SNN is a batch algorithm combining the conventional collapsed Gibbs sampling (CGS) algorithm and an inference SNN to train LDA. The other two SNNs are online algorithms targeting at both energy- and storage-limited environments. The two online algorithms are equivalent with training LDA by using maximum-a-posterior estimation and maximizing the semi-collapsed likelihood, respectively.They use novel, tailored ordinary differential equations for stochastic optimization. We simulate the new algorithms and show that they are comparable with the GPC algorithms, while being suitable for NMS implementation. We also propose an extension to train pLSI and a method to prune the network to obey the limited fan-in of some NMSs.

artificial intelligence, machine learning, natural language, (19 more...)

Thirty-Second AAAI Conference on Artificial Intelligence

Country: North America > United States (1.00)

Genre:

Research Report (0.68)
Overview (0.54)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
(3 more...)

Shen, Yujia (University of California, Los Angeles) | Choi, Arthur (University of California, Los Angeles) | Darwiche, Adnan (University of California, Los Angeles)

Conditional PSDDs: Modeling and Learning With Modular Knowledge

Probabilistic Sentential Decision Diagrams (PSDDs) have been proposed for learning tractable probability distributions from a combination of data and background knowledge (in the form of Boolean constraints). In this paper, we propose a variant on PSDDs, called conditional PSDDs, for representing a family of distributions that are conditioned on the same set of variables. Conditional PSDDs can also be learned from a combination of data and (modular) background knowledge. We use conditional PSDDs to define a more structured version of Bayesian networks, in which nodes can have an exponential number of states, hence expanding the scope of domains where Bayesian networks can be applied. Compared to classical PSDDs, the new representation exploits the independencies captured by a Bayesian network to decompose the learning process into localized learning tasks, which enables the learning of better models while using less computation. We illustrate the promise of conditional PSDDs and structured Bayesian networks empirically, and by providing a case study to the modeling of distributions over routes on a map.

artificial intelligence, constraint, machine learning, (17 more...)

Thirty-Second AAAI Conference on Artificial Intelligence

Country: North America > United States > California > Los Angeles County > Los Angeles (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Relational Marginal Problems: Theory and Estimation

Kuželka, Ondřej (Cardiff University) | Wang, Yuyi (ETH Zurich) | Davis, Jesse (KU Leuven) | Schockaert, Steven (Cardiff University)

In the propositional setting, the marginal problem is to find a (maximum-entropy) distribution that has some given marginals. We study this problem in a relational setting and make the following contributions. First, we compare two different notions of relational marginals. Second, we show a duality between the resulting relational marginal problems and the maximum likelihood estimation of the parameters of relational models, which generalizes a well-known duality from the propositional setting. Third, by exploiting the relational marginal formulation, we present a statistically sound method to learn the parameters of relational models that will be applied in settings where the number of constants differs between the training and test data. Furthermore, based on a relational generalization of marginal polytopes, we characterize cases where the standard estimators based on feature's number of true groundings needs to be adjusted and we quantitatively characterize the consequences of these adjustments. Fourth, we prove bounds on expected errors of the estimated parameters, which allows us to lower-bound, among other things, the effective sample size of relational training data.

artificial intelligence, bayesian inference, machine learning, (18 more...)

Thirty-Second AAAI Conference on Artificial Intelligence

Country:

North America > United States (0.28)
Europe > Switzerland (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)

Gupta, Tarun (International Institute of Information Technology, Hyderabad) | Kumar, Akshat (Singapore Management University) | Paruchuri, Praveen (International Institute of Information Technology, Hyderabad)

Planning and Learning for Decentralized MDPs With Event Driven Rewards

Decentralized (PO)MDPs provide a rigorous framework for sequential multiagent decision making under uncertainty. However, their high computational complexity limits the practical impact. To address scalability and real-world impact, we focus on settings where a large number of agents primarily interact through complex joint-rewards that depend on their entire histories of states and actions. Such history-based rewards encapsulate the notion of events or tasks such that the team reward is given only when the joint-task is completed. Algorithmically, we contribute---1) A nonlinear programming (NLP) formulation for such event-based planning model; 2) A probabilistic inference based approach that scales much better than NLP solvers for a large number of agents; 3) A policy gradient based multiagent reinforcement learning approach that scales well even for exponential state-spaces. Our inference and RL-based advances enable us to solve a large real-world multiagent coverage problem modeling schedule coordination of agents in a real urban subway network where other approaches fail to scale.

Xing, Linzi (University of Colorado, Boulder) | Paul, Michael J. (University of Colorado, Boulder)

Diagnosing and Improving Topic Models by Analyzing Posterior Variability

Bayesian inference methods for probabilistic topic models can quantify uncertainty in the parameters, which has primarily been used to increase the robustness of parameter estimates. In this work, we explore other rich information that can be obtained by analyzing the posterior distributions in topic models. Experimenting with latent Dirichlet allocation on two datasets, we propose ideas incorporating information about the posterior distributions at the topic level and at the word level. At the topic level, we propose a metric called topic stability that measures the variability of the topic parameters under the posterior. We show that this metric is correlated with human judgments of topic quality as well as with the consistency of topics appearing across multiple models. At the word level, we experiment with different methods for adjusting individual word probabilities within topics based on their uncertainty. Humans prefer words ranked by our adjusted estimates nearly twice as often when compared to the traditional approach. Finally, we describe how the ideas presented in this work could potentially applied to other predictive or exploratory models in future work.

artificial intelligence, natural language, topic model, (19 more...)

Thirty-Second AAAI Conference on Artificial Intelligence

Country: North America > United States > Colorado > Boulder County > Boulder (0.14)

Industry: Media > News (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)

Dual Transfer Learning for Neural Machine Translation with Marginal Distribution Regularization

Wang, Yijun (University of Science and Technology of China) | Xia, Yingce (University of Science and Technology of China) | Zhao, Li (Microsoft Research Asia) | Bian, Jiang (Microsoft Research Asia) | Qin, Tao (Microsoft Research Asia) | Liu, Guiquan (University of Science and Technology of China) | Liu, Tie-Yan (Microsoft Research Asia)

Neural machine translation (NMT) heavily relies on parallel bilingual data for training. Since large-scale, high-quality parallel corpora are usually costly to collect, it is appealing to exploit monolingual corpora to improve NMT. Inspired by the law of total probability, which connects the probability of a given target-side monolingual sentence to the conditional probability of translating from a source sentence to the target one, we propose to explicitly exploit this connection to learn from and regularize the training of NMT models using monolingual data. The key technical challenge of this approach is that there are exponentially many source sentences for a target monolingual sentence while computing the sum of the conditional probability given each possible source sentence. We address this challenge by leveraging the dual translation model (target-to-source translation) to sample several mostly likely source-side sentences and avoid enumerating all possible candidate source sentences. That is, we transfer the knowledge contained in the dual model to boost the training of the primal model (source-to-target translation), and we call such an approach dual transfer learning. Experiment results on English-French and German-English tasks demonstrate that dual transfer learning achieves significant improvement over several strong baselines and obtains new state-of-the-art results.

machine learning, natural language, translation, (19 more...)

Thirty-Second AAAI Conference on Artificial Intelligence

Country: Asia > China (0.14)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Geometric Relationship between Word and Context Representations

Feng, Jiangtao (Fudan University) | Zheng, Xiaoqing (Fudan University)

Pre-trained distributed word representations have been proven to be useful in various natural language processing (NLP) tasks. However, the geometric basis of word representations and their relations to the representations of word's contexts has not been carefully studied yet. In this study, we first investigate such geometric relationship under a general framework, which is abstracted from some typical word representation learning approaches, and find out that only the directions of word representations are well associated to their context vector representations while the magnitudes are not. In order to make better use of the information contained in the magnitudes of word representations, we propose a hierarchical Gaussian model combined with maximum a posteriori estimation to learn word representations, and extend it to represent polysemous words. Our word representations have been evaluated on multiple NLP tasks, and the experimental results show that the proposed model achieved promising results, comparing to several popular word representations.

artificial intelligence, machine learning, natural language, (19 more...)

Thirty-Second AAAI Conference on Artificial Intelligence

Country:

Asia > China (0.28)
North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)