AITopics

Restricted Boltzmann machines (RBMs) are a powerful generative modeling technique, based on a complex graphical model of hidden (latent) variables. Conditional RBMs (CRBMs) are an extension of RBMs tailored to modeling temporal data. A drawback of CRBMs is that they consider only linear temporal dependencies, which limits their capability to capture complex temporal structure. They also require many variables to model long temporal dependencies, which can make them prone to overfitting. To resolve these issues, in this paper we propose the echo-state CRBM (ES-CRBM): our model uses an echo-state network reservoir in the context of CRBMs to efficiently capture long and complex temporal dynamics, with far fewer trainable parameters than conventional CRBMs. In addition, we introduce an (implicit) mixture of ES-CRBM experts (im-ES-CRBM) to further enhance the capabilities of our ES-CRBM model. The introduced im-ES-CRBM allows for better modeling of temporal observations that may comprise a number of latent or observable subpatterns alternating in a dynamic fashion. It also allows for performing sequence segmentation using our framework. We apply our methods to sequential data modeling and classification experiments using public datasets. As we show, our approach outperforms both existing RBM-based approaches and related state-of-the-art methods, such as conditional random fields.
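The reservoir idea mentioned in this abstract can be illustrated with a minimal sketch of a generic echo-state update, in which fixed random weights turn an input sequence into a state that summarizes its history. This is not the ES-CRBM itself: the function names, the reservoir size, and the weight scaling below are assumptions chosen for illustration.

```python
import math
import random


def make_reservoir(n_inputs, n_units, scale=0.5, seed=0):
    """Draw fixed random input and recurrent weights (these are never trained)."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-scale, scale) for _ in range(n_inputs)]
            for _ in range(n_units)]
    # Dividing by n_units keeps every row's absolute sum at or below `scale` (< 1),
    # a simple sufficient condition for old inputs to fade out with tanh units.
    w_res = [[rng.uniform(-scale, scale) / n_units for _ in range(n_units)]
             for _ in range(n_units)]
    return w_in, w_res


def run_reservoir(inputs, w_in, w_res):
    """Return the reservoir state after each input vector."""
    n_units = len(w_res)
    state = [0.0] * n_units
    states = []
    for u in inputs:
        # New state depends on the current input and on the previous state.
        state = [
            math.tanh(
                sum(w * x for w, x in zip(w_in[i], u))
                + sum(w * s for w, s in zip(w_res[i], state))
            )
            for i in range(n_units)
        ]
        states.append(state)
    return states


w_in, w_res = make_reservoir(n_inputs=1, n_units=8)
signal = [[math.sin(0.3 * t)] for t in range(20)]
states = run_reservoir(signal, w_in, w_res)
```

Only a readout on top of `states` would be trained in a plain echo-state network, which is where the saving in trainable parameters comes from.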

artificial intelligence, experiment, machine learning, (17 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country:

North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Middle East > Cyprus > Limassol > Limassol (0.04)

Genre: Research Report (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.85)

Chatzis, Sotirios (Cyprus University of Technology)

Dynamic Bayesian Probabilistic Matrix Factorization

Collaborative filtering algorithms generally rely on the assumption that user preference patterns remain stationary. However, real-world relational data are seldom stationary. User preference patterns may change over time, giving rise to the requirement of designing collaborative filtering systems capable of detecting and adapting to preference pattern shifts. Motivated by this observation, in this paper we propose a dynamic Bayesian probabilistic matrix factorization model, designed for modeling time-varying distributions. Formulation of our model is based on imposition of a dynamic hierarchical Dirichlet process (dHDP) prior over the space of probabilistic matrix factorization models to capture the time-evolving statistical properties of modeled sequential relational datasets. We develop a simple Markov Chain Monte Carlo sampler to perform inference. We present experimental results to demonstrate the superiority of our temporal model.

artificial intelligence, dataset, machine learning, (14 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country:

Asia > Middle East > Jordan (0.05)
Africa > Senegal > Kolda Region > Kolda (0.04)
Europe > Middle East > Cyprus > Limassol > Limassol (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Bi, Wei (Hong Kong University of Science and Technology) | Kwok, James T (Hong Kong University of Science and Technology)

Multilabel Classification with Label Correlations and Missing Labels

Many real-world applications involve multilabel classification, in which the labels can have strong inter-dependencies and some of them may even be missing. Existing multilabel algorithms are unable to handle both issues simultaneously. In this paper, we propose a probabilistic model that can automatically learn and exploit multilabel correlations. By integrating out the missing information, it also provides a disciplined approach to the handling of missing labels. The inference procedure is simple, and the optimization subproblems are convex. Experiments on a number of real-world data sets with both complete and missing labels demonstrate that the proposed algorithm can consistently outperform state-of-the-art multilabel classification algorithms.

artificial intelligence, classification, machine learning, (15 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country:

Asia > China > Hong Kong (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Xia, Rui (Nanjing University of Science and Technology) | Yu, Jianfei (Nanjing University of Science and Technology) | Xu, Feng (Nanjing University of Science and Technology) | Wang, Shumei (Nanjing University of Science and Technology)

Instance-Based Domain Adaptation in NLP via In-Target-Domain Logistic Approximation

In the field of NLP, most existing domain adaptation studies are feature-based, while research on instance-based adaptation is scarce. In this work, we propose a new instance-based adaptation model, called in-target-domain logistic approximation (ILA). In ILA, we adapt the source-domain data to the target domain by a logistic approximation. The normalized in-target-domain probability is assigned as an instance weight to each source-domain training instance. Finally, an instance-weighted classification model is trained for the cross-domain classification problem. Compared to previous techniques, ILA conducts instance adaptation in a dimensionality-reduced linear feature space to ensure efficiency in high-dimensional NLP tasks. The instance weights in ILA are learnt by leveraging the criteria of both maximum likelihood and minimum statistical distance. The empirical results on two NLP tasks, text categorization and sentiment classification, show that our ILA model significantly outperforms the state-of-the-art instance adaptation methods in cross-domain classification accuracy, parameter stability and computational efficiency.
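The general idea of weighting source instances by a normalized in-target-domain probability can be sketched as follows. This is a generic domain-classifier illustration, not the ILA procedure: the abstract does not give ILA's objective, so the 1-D logistic regression, the toy data, and all function names here are assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_domain_classifier(source, target, lr=0.1, steps=2000):
    """Fit 1-D logistic regression with label 1 = target domain, 0 = source."""
    data = [(x, 0.0) for x in source] + [(x, 1.0) for x in target]
    w, b = 0.0, 0.0
    for _ in range(steps):
        # Gradient ascent on the average log-likelihood.
        grad_w = sum((y - sigmoid(w * x + b)) * x for x, y in data) / len(data)
        grad_b = sum((y - sigmoid(w * x + b)) for x, y in data) / len(data)
        w += lr * grad_w
        b += lr * grad_b
    return w, b


def instance_weights(source, w, b):
    """Normalize in-target-domain probabilities so the weights average to 1."""
    probs = [sigmoid(w * x + b) for x in source]
    mean = sum(probs) / len(probs)
    return [p / mean for p in probs]


# Hypothetical 1-D features: the target domain sits to the right of the source.
source = [-2.0, -1.0, 0.0, 1.0, 2.0]
target = [1.0, 1.5, 2.0, 2.5]

w, b = fit_domain_classifier(source, target)
weights = instance_weights(source, w, b)
```

Source points that look more like target data receive larger weights, and those weights would then be passed to an instance-weighted classifier trained on the source labels.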

artificial intelligence, machine learning, natural language, (16 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country: Asia > China > Jiangsu Province > Nanjing (0.04)

Genre:

Research Report > Experimental Study (0.69)
Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.49)

Learning Scripts as Hidden Markov Models

Orr, John Walker (Oregon State University) | Tadepalli, Prasad (Oregon State University) | Doppa, Janardhan Rao (Oregon State University) | Fern, Xiaoli (Oregon State University) | Dietterich, Thomas G. (Oregon State University)

Scripts have been proposed to model the stereotypical event sequences found in narratives. They can be applied to make a variety of inferences including filling gaps in the narratives and resolving ambiguous references. This paper proposes the first formal framework for scripts based on Hidden Markov Models (HMMs). Our framework supports robust inference and learning algorithms, which are lacking in previous clustering models. We develop an algorithm for structure and parameter learning based on Expectation Maximization and evaluate it on a number of natural datasets. The results show that our algorithm is superior to several informed baselines for predicting missing events in partial observation sequences.

algorithm, artificial intelligence, machine learning, (17 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country: North America > United States > Oregon > Benton County > Corvallis (0.04)

Genre: Research Report (0.34)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Multiagent Metareasoning through Organizational Design

Sleight, Jason (University of Michigan) | Durfee, Edmund H. (University of Michigan)

We formulate an approach to multiagent metareasoning that uses organizational design to focus each agent's reasoning on the aspects of its local problem that let it make the most worthwhile contributions to joint behavior. By employing the decentralized Markov decision process framework, we characterize an organizational design problem that explicitly considers the quantitative impact that a design has on both the quality of the agents' behaviors and their reasoning costs. We describe an automated organizational design process that can approximately solve our organizational design problem via incremental search, and present techniques that efficiently estimate the incremental impact of a candidate organizational influence. Our empirical evaluation confirms that our process generates organizational designs that impart a desired metareasoning regime upon the agents.

agent, artificial intelligence, machine learning, (18 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
North America > United States > Massachusetts (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

Wang, Dong (Nanjing University of Aeronautics and Astronautics) | Tan, Xiaoyang (Nanjing University of Aeronautics and Astronautics)

Robust Distance Metric Learning in the Presence of Label Noise

Many distance learning algorithms have been developed in recent years. However, few of them consider the problem when the class labels of training data are noisy, and this may lead to serious performance deterioration. In this paper, we present a robust distance learning method in the presence of label noise, by extending a previous non-parametric discriminative distance learning algorithm, i.e., Neighbourhood Components Analysis (NCA). Particularly, we analyze the effect of label noise on the derivative of likelihood with respect to the transformation matrix, and propose to model the conditional probability of the true label of each point so as to reduce that effect. The model is then optimized within the EM framework, with additional regularization used to avoid overfitting. Our experiments on several UCI datasets and a real dataset with unknown noise patterns show that the proposed RNCA is more tolerant to class label noise compared to the original NCA method.
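To make the role of label noise concrete, the following sketch computes the standard NCA soft-neighbour quantity: for each point, the probability that it selects a same-class neighbour under a linear transform A. It shows only the original NCA objective, not the robust RNCA model proposed in the paper; the toy points, the identity transform, and the function name are assumptions.

```python
import math


def nca_probabilities(points, labels, A):
    """Return p_i, the chance that point i picks a same-class neighbour under A."""
    projected = [[sum(a * x for a, x in zip(row, p)) for row in A]
                 for p in points]
    result = []
    for i, zi in enumerate(projected):
        scores = {}
        for j, zj in enumerate(projected):
            if j != i:
                # Softmax over negative squared distances in the projected space.
                d2 = sum((u - v) ** 2 for u, v in zip(zi, zj))
                scores[j] = math.exp(-d2)
        total = sum(scores.values())
        same = sum(s for j, s in scores.items() if labels[j] == labels[i])
        result.append(same / total)
    return result


points = [[0.0, 0.0], [0.0, 1.0], [3.0, 0.0], [3.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

clean = nca_probabilities(points, [0, 0, 1, 1], identity)
# Flip the label of the second point to mimic label noise.
noisy = nca_probabilities(points, [0, 1, 1, 1], identity)
```

NCA maximizes the sum of these probabilities over A. With one flipped label, the first point has no same-class neighbour left and the flipped point is far from its nominal class, so both terms collapse, which is the kind of distortion a noise-aware model has to counteract.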

artificial intelligence, bayesian inference, machine learning, (14 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country: Asia > China (0.14)

Industry: Education > Educational Setting > Online (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Agent Behavior Prediction and Its Generalization Analysis

Tian, Fei (University of Science and Technology of China) | Li, Haifang (Chinese Academy of Sciences) | Chen, Wei (Microsoft Research) | Qin, Tao (Microsoft Research) | Chen, Enhong (University of Science and Technology of China) | Liu, Tie-Yan (Microsoft Research)

Machine learning algorithms have been applied to predict agent behaviors in real-world dynamic systems, such as advertiser behaviors in sponsored search and worker behaviors in crowdsourcing. Behavior data in these systems are generated by live agents: once systems change due to adoption of prediction models learnt from behavior data, agents will observe and respond to these changes by changing their own behaviors accordingly. Therefore, the evolving behavior data will not be identically and independently distributed, posing great challenges to theoretical analysis. To tackle this challenge, in this paper, we propose to use Markov Chain in Random Environments (MCRE) to describe the behavior data, and perform generalization analysis of machine learning algorithms on its basis. We propose a novel technique that transforms the original time-variant MCRE into a higher-dimensional time-homogeneous Markov chain, which is easier to deal with. We prove the convergence of the new Markov chain when time approaches infinity. Then we obtain a generalization bound for the machine learning algorithms on the behavior data generated by the new Markov chain. To the best of our knowledge, this is the first work that performs the generalization analysis on data generated by complex processes in real-world dynamic systems.

artificial intelligence, machine learning, markov chain, (15 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country: Asia (0.14)

Genre: Research Report (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Ramesh, Arti (University of Maryland, College Park) | Goldwasser, Dan (University of Maryland, College Park) | Huang, Bert (University of Maryland, College Park) | Daumé III, Hal (University of Maryland, College Park) | Getoor, Lise (University of California, Santa Cruz)

Learning Latent Engagement Patterns of Students in Online Courses

Maintaining and cultivating student engagement is critical for learning. Understanding factors affecting student engagement will help in designing better courses and improving student retention. The large number of participants in massive open online courses (MOOCs) and data collected from their interaction with the MOOC open up avenues for studying student engagement at scale. In this work, we develop a framework for modeling and understanding student engagement in online courses based on student behavioral cues. Our first contribution is the abstraction of student engagement types using latent representations and using that in a probabilistic model to connect student behavior with course completion. We demonstrate that the latent formulation for engagement helps in predicting student survival across three MOOCs. Next, in order to initiate better instructor interventions, we need to be able to predict student survival early in the course. We demonstrate that we can predict student survival early in the course reliably using the latent model. Finally, we perform a closer quantitative analysis of user interaction with the MOOC and identify student activities that are good indicators for survival at different points in the course.

artificial intelligence, machine learning, student, (17 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country: North America > United States > California (0.28)

Genre:

Instructional Material > Online (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (1.00)
Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Discovering Better AAAI Keywords via Clustering with Community-Sourced Constraints

Moran, Kelly H. (Google Inc.) | Wallace, Byron C. (Brown University) | Brodley, Carla E. (Tufts University)

Selecting good conference keywords is important because they often determine the composition of review committees and hence which papers are reviewed by whom. But presently conference keywords are generated in an ad-hoc manner by a small set of conference organizers. This approach is plainly not ideal. There is no guarantee, for example, that the generated keyword set aligns with what the community is actually working on and submitting to the conference in a given year. This is especially true in fast-moving fields such as AI. The problem is exacerbated by the tendency of organizers to draw heavily on preceding years' keyword lists when generating a new set. Rather than a select few ordaining a keyword set that represents AI at large, it would be preferable to generate these keywords more directly from the data, with input from research community members. To this end, we solicited feedback from seven AAAI PC members regarding a previously existing keyword set and used these 'community-sourced constraints' to inform a clustering over the abstracts of all submissions to AAAI 2013. We show that the keywords discovered via this data-driven, human-in-the-loop method are at least as preferred (by AAAI PC members) as 2013's manually generated set, and that they include categories previously overlooked by organizers. Many of the discovered terms were used for this year's conference.

data mining, machine learning, natural language, (22 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (0.95)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
(4 more...)