AITopics

0811.4413

Country: North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Dahl, George E., Adams, Ryan P., Larochelle, Hugo

Training Restricted Boltzmann Machines on Word Observations

arXiv.org Machine LearningJul-5-2012

The restricted Boltzmann machine (RBM) is a flexible tool for modeling complex data, however there have been significant computational difficulties in using RBMs to model high-dimensional multinomial observations. In natural language processing applications, words are naturally modeled by K-ary discrete distributions, where K is determined by the vocabulary size and can easily be in the hundreds of thousands. The conventional approach to training RBMs on word observations is limited because it requires sampling the states of K-way softmax visible units during block Gibbs updates, an operation that takes time linear in K. In this work, we address this issue by employing a more general class of Markov chain Monte Carlo operators on the visible units, yielding updates with computational complexity independent of K. We demonstrate the success of our approach by training RBMs on hundreds of millions of word n-grams using larger vocabularies than previously feasible and using the learned features to improve performance on chunking and sentiment classification tasks, achieving state-of-the-art results on the latter.

deep learning, neural network, representation, (19 more...)

1202.5695

Country:

Asia (0.93)
North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > Scotland (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.93)

arXiv.org Artificial IntelligenceJul-5-2012

Higher-Order Partial Least Squares (HOPLS): A Generalized Multi-Linear Regression Method

Zhao, Qibin, Caiafa, Cesar F., Mandic, Danilo P., Chao, Zenas C., Nagasaka, Yasuo, Fujii, Naotaka, Zhang, Liqing, Cichocki, Andrzej

A new generalized multilinear regression model, termed the Higher-Order Partial Least Squares (HOPLS), is introduced with the aim to predict a tensor (multiway array) $\tensor{Y}$ from a tensor $\tensor{X}$ through projecting the data onto the latent space and performing regression on the corresponding latent variables. HOPLS differs substantially from other regression models in that it explains the data by a sum of orthogonal Tucker tensors, while the number of orthogonal loadings serves as a parameter to control model complexity and prevent overfitting. The low dimensional latent space is optimized sequentially via a deflation operation, yielding the best joint subspace approximation for both $\tensor{X}$ and $\tensor{Y}$. Instead of decomposing $\tensor{X}$ and $\tensor{Y}$ individually, higher order singular value decomposition on a newly defined generalized cross-covariance tensor is employed to optimize the orthogonal loadings. A systematic comparison on both synthetic data and real-world decoding of 3D movement trajectories from electrocorticogram (ECoG) signals demonstrate the advantages of HOPLS over the existing methods in terms of better predictive ability, suitability to handle small sample sizes, and robustness to noise.

health & medicine, hopl, neurology, (21 more...)

doi: 10.1109/TPAMI.2012.254.

1207.123

Country:

Asia (0.68)
North America > United States (0.67)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

arXiv.org Artificial IntelligenceJul-5-2012

Super-Mixed Multiple Attribute Group Decision Making Method Based on Hybrid Fuzzy Grey Relation Approach Degree

Kim, Gol, Ye, Fei

The feature of our method different from other fuzzy grey relation method for supermixed multiple attribute group decision-making is that all of the subjective and objective weights are obtained by interval grey number and that the group decisionmaking is performed based on the relative approach degree of grey TOPSIS, the relative approach degree of grey incidence and the relative membership degree of grey incidence using 4-dimensional Euclidean distance. The weighted Borda method is used to obtain final rank by using the results of four methods. An example shows the applicability of the proposed approach.

artificial intelligence, decision matrix, optimization problem, (16 more...)

1207.1501

Country: Asia > China (0.29)

Industry: Government (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Chen, Hubie, Gimenez, Omer

On-the-fly Macros

arXiv.org Artificial IntelligenceJul-5-2012

We present a domain-independent algorithm that computes macros in a novel way. Our algorithm computes macros "on-the-fly" for a given set of states and does not require previously learned or inferred information, nor prior domain knowledge. The algorithm is used to define new domain-independent tractable classes of classical planning that are proved to include \emph{Blocksworld-arm} and \emph{Towers of Hanoi}.

algorithm, artificial intelligence, neural network, (15 more...)

0810.1186

Country:

Asia > Vietnam > Hanoi > Hanoi (0.26)
Europe > Spain (0.14)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.46)

Ho, Shen-Shyang, Wechsler, Harry

On the Detection of Concept Changes in Time-Varying Data Stream by Testing Exchangeability

A martingale framework for concept change detection based on testing data exchangeability was recently proposed (Ho, 2005). In this paper, we describe the proposed change-detection test based on the Doob's Maximal Inequality and show that it is an approximation of the sequential probability ratio test (SPRT). The relationship between the threshold value used in the proposed test and its size and power is deduced from the approximation. The mean delay time before a change is detected is estimated using the average sample number of a SPRT. The performance of the test using various threshold values is examined on five different data stream scenarios simulated using two synthetic data sets. Finally, experimental results show that the test is effective in detecting changes in time-varying data streams simulated using three benchmark data sets.

artificial intelligence, data stream, machine learning, (15 more...)

1207.1379

Country: North America > United States > Wisconsin (0.14)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Savia, Eerika, Puolamaki, Kai, Sinkkonen, Janne, Kaski, Samuel

Two-Way Latent Grouping Model for User Preference Prediction

We introduce a novel latent grouping model for predicting the relevance of a new document to a user. The model assumes a latent group structure for both users and documents. We compared the model against a state-of-the-art method, the User Rating Profile model, where only users have a latent group structure. We estimate both models by Gibbs sampling. The new method predicts relevance more accurately for new documents that have few known ratings. The reason is that generalization over documents then becomes necessary and hence the twoway grouping is profitable.

artificial intelligence, bayesian inference, user group, (19 more...)

1207.1414

Country:

Europe > Finland (0.15)
North America > United States (0.14)

Genre: Research Report > Experimental Study (0.69)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.89)

Guo, Yuhong, Wilkinson, Dana, Schuurmans, Dale

Maximum Margin Bayesian Networks

We consider the problem of learning Bayesian network classifiers that maximize the marginover a set of classification variables. We find that this problem is harder for Bayesian networks than for undirected graphical models like maximum margin Markov networks. The main difficulty is that the parameters in a Bayesian network must satisfy additional normalization constraints that an undirected graphical model need not respect. These additional constraints complicate the optimization task. Nevertheless, we derive an effective training algorithm that solves the maximum margin training problem for a range of Bayesian network topologies, and converges to an approximate solution for arbitrary network topologies. Experimental results show that the method can demonstrate improved generalization performance over Markov networks when the directed graphical structure encodes relevant knowledge. In practice, the training technique allows one to combine prior knowledge expressed as a directed (causal) model with state of the art discriminative learning methods.

bayesian inference, constraint, health & medicine, (19 more...)

1207.1382

Country: North America > Canada > Alberta (0.14)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Morvant, Emilie, Habrard, Amaury, Ayache, Stéphane

PAC-Bayesian Majority Vote for Late Classifier Fusion

A lot of attention has been devoted to multimedia indexing over the past few years. In the literature, we often consider two kinds of fusion schemes: The early fusion and the late fusion. In this paper we focus on late classifier fusion, where one combines the scores of each modality at the decision level. To tackle this problem, we investigate a recent and elegant well-founded quadratic program named MinCq coming from the Machine Learning PAC-Bayes theory. MinCq looks for the weighted combination, over a set of real-valued functions seen as voters, leading to the lowest misclassification rate, while making use of the voters' diversity. We provide evidence that this method is naturally adapted to late fusion procedure. We propose an extension of MinCq by adding an order- preserving pairwise loss for ranking, helping to improve Mean Averaged Precision measure. We confirm the good behavior of the MinCq-based fusion approaches with experiments on a real image benchmark.

artificial intelligence, information fusion, mincq, (15 more...)

1207.1019

Country: Europe > France (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Abbeel, Pieter, Koller, Daphne, Ng, Andrew Y.

Learning Factor Graphs in Polynomial Time & Sample Complexity

We study computational and sample complexity of parameter and structure learning in graphical models. Our main result shows that the class of factor graphs with bounded factor size and bounded connectivity can be learned in polynomial time and polynomial number of samples, assuming that the data is generated by a network in this class. This result covers both parameter estimation for a known network structure and structure learning. It implies as a corollary that we can learn factor graphs for both Bayesian networks and Markov networks of bounded degree, in polynomial time and sample complexity. Unlike maximum likelihood estimation, our method does not require inference in the underlying network, and so applies to networks where inference is intractable. We also show that the error of our learned model degrades gracefully when the generating distribution is not a member of the target class of networks.

artificial intelligence, machine learning, markov blanket, (16 more...)

1207.1366

Country: North America > United States > California > Santa Clara County (0.14)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)