AITopics

doi: 10.1007/978-3-319-43425-4_1

1606.01111

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.41)

Goessling, Marc, Amit, Yali

Compact Compositional Models

arXiv.org Machine Learning, Oct-29-2016

Learning compact and interpretable representations is a very natural task, which has not been solved satisfactorily even for simple binary datasets. In this paper, we review various ways of composing experts for binary data and argue that competitive forms of interaction are best suited to learn low-dimensional representations. We propose a new composition rule that discourages experts from focusing on similar structures and that penalizes opposing votes strongly so that abstaining from voting becomes more attractive. We also introduce a novel sequential initialization procedure, which is based on a process of oversimplification and correction. Experiments show that with our approach very intuitive models can be learned.

artificial intelligence, composition rule, machine learning, (16 more...)

1412.3708

Country: North America > United States (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

#artificialintelligence, Oct-28-2016, 21:25:34 GMT

The 10 Algorithms Machine Learning Engineers Need to Know

There is no doubt that machine learning, a sub-field of artificial intelligence, has gained popularity over the past couple of years. With Big Data the hottest trend in the tech industry at the moment, machine learning is a powerful way to make predictions or calculated suggestions based on large amounts of data. Some of the most common examples of machine learning are Netflix's algorithms, which suggest movies based on those you have watched in the past, and Amazon's algorithms, which recommend books based on books you have bought before. So if you want to learn more about machine learning, how do you start? For me, my first introduction was an Artificial Intelligence class I took while studying abroad in Copenhagen.

algorithm, artificial intelligence, machine learning, (14 more...)


Country:

Europe > Denmark > Capital Region > Copenhagen (0.25)
North America > United States > California > San Francisco County > San Francisco (0.05)

Industry:

Media (0.55)
Information Technology > Services (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Peharz, Robert, Gens, Robert, Pernkopf, Franz, Domingos, Pedro

On the Latent Variable Interpretation in Sum-Product Networks

arXiv.org Artificial Intelligence, Oct-28-2016

One of the central themes in Sum-Product Networks (SPNs) is the interpretation of sum nodes as marginalized latent variables (LVs). This interpretation yields an increased syntactic or semantic structure, allows the application of the EM algorithm, and makes it possible to perform MPE inference efficiently. In the literature, the LV interpretation was justified by explicitly introducing the indicator variables corresponding to the LVs' states. However, as pointed out in this paper, this approach is in conflict with the completeness condition in SPNs and does not fully specify the probabilistic model. We propose a remedy for this problem by modifying the original approach for introducing the LVs, which we call SPN augmentation. We discuss conditional independencies in augmented SPNs, formally establish the probabilistic interpretation of the sum-weights and give an interpretation of augmented SPNs as Bayesian networks. Based on these results, we find a sound derivation of the EM algorithm for SPNs. Furthermore, the Viterbi-style algorithm for MPE proposed in the literature was never proven to be correct. We show that this is indeed a correct algorithm when applied to selective SPNs, and in particular when applied to augmented SPNs. Our theoretical results are confirmed in experiments on synthetic data and 103 real-world datasets.
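As a rough illustration of the latent-variable reading of a sum node, the sketch below builds a toy SPN with a single sum node over two product nodes and checks that its value equals the marginal over a latent variable Z whose prior is given by the sum weights. The network, the weights, and all function names are made up for this example; it does not reproduce the SPN augmentation proposed in the paper, and a single sum node avoids the completeness issue the abstract refers to.

```python
# Toy SPN over two binary variables X1, X2: one sum node over two product nodes.
# All numbers and names are hypothetical and chosen only for illustration.

def bernoulli(p, x):
    """Leaf distribution: P(X = x) for a binary X with P(X = 1) = p."""
    return p if x == 1 else 1.0 - p

# Sum weights, read as the prior P(Z = k) of a latent variable Z.
weights = [0.3, 0.7]
# Leaf parameters per component k: (P(X1 = 1 | Z = k), P(X2 = 1 | Z = k)).
leaves = [(0.9, 0.2), (0.1, 0.6)]

def product_node(k, x1, x2):
    """Product node k: product of its two leaf distributions."""
    p1, p2 = leaves[k]
    return bernoulli(p1, x1) * bernoulli(p2, x2)

def spn(x1, x2):
    """Root sum node: weighted sum of its product-node children."""
    return sum(w * product_node(k, x1, x2) for k, w in enumerate(weights))

def joint(z, x1, x2):
    """Joint P(Z = z, X1 = x1, X2 = x2) under the latent-variable reading."""
    return weights[z] * product_node(z, x1, x2)

# All four joint states of (X1, X2); the SPN values over them sum to one.
states = [(a, b) for a in (0, 1) for b in (0, 1)]
total = sum(spn(a, b) for a, b in states)
```

Summing `joint(z, x1, x2)` over `z` gives back `spn(x1, x2)` for every state, which is what it means to treat the sum node as a marginalized latent variable in this simple case.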

artificial intelligence, machine learning, spn, (14 more...)


1601.0618

Country:

North America > United States > Washington > King County (0.28)
North America > United States > California (0.28)

Genre:

Research Report (1.00)
Personal > Honors (0.46)

Industry:

Health & Medicine (0.67)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)

Pan, Indranil, Bester, Dirk

Fuzzy Bayesian Learning

Abstract--In this paper we propose a novel approach for learning from data using rule based fuzzy inference systems where the model parameters are estimated using Bayesian inference and Markov Chain Monte Carlo (MCMC) techniques. We show the applicability of the method for regression and classification tasks using synthetic data-sets and also a real world example in the financial services industry. Then we demonstrate how the method can be extended for knowledge extraction to select the individual rules in a Bayesian way which best explains the given data. Finally we discuss the advantages and pitfalls of using this method over state-of-the-art techniques and highlight the specific class of problems where this would be useful. Probability theory and fuzzy logic have been shown to be complementary [1] and various works have looked at the symbiotic integration of these two paradigms [2], [3] including the recently introduced concept of Z-numbers [4]. Historically fuzzy logic has been applied to problems involving imprecision in linguistic variables, while probability theory has been used for quantifying uncertainty in a wide range of disciplines. Various generalisations and extensions of fuzzy sets have been proposed to incorporate uncertainty and vagueness which arise from multiple sources. For example, type-2 fuzzy sets [5], [6] and type-n fuzzy sets [5] can include uncertainty while defining the membership functions themselves. Intuitionistic fuzzy sets [7] additionally introduce the degree of non-membership of an element to take into account that there might be some hesitation degree and the degree of membership and non-membership of an element might not always add to one. Non-stationary fuzzy sets [8] can model variation of opinion over time by defining a collection of type 1 fuzzy sets and an explicit relationship between them. Fuzzy multi-sets [9] generalise crisp sets where multiple occurrences of an element are permitted.
Hesitant fuzzy sets [10] have been proposed from the motivation that the problem of assigning a degree of membership to an element is not because of a margin of error (like Atanassov's intuitionistic fuzzy sets) or a possibility distribution on possibility values (e.g. Formally these can be viewed as fuzzy multi-sets but with a different interpretation.

artificial intelligence, machine learning, membership function, (16 more...)

1610.09156

Country: Europe > United Kingdom > England (0.46)

Genre: Research Report > Promising Solution (0.54)

Industry:

Banking & Finance > Financial Services (0.68)
Banking & Finance > Insurance (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Ng, Yin Cheng, Chilinski, Pawel, Silva, Ricardo

Scaling Factorial Hidden Markov Models: Stochastic Variational Inference without Messages

Factorial Hidden Markov Models (FHMMs) are powerful models for sequential data but they do not scale well with long sequences. We propose a scalable inference and learning algorithm for FHMMs that draws on ideas from the stochastic variational inference, neural network and copula literatures. Unlike existing approaches, the proposed algorithm requires no message passing procedure among latent variables and can be distributed to a network of computers to speed up learning. Our experiments corroborate that the proposed algorithm does not introduce further approximation bias compared to the proven structured mean-field algorithm, and achieves better performance with long sequences and large FHMMs.

algorithm, artificial intelligence, machine learning, (17 more...)

1608.03817

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Niu, Gang, Plessis, Marthinus Christoffel du, Sakai, Tomoya, Ma, Yao, Sugiyama, Masashi

Theoretical Comparisons of Positive-Unlabeled Learning against Positive-Negative Learning

In PU learning, a binary classifier is trained from positive (P) and unlabeled (U) data without negative (N) data. Although N data is missing, PU learning sometimes outperforms PN learning (i.e., ordinary supervised learning). Hitherto, neither theoretical nor experimental analysis has been given to explain this phenomenon. In this paper, we theoretically compare PU (and NU) learning against PN learning based on the upper bounds on estimation errors. We find simple conditions under which PU and NU learning are likely to outperform PN learning, and we prove that, in terms of the upper bounds, either PU or NU learning (depending on the class-prior probability and the sizes of P and N data) given infinite U data will improve on PN learning. Our theoretical findings agree well with the experimental results on artificial and benchmark data even when the experimental setup does not match the theoretical assumptions exactly.

artificial intelligence, machine learning, misclassification rate, (15 more...)

1603.0313

Country: Asia > Japan (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Krishnamurthy, Akshay, Agarwal, Alekh, Langford, John

PAC Reinforcement Learning with Rich Observations

We propose and study a new model for reinforcement learning with rich observations, generalizing contextual bandits to sequential decision making. These models require an agent to take actions based on observations (features) with the goal of achieving long-term performance competitive with a large set of policies. To avoid barriers to sample-efficient learning associated with large observation spaces and general POMDPs, we focus on problems that can be summarized by a small number of hidden states and have long-term rewards that are predictable by a reactive function class. In this setting, we design and analyze a new reinforcement learning algorithm, Least Squares Value Elimination by Exploration. We prove that the algorithm learns near optimal behavior after a number of episodes that is polynomial in all relevant parameters, logarithmic in the number of policies, and independent of the size of the observation space. Our result provides theoretical justification for reinforcement learning with function approximation.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

1602.02722

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.89)

#artificialintelligence, Oct-27-2016, 05:20:44 GMT

About Feature Scaling and Normalization

The result of standardization (or Z-score normalization) is that the features will be rescaled so that they have the properties of a standard normal distribution with a mean of 0 and a standard deviation of 1. Standardizing the features so that they are centered around 0 with a standard deviation of 1 is not only important if we are comparing measurements that have different units, but it is also a general requirement for many machine learning algorithms. Intuitively, we can think of gradient descent as a prominent example (an optimization algorithm often used in logistic regression, SVMs, perceptrons, neural networks etc.); with features being on different scales, certain weights may update faster than others since the feature values play a role in the weight updates. Other intuitive examples include K-Nearest Neighbor algorithms and clustering algorithms that use, for example, Euclidean distance measures. In fact, tree-based classifiers are probably the only classifiers where feature scaling doesn't make a difference; they are the only family of algorithms that I could think of as being scale-invariant. Let's take the general CART decision tree algorithm. Without going into much depth regarding information gain and impurity measures, we can think of the decision as "is feature x_i >= some_val?"
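To make the rescaling concrete, here is a minimal sketch of Z-score standardization using only the Python standard library. The sample values, the threshold, and the function name `standardize` are made up for illustration, and the population standard deviation is an assumption (libraries differ on population versus sample variance). The last lines also check the tree argument: a threshold test on a feature splits the samples the same way before and after scaling, provided the threshold is rescaled too.

```python
from statistics import fmean, pstdev

def standardize(values):
    """Return z-scores: (x - mean) / population standard deviation."""
    mu = fmean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize a constant feature")
    return [(v - mu) / sigma for v in values]

# Hypothetical feature measured in centimetres.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
z = standardize(heights_cm)

# After standardization the feature has mean 0 and standard deviation 1.
z_mean = fmean(z)
z_std = pstdev(z)

# A threshold split partitions the samples identically on raw and scaled
# values once the threshold itself is mapped to the scaled axis.
threshold = 165.0
mu, sigma = fmean(heights_cm), pstdev(heights_cm)
split_raw = [v >= threshold for v in heights_cm]
split_scaled = [s >= (threshold - mu) / sigma for s in z]
```

Distance-based methods do not share this property: rescaling one feature changes Euclidean distances, which is why the passage singles out tree-based methods.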

artificial intelligence, machine learning, standardization, (14 more...)


Country:

North America > United States > California > Orange County > Irvine (0.05)
Europe > Italy > Liguria > Genoa (0.05)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)

#artificialintelligence, Oct-27-2016, 05:20:37 GMT

Deep Learning: Definition, Resources, Comparison with Machine Learning

Deep learning is sometimes referred to as the intersection between machine learning and artificial intelligence. It is about designing algorithms that can make robots intelligent, such as face recognition techniques used in drones to detect and target terrorists, or pattern recognition / computer vision algorithms to automatically pilot a plane, a train, a boat or a car. Many deep learning algorithms (clustering, pattern recognition, automated bidding, recommendation engine, and so on) -- even though they appear in new contexts such as IoT or machine to machine communication -- still rely on relatively old-fashioned techniques such as logistic regression, SVM, decision trees, K-NN, naive Bayes, Bayesian modeling, ensembles, random forests, signal processing, filtering, graph theory, game theory, and many others. Some are new, such as indexation algorithms to automate digital publishing, improve search engines, or create and manage large catalogs such as Amazon's product listing. As a result, many deep learning practitioners call themselves data scientists, computer scientists, statisticians, or sometimes engineers.

artificial intelligence, deep learning, machine learning, (3 more...)


Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.61)