AITopics

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.15)

Genre: Research Report > New Finding (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.35)

Bousquet, Olivier, Herrmann, Daniel

On the Complexity of Learning the Kernel Matrix

We investigate data-based procedures for selecting the kernel when learning with Support Vector Machines. We provide generalization error bounds by estimating the Rademacher complexities of the corresponding function classes. In particular, we obtain a complexity bound for function classes induced by kernels with given eigenvectors, i.e., we allow the spectrum to vary and keep the eigenvectors fixed. This bound is only a logarithmic factor bigger than the complexity of the function class induced by a single kernel. However, optimizing the margin over such classes leads to overfitting. We thus propose a suitable way of constraining the class. We use an efficient algorithm to solve the resulting optimization problem, present preliminary experimental results, and compare them to an alignment-based approach.
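As a rough illustration of "kernels with given eigenvectors", the class can be written as follows. This is a sketch in notation chosen here, not taken from the abstract: the symbols $v_i$, $\lambda_i$, and $\mathcal{K}$ are assumptions, and any normalization of the spectrum (for example a trace constraint) is left unspecified because the abstract does not state one.

```latex
% Fixed orthonormal eigenvectors v_1, ..., v_n; only the spectrum varies.
\mathcal{K} = \Bigl\{ K_\lambda = \sum_{i=1}^{n} \lambda_i \, v_i v_i^{\top}
  \;:\; \lambda_i \ge 0 \Bigr\}
```

Each choice of nonnegative $\lambda = (\lambda_1, \dots, \lambda_n)$ gives a positive semidefinite kernel matrix, so selecting the kernel reduces to selecting the spectrum.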

artificial intelligence, machine learning, optimization problem, (18 more...)

Country: Europe > Germany (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.49)

Watanabe, Sumio, Amari, Shun-ichi

The Effect of Singularities in a Learning Machine when the True Parameters Do Not Lie on such Singularities

Many learning machines with hidden variables used in information science have singularities in their parameter spaces. At singularities, the Fisher information matrix becomes degenerate, so the learning theory of regular statistical models does not hold. Recently, it was proven that, if the true parameter is contained in the singularities, then the coefficient of the Bayes generalization error is equal to the pole of the zeta function of the Kullback information.

artificial intelligence, machine learning, singularity, (14 more...)

Country: Asia > Japan > Honshū > Kantō (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Scott, Clayton, Nowak, Robert

Dyadic Classification Trees via Structural Risk Minimization

Classification trees are one of the most popular types of classifiers, with ease of implementation and interpretation being among their attractive features. Despite the widespread use of classification trees, theoretical analysis of their performance is scarce. In this paper, we show that a new family of classification trees, called dyadic classification trees (DCTs), is near optimal (in a minimax sense) for a very broad range of classification problems. This demonstrates that other schemes (e.g., neural networks, support vector machines) cannot perform significantly better than DCTs in many cases. We also show that this near optimal performance is attained with growing and pruning algorithms whose complexity is linear in the number of training data. Moreover, the performance of DCTs on benchmark datasets compares favorably to that of standard CART, which is generally more computationally intensive and which does not possess similar near optimality properties. Our analysis stems from theoretical results on structural risk minimization, on which the pruning rule for DCTs is based.

artificial intelligence, classification tree, machine learning, (15 more...)

Country:

North America > United States (0.47)
Europe > United Kingdom > England (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.31)

Ortiz, Luis E., McAllester, David A.

Concentration Inequalities for the Missing Mass and for Histogram Rule Error

This paper gives distribution-free concentration inequalities for the missing mass and the error rate of histogram rules. Negative association methods can be used to reduce these concentration problems to concentration questions about independent sums. Although the sums are independent, they are highly heterogeneous. Such highly heterogeneous independent sums cannot be analyzed using standard concentration inequalities such as Hoeffding's inequality, the Angluin-Valiant bound, Bernstein's inequality, Bennett's inequality, or McDiarmid's theorem.
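For readers unfamiliar with the term, the missing mass of a sample is the total probability of the outcomes that never appear in it. The sketch below computes it for a toy distribution and compares it with the Good-Turing estimate (the fraction of draws whose outcome occurs exactly once). The distribution, the sample, and the function names are hypothetical, and the Good-Turing estimator is a standard companion added here for illustration; it is not mentioned in the abstract.

```python
from collections import Counter


def missing_mass(probs, sample):
    """Total probability of outcomes that never appear in the sample."""
    seen = set(sample)
    return sum(p for outcome, p in probs.items() if outcome not in seen)


def good_turing_estimate(sample):
    """Good-Turing estimate: share of draws whose outcome occurs exactly once."""
    counts = Counter(sample)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(sample)


# Hypothetical toy distribution and sample, chosen only for illustration.
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
sample = ["a", "b", "a", "a", "b", "c"]

true_mm = missing_mass(probs, sample)  # only "d" is unseen
gt_mm = good_turing_estimate(sample)   # only "c" occurs exactly once
```

The true missing mass needs the underlying distribution, which is unknown in practice; the estimate uses the sample alone, which is why concentration results for the missing mass are of interest.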

artificial intelligence, inequality, machine learning, (17 more...)

Country: North America > United States (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.88)

Stable Fixed Points of Loopy Belief Propagation Are Local Minima of the Bethe Free Energy

Heskes, Tom

We extend recent work on the connection between loopy belief propagation and the Bethe free energy. Constrained minimization of the Bethe free energy can be turned into an unconstrained saddle-point problem. Both converging double-loop algorithms and standard loopy belief propagation can be interpreted as attempts to solve this saddle-point problem. Stability analysis then leads us to conclude that stable fixed points of loopy belief propagation must be (local) minima of the Bethe free energy. Perhaps surprisingly, the converse need not be the case: minima can be unstable fixed points. We illustrate this with an example and discuss implications.

algorithm, artificial intelligence, belief revision, (14 more...)

Country: Europe > Netherlands (0.14)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (1.00)

Slonim, Noam, Weiss, Yair

Maximum Likelihood and the Information Bottleneck

The information bottleneck (IB) method is an information-theoretic formulation for clustering problems.
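For context, the IB objective is usually stated as the trade-off below. This is the standard formulation from the IB literature, not text from this abstract; the symbols $X$ (data), $Y$ (relevant variable), $T$ (cluster assignment), and $\beta$ (trade-off parameter) are notation assumed here.

```latex
% Compress X into T while preserving information about Y.
\min_{p(t \mid x)} \; I(X;T) - \beta \, I(T;Y), \qquad \beta > 0
```

Larger $\beta$ favors clusters that retain more information about $Y$; smaller $\beta$ favors stronger compression of $X$.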

artificial intelligence, bayesian inference, machine learning, (14 more...)

Country: Asia > Middle East (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.41)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.41)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)

Malzahn, Dörthe, Opper, Manfred

A Statistical Mechanics Approach to Approximate Analytical Bootstrap Averages

We apply the replica method of Statistical Physics combined with a variational method to the approximate analytical computation of bootstrap averages for estimating the generalization error. We demonstrate our approach on regression with Gaussian processes and compare our results with averages obtained by Monte-Carlo sampling.

approximation, artificial intelligence, machine learning, (15 more...)

Country:

Europe > United Kingdom (0.28)
Europe > Denmark > Capital Region (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Data-Dependent Bounds for Bayesian Mixture Methods

Meir, Ron, Zhang, Tong

We consider Bayesian mixture approaches, where a predictor is constructed by forming a weighted average of hypotheses from some space of functions. While such procedures are known to lead to optimal predictors in several cases, where sufficiently accurate prior information is available, it has not been clear how they perform when some of the prior assumptions are violated. In this paper we establish data-dependent bounds for such procedures, extending previous randomized approaches such as the Gibbs algorithm to a fully Bayesian setting. The finite-sample guarantees established in this work enable the utilization of Bayesian mixture approaches in agnostic settings, where the usual assumptions of the Bayesian paradigm fail to hold. Moreover, the bounds derived can be directly applied to non-Bayesian mixture approaches such as Bagging and Boosting.
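The difference between a mixture predictor and the Gibbs algorithm can be sketched in a few lines. Everything specific below is an assumption made for illustration: the three hypotheses, the uniform prior, the squared loss, and the exponential weighting with parameter `eta` are not taken from the abstract, which does not say how the weights are formed.

```python
import math
import random


def posterior_weights(hypotheses, prior, data, eta=1.0):
    """Weights proportional to prior * exp(-eta * cumulative squared loss)."""
    scores = []
    for h, p in zip(hypotheses, prior):
        loss = sum((h(x) - y) ** 2 for x, y in data)
        scores.append(p * math.exp(-eta * loss))
    total = sum(scores)
    return [s / total for s in scores]


def mixture_predict(hypotheses, weights, x):
    """Mixture predictor: weighted average of all hypotheses' predictions."""
    return sum(w * h(x) for h, w in zip(hypotheses, weights))


def gibbs_predict(hypotheses, weights, x, rng):
    """Gibbs predictor: draw one hypothesis from the weights and use it alone."""
    h = rng.choices(hypotheses, weights=weights, k=1)[0]
    return h(x)


# Hypothetical hypothesis space and data, for illustration only.
hypotheses = [lambda x: 0.0, lambda x: x, lambda x: 2.0 * x]
prior = [1 / 3, 1 / 3, 1 / 3]
data = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2)]

weights = posterior_weights(hypotheses, prior, data)
mix = mixture_predict(hypotheses, weights, 2.0)
gibbs = gibbs_predict(hypotheses, weights, 2.0, random.Random(0))
```

The mixture prediction is deterministic given the weights, whereas the Gibbs prediction depends on which single hypothesis is drawn.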

artificial intelligence, bayesian inference, machine learning, (17 more...)

Country: North America > United States (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.95)

Sahani, Maneesh, Linden, Jennifer F.

Evidence Optimization Techniques for Estimating Stimulus-Response Functions

An essential step in understanding the function of sensory nervous systems is to characterize as accurately as possible the stimulus-response function (SRF) of the neurons that relay and process sensory information. One increasingly common experimental approach is to present a rapidly varying complex stimulus to the animal while recording the responses of one or more neurons, and then to directly estimate a functional transformation of the input that accounts for the neuronal firing. The estimation techniques usually employed, such as Wiener filtering or other correlation-based estimation of the Wiener or Volterra kernels, are equivalent to maximum likelihood estimation in a Gaussian-output-noise regression model. We explore the use of Bayesian evidence-optimization techniques to condition these estimates. We show that by learning hyperparameters that control the smoothness and sparsity of the transfer function it is possible to improve dramatically the quality of SRF estimates, as measured by their success in predicting responses to novel input.