AITopics

We present an algorithm that induces a class of models with thin junction trees--models that are characterized by an upper bound on the size of the maximal cliques of their triangulated graph. By ensuring that the junction tree is thin, inference in our models remains tractable throughout the learning process. This allows both an efficient implementation of an iterative scaling parameter estimation algorithm and also ensures that inference can be performed efficiently with the final model. We illustrate the approach with applications in handwritten digit recognition and DNA splice site detection.

algorithm, junction tree, thin junction tree, (14 more...)

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Asia > Middle East > Jordan (0.05)
North America > United States > New York (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.95)

Andrieu, Christophe, Freitas, Nando D., Doucet, Arnaud

Rao-Blackwellised Particle Filtering via Data Augmentation

SMC is often referred to as particle filtering (PF) in the context of computing filtering distributions for statistical inference and learning. It is known that the performance of PF often deteriorates in high-dimensional state spaces. In the past, we have shown that if a model admits partial analytical tractability, it is possible to combine PF with exact algorithms (Kalman filters, HMM filters, junction tree algorithm) to obtain efficient high dimensional filters (Doucet, de Freitas, Murphy and Russell 2000, Doucet, Godsill and Andrieu 2000). In particular, we exploited a marginalisation technique known as Rao-Blackwellisation (RB). Here, we attack a more complex model that does not admit immediate analytical tractability.

algorithm, doucet, particle, (13 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > New York (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)

Generalization Performance of Some Learning Problems in Hilbert Functional Spaces

Zhang, T.

We investigate the generalization performance of some learning problems in Hilbert functional Spaces. We introduce a notion of convergence of the estimated functional predictor to the best underlying predictor, and obtain an estimate on the rate of the convergence. This estimate allows us to derive generalization bounds on some learning formulations.

inequality, probability, theorem 3, (14 more...)

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Education > Focused Education > Special Education (0.61)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

Fast Parameter Estimation Using Green's Functions

Wong, K., Li, F.

It is well known that correct choices of hyperparameters in classification and regression tasks can optimize the complexity of the data model, and hence achieve the best generalization [1]. In recent years various methods have been proposed to estimate the optimal hyperparameters in different contexts, such as neural networks [2], support vector machines [3, 4, 5] and Gaussian processes [5]. Most of these methods are inspired by the technique of cross-validation or its variant, leave-one-out validation. While the leave-one-out procedure gives an almost unbiased estimate of the generalization error, it is nevertheless very tedious. Many of the mentioned attempts aimed at approximating this tedious procedure without really having to sweat through it.

cavity method, generalization error, green, (13 more...)

Country:

Asia > China > Hong Kong (0.05)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.55)

Tanaka, Toshiyuki, Ikeda, Shiro, Amari, Shun-ichi

Information-Geometrical Significance of Sparsity in Gallager Codes

We report a result of perturbation analysis on decoding error of the belief propagation decoder for Gallager codes. The analysis is based on information geometry, and it shows that the principal term of decoding error at equilibrium comes from the m-embedding curvature of the log-linear submanifold spanned by the estimated pseudoposteriors, one for the full marginal, and K for partial posteriors, each of which takes a single check into account, where K is the number of checks in the Gallager code. It is then shown that the principal error term vanishes when the parity-check matrix of the code is so sparse that there are no two columns with overlap greater than 1.

belief propagation decoder, gallager code, parity-check matrix, (9 more...)

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
Asia > Middle East > Jordan (0.04)
Asia > Japan > Kyūshū & Okinawa > Kyūshū > Fukuoka Prefecture > Fukuoka (0.04)
Asia > Japan > Honshū > Kantō > Saitama Prefecture > Saitama (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.70)

Gaussian Process Regression with Mismatched Models

Sollich, Peter

The behaviour is much richer than for the matched case, and could guide the choice of (student) priors in real-world applications of GP regression; RBF students, for example, run the risk of very slow logarithmic decay of the learning curve if the target (teacher) is less smooth than assumed. An important issue for future work-some of which is in progress-is to analyse to which extent hyperparameter tuning (e.g. via evidence maximization) can make GP learning robust against some forms of model mismatch, e.g. a misspecified functional form of the covariance function. One would like to know, for example, whether a data-dependent adjustment of the lengthscale of an RBF covariance function would be sufficient to avoid the logarithmically slow learning of rough target functions.

covariance function, eigenvalue, student, (13 more...)

Country: Asia > Middle East > Jordan (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Shawe-Taylor, John, Cristianini, Nello, Kandola, Jaz S.

On the Concentration of Spectral Properties

Note that lEs is the expectation operator under the selection of the sample.

eigenvalue, eigenvector, subspace, (16 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Information Management > Search (0.94)

Computing Time Lower Bounds for Recurrent Sigmoidal Neural Networks

Schmitt, M.

Recurrent neural networks of analog units are computers for realvalued functions. We study the time complexity of real computation in general recurrent neural networks. These have sigmoidal, linear, and product units of unlimited order as nodes and no restrictions on the weights. For networks operating in discrete time, we exhibit a family of functions with arbitrarily high complexity, and we derive almost tight bounds on the time required to compute these functions. Thus, evidence is given of the computational limitations that time-bounded analog recurrent neural networks are subject to. 1 Introduction Analog recurrent neural networks are known to have computational capabilities that exceed those of classical Turing machines (see, e.g., Siegelmann and Sontag, 1995; Kilian and Siegelmann, 1996; Siegelmann, 1999).

neural network, node, recurrent neural network, (14 more...)

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Rattray, Magnus, Basalyga, Gleb

Scaling Laws and Local Minima in Hebbian ICA

We study the dynamics of a Hebbian ICA algorithm extracting a single non-Gaussian component from a high-dimensional Gaussian background.

algorithm, ica algorithm, initial condition, (13 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Rätsch, Gunnar, Mika, Sebastian, Warmuth, Manfred K. K.

On the Convergence of Leveraging

We give an unified convergence analysis of ensemble learning methods including e.g. AdaBoost, Logistic Regression and the Least-Square- Boost algorithm for regression. These methods have in common that they iteratively call a base learning algorithm which returns hypotheses that are then linearly combined. We show that these methods are related to the Gauss-Southwell method known from numerical optimization and state non-asymptotical convergence results for all these methods. Our analysis includes -norm regularized cost functions leading to a clean and general way to regularize ensemble learning.

algorithm, hypothesis, loss function, (11 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)
North America > United States > New York > New York County > New York City (0.04)
(5 more...)

Genre: Research Report > New Finding (0.37)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.37)