 Schraudolph, Nicol N.


A Quasi-Newton Approach to Nonsmooth Convex Optimization Problems in Machine Learning

arXiv.org Machine Learning

We extend the well-known BFGS quasi-Newton method and its memory-limited variant LBFGS to the optimization of nonsmooth convex objectives. This is done in a rigorous fashion by generalizing three components of BFGS to subdifferentials: the local quadratic model, the identification of a descent direction, and the Wolfe line search conditions. We prove that under some technical conditions, the resulting subBFGS algorithm is globally convergent in objective function value. We apply its memory-limited variant (subLBFGS) to L_2-regularized risk minimization with the binary hinge loss. To extend our algorithm to the multiclass and multilabel settings, we develop a new, efficient, exact line search algorithm. We prove its worst-case time complexity bounds, and show that our line search can also be used to extend a recently developed bundle method to the multiclass and multilabel settings. We also apply the direction-finding component of our algorithm to L_1-regularized risk minimization with logistic loss. In all these contexts our methods perform comparably to or better than specialized state-of-the-art solvers on a number of publicly available datasets. An open source implementation of our algorithms is freely available.
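
The nonsmooth building block here is a subgradient oracle for the hinge objective. As a hedged illustration (not the authors' subBFGS code, whose direction finding and line search are more involved), the following minimal NumPy sketch returns the objective value and one valid element of the subdifferential of the L_2-regularized binary hinge loss, picking the zero element at the hinge's kink:

```python
import numpy as np

def hinge_objective_subgradient(w, X, y, lam):
    """f(w) = lam/2 ||w||^2 + mean_i max(0, 1 - y_i <x_i, w>).

    Returns f(w) and one subgradient. Where a margin is exactly 1 the
    hinge is nonsmooth; we choose the zero subgradient there, which is
    one valid element of the subdifferential.
    """
    margins = 1.0 - y * (X @ w)              # per-example hinge arguments
    active = margins > 0.0                   # strictly violated margins
    value = 0.5 * lam * (w @ w) + np.mean(np.maximum(margins, 0.0))
    g = lam * w - (X[active].T @ y[active]) / len(y)
    return value, g
```

Any such oracle is what a nonsmooth quasi-Newton method consumes in place of the gradient.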


Efficient Exact Inference in Planar Ising Models

Neural Information Processing Systems

We present polynomial-time algorithms for the exact computation of lowest-energy states, worst margin violators, partition functions, and marginals in binary undirected graphical models. Our approach provides an interesting alternative to the well-known graph cut paradigm in that it does not impose any submodularity constraints; instead we require planarity to establish a correspondence with perfect matchings in an expanded dual graph. Maximum-margin parameter estimation for a boundary detection task shows our approach to be efficient and effective.


Efficient Exact Inference in Planar Ising Models

arXiv.org Machine Learning

We give polynomial-time algorithms for the exact computation of lowest-energy (ground) states, worst margin violators, log partition functions, and marginal edge probabilities in certain binary undirected graphical models. Our approach provides an interesting alternative to the well-known graph cut paradigm in that it does not impose any submodularity constraints; instead we require planarity to establish a correspondence with perfect matchings (dimer coverings) in an expanded dual graph. We implement a unified framework while delegating complex but well-understood subproblems (planar embedding, maximum-weight perfect matching) to established algorithms for which efficient implementations are freely available. Unlike graph cut methods, we can perform penalized maximum-likelihood as well as maximum-margin parameter estimation in the associated conditional random fields (CRFs), and employ marginal posterior probabilities as well as maximum a posteriori (MAP) states for prediction. Maximum-margin CRF parameter estimation on image denoising and segmentation problems shows our approach to be efficient and effective. A C++ implementation is available from http://nic.schraudolph.org/isinf/
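
The matching-based algorithm itself delegates planar embedding and maximum-weight perfect matching to established library code. Purely as a point of reference, the brute-force sketch below (exponential time, with an assumed energy convention E(x) = -Σ_{(i,j)} θ_ij x_i x_j) defines two of the quantities the paper computes in polynomial time on planar graphs:

```python
import itertools
import math

def brute_force_ising(edges, theta):
    """Exhaustive ground state and log partition function of a binary
    Ising model E(x) = -sum over edges (i,j) of theta[(i,j)] * x_i * x_j,
    with spins x_i in {-1, +1}. Exponential-time reference only.
    """
    n = 1 + max(max(e) for e in edges)
    best, z = None, 0.0
    for x in itertools.product((-1, 1), repeat=n):
        energy = -sum(theta[e] * x[e[0]] * x[e[1]] for e in edges)
        z += math.exp(-energy)
        if best is None or energy < best[0]:
            best = (energy, x)
    return best, math.log(z)

# a tiny planar example: a 4-cycle (2x2 grid) with attractive couplings
edges = [(0, 1), (1, 3), (3, 2), (2, 0)]
theta = {e: 0.5 for e in edges}
(ground_energy, ground_state), log_z = brute_force_ising(edges, theta)
```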



Fast Iterative Kernel PCA

Neural Information Processing Systems

We introduce two methods to improve convergence of the Kernel Hebbian Algorithm (KHA) for iterative kernel PCA. KHA has a scalar gain parameter which is either held constant or decreased as 1/t, leading to slow convergence. Our KHA/et algorithm accelerates KHA by incorporating the reciprocal of the current estimated eigenvalues as a gain vector. We then derive and apply Stochastic Meta-Descent (SMD) to KHA/et; this further speeds convergence by performing gain adaptation in RKHS. Experimental results for kernel PCA and spectral clustering of USPS digits as well as motion capture and image de-noising problems confirm that our methods converge substantially faster than conventional KHA.
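
For illustration, the gain idea can be sketched in the linear (non-kernel) setting: a Sanger/GHA update whose per-component gain is the reciprocal of a running eigenvalue estimate. This is a hypothetical analogue of KHA/et, not the kernelized implementation; the gain schedule, decay, and floor constants are placeholders:

```python
import numpy as np

def gha_reciprocal_gain_step(W, x, eig_est, eta0=0.1, decay=0.99, floor=1e-8):
    """One Sanger/GHA step with per-component gains ~ 1/eigenvalue.

    W:       (k, d) current estimate of the top-k principal directions.
    x:       (d,) input sample.
    eig_est: (k,) running eigenvalue estimates, tracked from activations.
    """
    y = W @ x                                        # component activations
    eig_est = decay * eig_est + (1.0 - decay) * y * y  # eigenvalue tracking
    gains = eta0 / np.maximum(eig_est, floor)        # reciprocal-eigenvalue gains
    # Sanger's rule: Hebbian term minus lower-triangular decorrelation
    dW = np.outer(y, x) - np.tril(np.outer(y, y)) @ W
    return W + gains[:, None] * dW, eig_est
```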



Online Independent Component Analysis with Local Learning Rate Adaptation

Neural Information Processing Systems

Stochastic meta-descent (SMD) is a new technique for online adaptation of local learning rates in arbitrary twice-differentiable systems. Like matrix momentum it uses full second-order information while retaining O(n) computational complexity by exploiting the efficient computation of Hessian-vector products. Here we apply SMD to independent component analysis, and employ the resulting algorithm for the blind separation of time-varying mixtures. By matching individual learning rates to the rate of change in each source signal's mixture coefficients, our technique is capable of simultaneously tracking sources that move at very different, a priori unknown speeds.
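
Schematically, one SMD step couples a gradient step using per-parameter rates p with a multiplicative update of those rates driven by a trace vector v. The sketch below substitutes a finite-difference approximation for the exact O(n) Hessian-vector product the paper exploits, and its constants (mu, lam) are illustrative only:

```python
import numpy as np

def smd_step(theta, p, v, grad_fn, mu=1e-2, lam=0.99, eps=1e-6):
    """One Stochastic Meta-Descent step (schematic sketch).

    grad_fn(theta) returns the (stochastic) gradient; for the Hessian-vector
    product approximation both calls should use the same minibatch. The
    paper computes H @ v exactly in O(n) instead of by finite differences.
    """
    g = grad_fn(theta)
    Hv = (grad_fn(theta + eps * v) - g) / eps       # approximate H @ v
    p = p * np.maximum(0.5, 1.0 - mu * g * v)       # multiplicative rate update
    theta = theta - p * g                           # gradient step, local rates
    v = lam * v - p * (g + lam * Hv)                # trace: d(theta)/d(log p)
    return theta, p, v
```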


Empirical Entropy Manipulation for Real-World Problems

Neural Information Processing Systems

No finite sample is sufficient to determine the density, and therefore the entropy, of a signal directly. Some assumption about either the functional form of the density or about its smoothness is necessary.
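
One standard way to encode such a smoothness assumption is a Parzen-window (kernel) density estimate, from which an empirical entropy follows. The sketch below is in that spirit rather than a reproduction of the paper's estimator; the Gaussian kernel and the width sigma are assumptions, and the data are taken to be one-dimensional:

```python
import numpy as np

def parzen_entropy(samples, sigma=0.25):
    """Empirical entropy H ~ -mean_i log p_hat(x_i), where p_hat is a
    leave-one-out Gaussian Parzen-window density estimate over 1-D samples.
    """
    x = np.asarray(samples, dtype=float)
    n = len(x)
    d2 = (x[:, None] - x[None, :]) ** 2                 # pairwise squared distances
    k = np.exp(-d2 / (2.0 * sigma**2)) / (sigma * np.sqrt(2.0 * np.pi))
    np.fill_diagonal(k, 0.0)                            # leave-one-out
    p_hat = k.sum(axis=1) / (n - 1)
    return -np.mean(np.log(p_hat + 1e-300))
```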


Tempering Backpropagation Networks: Not All Weights are Created Equal

Neural Information Processing Systems

Backpropagation learning algorithms typically collapse the network's structure into a single vector of weight parameters to be optimized. We suggest that their performance may be improved by utilizing the structural information instead of discarding it, and introduce a framework for "tempering" each weight accordingly. In the tempering model, activation and error signals are treated as approximately independent random variables. The characteristic scale of weight changes is then matched to that of the residuals, allowing structural properties such as a node's fan-in and fan-out to affect the local learning rate and backpropagated error. The model also permits calculation of an upper bound on the global learning rate for batch updates, which in turn leads to different update rules for bias vs. non-bias weights. This approach yields hitherto unparalleled performance on the family relations benchmark, a deep multi-layer network: for both batch learning with momentum and the delta-bar-delta algorithm, convergence at the optimal learning rate is sped up by more than an order of magnitude.
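
As a purely hypothetical illustration of letting structure set the scale, the helper below assigns per-weight learning rates that shrink with a node's fan-in while treating bias weights separately; the exact scaling and constants used in the paper differ:

```python
import numpy as np

def tempered_learning_rates(shapes, eta=0.3):
    """Hypothetical per-weight learning rates shaped by network structure,
    in the spirit of tempering: rates shrink with fan-in so that weight
    changes stay on the scale of the residuals, and bias weights (fan-in
    of one) get the unscaled base rate. Illustrative constants only.
    """
    rates = {}
    for name, (fan_out, fan_in) in shapes.items():
        rates[name] = np.full((fan_out, fan_in), eta / fan_in)  # weight matrix
        rates[name + "_bias"] = np.full(fan_out, eta)           # bias vector
    return rates

# e.g. two weight layers of a small network for the family relations task
rates = tempered_learning_rates({"hidden": (12, 6), "output": (6, 12)})
```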