AITopics | Giesen, Joachim

Collaborating Authors

Giesen, Joachim


Why Capsule Neural Networks Do Not Scale: Challenging the Dynamic Parse-Tree Assumption

Mitterreiter, Matthias, Koch, Marcel, Giesen, Joachim, Laue, Sören

arXiv.org Artificial Intelligence, Jan-4-2023

Capsule neural networks replace simple, scalar-valued neurons with vector-valued capsules. They are motivated by the pattern recognition system in the human brain, where complex objects are decomposed into a hierarchy of simpler object parts. Such a hierarchy is referred to as a parse-tree. Conceptually, capsule neural networks have been defined to realize such parse-trees. The capsule neural network (CapsNet), by Sabour, Frosst, and Hinton, is the first actual implementation of the conceptual idea of capsule neural networks. CapsNets achieved state-of-the-art performance on simple image recognition tasks with fewer parameters and greater robustness to affine transformations than comparable approaches. This sparked extensive follow-up research. However, despite major efforts, no work was able to scale the CapsNet architecture to more reasonably sized datasets. Here, we provide a reason for this failure and argue that it is most likely not possible to scale CapsNets beyond toy examples. In particular, we show that the concept of a parse-tree, the main idea behind capsule neural networks, is not present in CapsNets. We also show theoretically and experimentally that CapsNets suffer from a vanishing gradient problem that results in the starvation of many capsules during training.

artificial intelligence, dynamic parse-tree assumption, machine learning, (2 more...)

arXiv.org Artificial Intelligence

2301.01583

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)


GENO -- GENeric Optimization for Classical Machine Learning

Laue, Sören, Mitterreiter, Matthias, Giesen, Joachim

arXiv.org Machine Learning, May-31-2019

Although optimization is the longstanding algorithmic backbone of machine learning, new models still require the time-consuming implementation of new solvers. As a result, there are thousands of implementations of optimization algorithms for machine learning problems. A natural question is whether it is always necessary to implement a new solver, or whether there is one algorithm that is sufficient for most models. Common belief suggests that such a one-algorithm-fits-all approach cannot work, because this algorithm cannot exploit model-specific structure and thus cannot be efficient and robust on a wide variety of problems. Here, we challenge this common belief. We have designed and implemented the optimization framework GENO (GENeric Optimization) that combines a modeling language with a generic solver. GENO generates a solver from the declarative specification of an optimization problem class. The framework is flexible enough to encompass most of the classical machine learning problems. We show on a wide variety of classical but also some recently suggested problems that the automatically generated solvers are (1) as efficient as well-engineered specialized solvers, (2) more efficient by a decent margin than recent state-of-the-art solvers, and (3) orders of magnitude more efficient than classical modeling language plus solver approaches.

deep learning, neural network, solver, (22 more...)

arXiv.org Machine Learning

1905.13587

Country:

Europe (0.28)
North America > United States > Massachusetts > Middlesex County (0.14)

Genre: Research Report (0.84)

Industry: Education > Focused Education > Special Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)


Ising Models with Latent Conditional Gaussian Variables

Nussbaum, Frank, Giesen, Joachim

arXiv.org Machine Learning, Jan-28-2019

Ising models describe the joint probability distribution of a vector of binary feature variables. Typically, not all the variables interact with each other and one is interested in learning the presumably sparse network structure of the interacting variables. However, in the presence of latent variables, the conventional method of learning a sparse model might fail. This is because the latent variables induce indirect interactions of the observed variables. In the case of only a few latent conditional Gaussian variables these spurious interactions contribute an additional low-rank component to the interaction parameters of the observed Ising model. Therefore, we propose to learn a sparse + low-rank decomposition of the parameters of an Ising model using a convex regularized likelihood problem. We show that the same problem can be obtained as the dual of a maximum-entropy problem with a new type of relaxation, where the sample means collectively need to match the expected values only up to a given tolerance. The solution to the convex optimization problem has consistency properties in the high-dimensional setting, where the number of observed binary variables and the number of latent conditional Gaussian variables are allowed to grow with the number of training samples.
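As a rough illustration of the first sentence of the abstract, the sketch below enumerates the joint distribution of a tiny Ising model by brute force. The parameterization (binary values in {0, 1}, an upper-triangular parameter matrix `theta` whose diagonal holds the node terms) and the function name `ising_distribution` are assumptions made for this example; they are not taken from the paper, and the sketch does not implement the paper's sparse + low-rank estimator.

```python
import itertools
import math


def ising_distribution(theta):
    """Return {x: p(x)} for all binary vectors x, with p(x) proportional to
    exp(sum_{i<=j} theta[i][j] * x[i] * x[j]).

    Assumed convention: x[i] in {0, 1}, so x[i] * x[i] == x[i] and the
    diagonal entries theta[i][i] act as node (bias) parameters.
    Brute-force enumeration is only feasible for a handful of variables.
    """
    d = len(theta)
    weights = {}
    for x in itertools.product((0, 1), repeat=d):
        score = sum(
            theta[i][j] * x[i] * x[j] for i in range(d) for j in range(i, d)
        )
        weights[x] = math.exp(score)
    partition = sum(weights.values())  # normalizing constant
    return {x: w / partition for x, w in weights.items()}


# Three variables; only variables 0 and 1 interact (a sparse structure).
theta = [
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
p = ising_distribution(theta)
```

In this toy model the only nonzero parameter couples variables 0 and 1, so configurations in which both are 1 receive more probability mass than the others, while variable 2 stays independent of the rest.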

artificial intelligence, inequality follow, optimization problem, (18 more...)

arXiv.org Machine Learning

1901.09712

Country: Europe (0.14)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.48)


Computing Higher Order Derivatives of Matrix and Tensor Expressions

Laue, Soeren, Mitterreiter, Matthias, Giesen, Joachim

Neural Information Processing Systems, Dec-31-2018

Optimization is an integral part of most machine learning systems and most numerical optimization schemes rely on the computation of derivatives. Therefore, frameworks for computing derivatives are an active area of machine learning research. Surprisingly, as of yet, no existing framework is capable of computing higher order matrix and tensor derivatives directly. Here, we close this fundamental gap and present an algorithmic framework for computing matrix and tensor derivatives that extends seamlessly to higher order derivatives. The framework can be used for symbolic as well as for forward and reverse mode automatic differentiation. Experiments show a speedup of up to two orders of magnitude over state-of-the-art frameworks when evaluating higher order derivatives on CPUs and a speedup of about three orders of magnitude on GPUs.

artificial intelligence, expression, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > Canada (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)


Computing Higher Order Derivatives of Matrix and Tensor Expressions

Laue, Soeren, Mitterreiter, Matthias, Giesen, Joachim

Neural Information Processing Systems, Dec-31-2018

Optimization is an integral part of most machine learning systems and most numerical optimization schemes rely on the computation of derivatives. Therefore, frameworks for computing derivatives are an active area of machine learning research. Surprisingly, as of yet, no existing framework is capable of computing higher order matrix and tensor derivatives directly. Here, we close this fundamental gap and present an algorithmic framework for computing matrix and tensor derivatives that extends seamlessly to higher order derivatives. The framework can be used for symbolic as well as for forward and reverse mode automatic differentiation. Experiments show a speedup between one and four orders of magnitude over state-of-the-art frameworks when evaluating higher order derivatives.

deep learning, expression, neural network, (20 more...)

Neural Information Processing Systems

Country: North America > Canada (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)


Distributed Convex Optimization with Many Convex Constraints

Giesen, Joachim, Laue, Sören

arXiv.org Machine Learning, Oct-7-2016

We address the problem of solving convex optimization problems with many convex constraints in a distributed setting. Our approach is based on an extension of the alternating direction method of multipliers (ADMM), which has recently gained a lot of attention in the Big Data context. Although it was invented decades ago, ADMM so far can be applied only to unconstrained problems and problems with linear equality or inequality constraints. Our extension can handle arbitrary inequality constraints directly. It combines the ability of ADMM to solve convex optimization problems in a distributed setting with the ability of the Augmented Lagrangian method to solve constrained optimization problems, and as we show, it inherits the convergence guarantees of ADMM and the Augmented Lagrangian method.
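For readers unfamiliar with the baseline method, here is a minimal sketch of standard scaled-form ADMM on a toy problem with a linear equality constraint: minimize 0.5*||x - a||^2 + lam*||z||_1 subject to x = z. This is the textbook algorithm only, not the extension to arbitrary inequality constraints described in the abstract; the toy problem and the names `soft_threshold` and `admm_l1_denoise` are assumptions chosen for illustration.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.| for a scalar v (assumes t >= 0)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def admm_l1_denoise(a, lam, rho=1.0, iters=200):
    """Scaled-form ADMM for: min 0.5*||x - a||^2 + lam*||z||_1  s.t.  x = z.

    a    : list of floats (the observed vector)
    lam  : l1 regularization weight
    rho  : augmented Lagrangian penalty parameter
    iters: fixed number of iterations (no stopping test, for brevity)
    """
    n = len(a)
    x = [0.0] * n
    z = [0.0] * n
    u = [0.0] * n  # scaled dual variable for the constraint x = z
    for _ in range(iters):
        # x-update: closed-form minimizer of the smooth quadratic part.
        x = [(a[i] + rho * (z[i] - u[i])) / (1.0 + rho) for i in range(n)]
        # z-update: proximal step for the l1 term.
        z = [soft_threshold(x[i] + u[i], lam / rho) for i in range(n)]
        # Dual update: accumulate the constraint violation x - z.
        u = [u[i] + x[i] - z[i] for i in range(n)]
    return z


result = admm_l1_denoise([3.0, 0.5, -2.0], lam=1.0)
```

For this separable toy problem the exact minimizer is the soft-thresholded input, so the iterates can be checked against `soft_threshold(a_i, lam)` componentwise. In a distributed setting, the x-update would typically be split across workers, which this single-process sketch does not show.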

artificial intelligence, constraint, optimization problem, (17 more...)

arXiv.org Machine Learning

1610.02967

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)


Approximating Concavely Parameterized Optimization Problems

Giesen, Joachim, Mueller, Jens, Laue, Soeren, Swiercy, Sascha

Neural Information Processing Systems, Dec-31-2012

We consider an abstract class of optimization problems that are parameterized concavely in a single parameter, and show that the solution path along the parameter can always be approximated with accuracy $\varepsilon >0$ by a set of size $O(1/\sqrt{\varepsilon})$. A lower bound of size $\Omega (1/\sqrt{\varepsilon})$ shows that the upper bound is tight up to a constant factor. We also devise an algorithm that calls a step-size oracle and computes an approximate path of size $O(1/\sqrt{\varepsilon})$. Finally, we provide an implementation of the oracle for soft-margin support vector machines, and a parameterized semi-definite program for matrix completion.

artificial intelligence, complexity, optimization problem, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.70)


A Combinatorial Algorithm to Compute Regularization Paths

Gärtner, Bernd, Giesen, Joachim, Jaggi, Martin, Welsch, Torsten

arXiv.org Artificial Intelligence, Mar-27-2009

For a wide variety of regularization methods, algorithms computing the entire solution path have been developed recently. Solution path algorithms compute not only the solution for one particular value of the regularization parameter but the entire path of solutions, making the selection of an optimal parameter much easier. Most of the currently used algorithms are not robust in the sense that they cannot deal with general or degenerate input. Here we present a new robust, generic method for parametric quadratic programming. Our algorithm directly applies to nearly all machine learning applications, where so far every application required its own different algorithm. We illustrate the usefulness of our method by applying it to a very low-rank problem which could not be solved by existing path tracking methods, namely to compute part-worth values in choice-based conjoint analysis, a popular technique from market research to estimate consumers' preferences on a class of parameterized options.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

0903.4856

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.51)


Kernel Methods for Implicit Surface Modeling

Giesen, Joachim, Spalinger, Simon, Schölkopf, Bernhard

Neural Information Processing Systems, Dec-31-2005

We describe methods for computing an implicit model of a hypersurface that is given only by a finite sampling. The methods work by mapping the sample points into a reproducing kernel Hilbert space and then determining regions in terms of hyperplanes.

artificial intelligence, machine learning, sample point, (16 more...)

Neural Information Processing Systems

Country:

North America > United States (0.29)
Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.31)
