AITopics

2010.0125

Country:

Europe > Sweden > Stockholm > Stockholm (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.70)

Industry:

Information Technology > Security & Privacy (0.72)
Transportation > Air (0.63)
Government > Military (0.62)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Kechaou, Marwa, Hérault, Romain, Alaya, Mokhtar Z., Gasso, Gilles

Open Set Domain Adaptation using Optimal Transport

arXiv.org Machine Learning, Oct-2-2020

We present a two-step optimal transport approach that maps a source distribution to a target distribution. Here, the target domain contains new classes that are not present in the source domain. The first step of the approach aims at rejecting the samples drawn from these new classes using an optimal transport plan. The second step addresses the target (class ratio) shift, again as an optimal transport problem. We develop a dual approach to solve the optimization problem involved at each step and we prove that our results outperform recent state-of-the-art performances. We further apply the approach to the setting where the source and target distributions present both a label shift and an increasing covariate (feature) shift to show its robustness.

artificial intelligence, machine learning, rej, (13 more...)

2010.01045

Country:

Europe > France > Normandy > Seine-Maritime > Rouen (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia (0.04)

Genre:

Workflow (0.86)
Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)

Zhang, Guodong, Bao, Xuchan, Lessard, Laurent, Grosse, Roger

A Unified Analysis of First-Order Methods for Smooth Games via Integral Quadratic Constraints

arXiv.org Machine Learning, Oct-2-2020

The theory of integral quadratic constraints (IQCs) allows the certification of exponential convergence of interconnected systems containing nonlinear or uncertain elements. In this work, we adapt the IQC theory to study first-order methods for smooth and strongly-monotone games and show how to design tailored quadratic constraints to get tight upper bounds of convergence rates. Using this framework, we recover the existing bound for the gradient method (GD), derive sharper bounds for the proximal point method (PPM) and optimistic gradient method (OG), and provide for the first time a global convergence rate for the negative momentum method (NM) with an iteration complexity $\mathcal{O}(\kappa^{1.5})$, which matches its known lower bound. In addition, for time-varying systems, we prove that the gradient method with optimal step size achieves the fastest provable worst-case convergence rate with quadratic Lyapunov functions. Finally, we further extend our analysis to stochastic games and study the impact of multiplicative noise on different algorithms. We show that it is impossible for an algorithm with one step of memory to achieve acceleration if it only queries the gradient once per batch (in contrast with the stochastic strongly-convex optimization setting, where such acceleration has been demonstrated). However, we exhibit an algorithm which achieves acceleration with two gradient queries per batch.
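
As a rough illustration of two of the methods named in this abstract, the sketch below runs the gradient method and the optimistic gradient method on a toy strongly-monotone game. The game f(x, y) = (MU/2)x^2 + xy - (MU/2)y^2, the step size, the iteration count, and all function names are assumptions chosen for illustration; the optimistic update is written in its commonly used form and may differ in detail from the paper's formulation.

```python
import numpy as np

# Toy game (an assumption, not taken from the paper):
#   min over x, max over y of f(x, y) = (MU/2) x^2 + x y - (MU/2) y^2
# Its vector field is F(z) = (df/dx, -df/dy) = A z, with unique equilibrium z = 0.
MU = 1.0
A = np.array([[MU, 1.0], [-1.0, MU]])

def field(z):
    """Game vector field F(z) for the toy quadratic game."""
    return A @ z

def gradient_method(z0, eta, steps):
    """Simultaneous gradient descent-ascent: z <- z - eta * F(z)."""
    z = np.array(z0, dtype=float)
    for _ in range(steps):
        z = z - eta * field(z)
    return z

def optimistic_gradient(z0, eta, steps):
    """Optimistic gradient: z <- z - 2*eta*F(z) + eta*F(z_prev)."""
    z = np.array(z0, dtype=float)
    prev_f = field(z)  # convention: use F(z_0) in place of the missing F(z_{-1})
    for _ in range(steps):
        f = field(z)
        z = z - 2.0 * eta * f + eta * prev_f
        prev_f = f
    return z

z_gd = gradient_method([1.0, 1.0], eta=0.1, steps=300)
z_og = optimistic_gradient([1.0, 1.0], eta=0.1, steps=300)
print(np.linalg.norm(z_gd), np.linalg.norm(z_og))
```

With these illustrative settings both iterations contract toward the equilibrium at the origin; the sketch makes no attempt to reproduce the convergence bounds discussed in the abstract.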

algorithm, artificial intelligence, machine learning, (16 more...)

2009.11359

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)

Genre: Research Report (0.63)

Industry: Leisure & Entertainment (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Steinhoff, Vera, Kerschke, Pascal, Aspar, Pelin, Trautmann, Heike, Grimme, Christian

Multiobjectivization of Local Search: Single-Objective Optimization Benefits From Multi-Objective Gradient Descent

arXiv.org Artificial Intelligence, Oct-2-2020

Optimization is essentially everywhere and most real-world problems are of nonlinear and multimodal nature, i.e., there may exist multiple local optima that become traps for local search [23]. That is, classical local search based on gradient descent will get stuck in local optima unless restart mechanisms or search space exploration methods prevent premature convergence. Much effort has been put into this issue. Early attempts tried to make local search more flexible, e.g., by adding search points or spanning simplex structures, to discover patterns in search space and allow non-derivative descent to the optimum [20]. However, local search cannot solve these problems in general. Thus, later approaches [1] combine originally one-dimensional global search mechanisms like the STEP global search [30] and a local interpolation technique proposed by Brent [3] for the multivariate case. Others combine established stochastic global search mechanisms based on clustering [24] with newer elements of global optimizers [29] to gain quality improvements of solutions and to avoid finding only local optima [22].

artificial intelligence, machine learning, so-mogsa, (15 more...)

2010.01004

Country:

Europe > Germany > North Rhine-Westphalia > Münster Region > Münster (0.05)
North America > United States (0.04)
Europe > Netherlands > South Holland > Leiden (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.62)

arXiv.org Artificial Intelligence, Oct-2-2020

BOSS: Bayesian Optimization over String Spaces

Moss, Henry B., Beck, Daniel, Gonzalez, Javier, Leslie, David S., Rayson, Paul

This article develops a Bayesian optimization (BO) method which acts directly over raw strings, proposing the first uses of string kernels and genetic algorithms within BO loops. Recent applications of BO over strings have been hindered by the need to map inputs into a smooth and unconstrained latent space. Learning this projection is computationally and data-intensive. Our approach instead builds a powerful Gaussian process surrogate model based on string kernels, naturally supporting variable length inputs, and performs efficient acquisition function maximization for spaces with syntactical constraints. Experiments demonstrate considerably improved optimization over existing approaches across a broad range of constraints, including the popular setting where syntax is governed by a context-free grammar.

evolutionary algorithm, machine learning, natural language, (20 more...)

2010.00979

Country:

Europe > Austria > Vienna (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > California (0.04)

Genre: Research Report (0.40)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

Chami, Ines, Gu, Albert, Chatziafratis, Vaggos, Ré, Christopher

From Trees to Continuous Embeddings and Back: Hyperbolic Hierarchical Clustering

arXiv.org Machine Learning, Oct-1-2020

Similarity-based Hierarchical Clustering (HC) is a classical unsupervised machine learning problem that has traditionally been solved with heuristic algorithms like Average-Linkage. Recently, Dasgupta reframed HC as a discrete optimization problem by introducing a global cost function measuring the quality of a given tree. In this work, we provide the first continuous relaxation of Dasgupta's discrete optimization problem with provable quality guarantees. The key idea of our method, HypHC, is showing a direct correspondence from discrete trees to continuous representations (via the hyperbolic embeddings of their leaf nodes) and back (via a decoding algorithm that maps leaf embeddings to a dendrogram), allowing us to search the space of discrete binary trees with continuous optimization. Building on analogies between trees and hyperbolic space, we derive a continuous analogue for the notion of lowest common ancestor, which leads to a continuous relaxation of Dasgupta's discrete objective. We can show that after decoding, the global minimizer of our continuous relaxation yields a discrete tree with a (1 + epsilon)-factor approximation for Dasgupta's optimal tree, where epsilon can be made arbitrarily small and controls optimization challenges. We experimentally evaluate HypHC on a variety of HC benchmarks and find that even approximate solutions found with gradient descent yield better clustering quality than agglomerative heuristics or other gradient-based algorithms. Finally, we highlight the flexibility of HypHC using end-to-end training in a downstream classification task.
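
For readers unfamiliar with the cost function mentioned above, here is a minimal sketch of Dasgupta's discrete cost, which sums, over every pair of leaves, their similarity times the number of leaves under their lowest common ancestor. The nested-tuple tree encoding, the `frozenset`-keyed similarity dictionary, and the toy similarity values are assumptions for illustration; this is the discrete objective only, not the HypHC relaxation.

```python
def leaves(tree):
    """Return the leaf labels of a binary tree encoded as nested 2-tuples."""
    if isinstance(tree, tuple):
        return leaves(tree[0]) + leaves(tree[1])
    return [tree]

def dasgupta_cost(tree, sim):
    """Sum over leaf pairs of sim(i, j) * (number of leaves under lca(i, j))."""
    if not isinstance(tree, tuple):
        return 0  # a single leaf contributes no pairs
    left, right = leaves(tree[0]), leaves(tree[1])
    size = len(left) + len(right)
    # Pairs separated at this node have this node as their lowest common ancestor.
    split = sum(sim[frozenset((i, j))] for i in left for j in right)
    return size * split + dasgupta_cost(tree[0], sim) + dasgupta_cost(tree[1], sim)

# Toy similarities (hypothetical): items 0 and 1 are much more similar to each other.
sim = {frozenset((0, 1)): 3, frozenset((0, 2)): 1, frozenset((1, 2)): 1}
good_tree = ((0, 1), 2)  # merges the similar pair first
bad_tree = ((0, 2), 1)   # separates the similar pair at the root
print(dasgupta_cost(good_tree, sim), dasgupta_cost(bad_tree, sim))
```

Lower cost is better: the tree that merges the most similar pair first keeps that pair under a small subtree and is therefore cheaper.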

artificial intelligence, machine learning, optimization problem, (16 more...)

2010.00402

Country:

Asia > Afghanistan > Parwan Province > Charikar (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Tyne and Wear > Sunderland (0.04)

Genre: Research Report (0.82)

Industry:

Information Technology (0.92)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Laguel, Yassine, Malick, Jérôme, Harchaoui, Zaid

First-order Optimization for Superquantile-based Supervised Learning

arXiv.org Machine Learning, Oct-1-2020

Classical supervised learning via empirical risk (or negative log-likelihood) minimization hinges upon the assumption that the testing distribution coincides with the training distribution. This assumption can be challenged in modern applications of machine learning in which learning machines may operate at prediction time on testing data whose distribution departs from that of the training data. We revisit the superquantile regression method by proposing a first-order optimization algorithm to minimize a superquantile-based learning objective. The proposed algorithm is based on smoothing the superquantile function by infimal convolution. Promising numerical results illustrate the value of the approach for safer supervised learning.
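
To make the central quantity concrete, the sketch below computes the empirical superquantile (also known as conditional value-at-risk) of a set of losses using the Rockafellar-Uryasev variational formula. The function name and the toy losses are assumptions for illustration; this is the plain nonsmooth quantity, not the smoothed objective or the first-order algorithm the abstract describes.

```python
import numpy as np

def superquantile(losses, p):
    """Empirical superquantile at level p in [0, 1).

    Uses the Rockafellar-Uryasev formula:
        min over eta of  eta + mean(max(L - eta, 0)) / (1 - p)
    """
    losses = np.asarray(losses, dtype=float)
    # The objective is convex and piecewise linear in eta with kinks at the
    # sample values, so it is enough to evaluate it at each sample value.
    candidates = losses[:, None]
    excess = np.maximum(losses[None, :] - candidates, 0.0).mean(axis=1)
    return float(np.min(losses + excess / (1.0 - p)))

toy_losses = [1.0, 2.0, 3.0, 4.0]
print(superquantile(toy_losses, 0.5))  # average of the worst half of the losses
print(superquantile(toy_losses, 0.0))  # reduces to the ordinary mean
```

At p = 0 the superquantile equals the empirical risk, and as p grows it focuses on the worst-performing fraction of the samples, which is why it is attractive when the test distribution may differ from the training one.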

artificial intelligence, inductive learning, machine learning, (15 more...)

2009.14575

Country:

Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Arizona (0.04)
Europe > Finland (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

arXiv.org Machine Learning, Oct-1-2020

Apollo: An Adaptive Parameter-wise Diagonal Quasi-Newton Method for Nonconvex Stochastic Optimization

Ma, Xuezhe

Importantly, the update and storage of the diagonal approximation of the Hessian is as efficient as adaptive first-order optimization methods, with linear complexity in both time and memory. To handle nonconvexity, we replace the Hessian with its rectified absolute value, which is guaranteed to be positive-definite. Nonconvex stochastic optimization is of core practical importance in many fields of machine learning, in particular for training deep neural networks (DNNs). First-order gradient-based optimization algorithms, conceptually attractive due to their linear efficiency in both time and memory complexity, have led to tremendous progress and impressive successes. However, one disadvantage of stochastic gradient descent (SGD) is that the gradients in different directions are scaled uniformly, resulting in limited convergence speed and sensitivity to the choice of learning rate; this has spawned a lot of recent interest in accelerating SGD from the algorithmic and practical perspectives. Recently, many adaptive first-order optimization methods have been proposed to achieve rapid training progress with element-wise scaled learning rates, and we can only mention a few here due to space limits. In their pioneering work, Duchi et al. (2011) proposed AdaGrad, which scales the gradient by the inverse square root of the accumulated squared gradients from the first iteration.
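
The AdaGrad rule described in the last sentence can be sketched as follows. This is a minimal illustration of element-wise scaling only, not the Apollo method; the function name, the learning rate, the small constant `eps`, and the toy quadratic are all assumptions.

```python
import numpy as np

def adagrad(grad_fn, x0, lr=0.5, eps=1e-8, steps=200):
    """Minimal AdaGrad: divide each gradient coordinate by the square root
    of the sum of its squared values accumulated since the first iteration."""
    x = np.asarray(x0, dtype=float).copy()
    accum = np.zeros_like(x)  # per-coordinate running sum of squared gradients
    for _ in range(steps):
        g = grad_fn(x)
        accum += g * g
        x -= lr * g / (np.sqrt(accum) + eps)  # eps guards against division by zero
    return x

# Toy quadratic f(x) = 0.5 * sum(scales * x**2) with very different curvature
# in each coordinate (illustrative values only).
scales = np.array([100.0, 1.0])
x_final = adagrad(lambda x: scales * x, x0=[1.0, 1.0])
print(x_final)
```

Because each coordinate is normalized by its own gradient history, both coordinates of this toy problem shrink at essentially the same pace despite the hundredfold difference in curvature, which is the contrast with uniformly scaled SGD that the abstract draws.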

artificial intelligence, machine learning, pollo, (17 more...)

2009.13586

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Europe > Germany > Berlin (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.50)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Kenekayoro, Patrick, Fawei, Biralatei

Meta-Heuristic Solutions to a Student Grouping Optimization Problem faced in Higher Education Institutions

arXiv.org Artificial Intelligence, Oct-1-2020

Combinatorial problems that have been proven to be NP-hard arise in Higher Education Institutions, and researchers have extensively investigated some of the well-known ones, such as the timetabling and student project allocation problems. However, the NP-hard problems faced in Higher Education Institutions are not confined to these categories. The majority of NP-hard problems faced in institutions involve grouping students and/or resources, albeit with each problem having its own unique set of constraints. Thus, it can be argued that techniques to solve NP-hard problems in Higher Education Institutions can be transferred across the different problem categories. As no method is guaranteed to outperform all others in all problems, it is necessary to investigate heuristic techniques for solving lesser-known problems in order to guide stakeholders or software developers to the most appropriate algorithm for each unique class of NP-hard problems faced in Higher Education Institutions. To this end, this study describes an optimization problem faced in a real university that involved grouping students for the presentation of semester results. Ordering-based heuristics, a genetic algorithm, and the ant colony optimization algorithm, implemented in the Python programming language, were used to find feasible solutions to this problem, with the ant colony optimization algorithm performing better or equal in 75% of the test instances and the genetic algorithm producing better or equal results in 38% of the test instances.

artificial intelligence, evolutionary algorithm, machine learning, (13 more...)

2010.00499

Country:

Atlantic Ocean > South Atlantic Ocean > Gulf of Guinea > Niger Delta (0.04)
Africa > Nigeria > Niger Delta (0.04)
Africa > Nigeria > Bayelsa State (0.04)

Genre:

Instructional Material > Course Syllabus & Notes (1.00)
Research Report (0.84)

Industry: Education > Educational Setting > Higher Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Mastalli, Carlos, Merkt, Wolfgang, Marti-Saumell, Josep, Sola, Joan, Mansard, Nicolas, Vijayakumar, Sethu

A Direct-Indirect Hybridization Approach to Control-Limited DDP

arXiv.org Artificial Intelligence, Oct-1-2020

Optimal control is a widely used tool for synthesizing motions and controls for user-defined tasks under physical constraints. A common approach is to formulate it using direct multiple-shooting and then to use off-the-shelf nonlinear programming solvers that can easily handle arbitrary constraints on the controls and states. However, these methods are not fast enough for many robotics applications such as real-time humanoid motor control. Exploiting the sparse structure of optimal control problems, as in Differential Dynamic Programming (DDP), has proven to significantly boost computational efficiency, and recent works have focused on handling arbitrary constraints. Despite that, DDP has been associated with poor numerical convergence, particularly when considering long time horizons. Two of the main reasons are system instabilities and poor warm-starting (only controls). This paper presents control-limited Feasibility-driven DDP (Box-FDDP), a solver that incorporates a direct-indirect hybridization of the control-limited DDP algorithm. Concretely, the forward and backward passes handle feasibility and control limits. We showcase the impact and importance of our method on a set of challenging optimal control problems against the Box-DDP and squashing-function approach.

artificial intelligence, box-fddp, optimization problem, (16 more...)

2010.00411

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (0.82)

Industry: Energy (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)