AITopics | Search

Collaborating Authors

Search

"Search is a problem-solving technique that systematically explores a space of problem states, i.e., successive and alternative stages in the problem-solving process. Examples of problem states might include the different board configurations in a game or intermediate steps in a reasoning process. This space of alternative solutions is then searched to find an answer. Newell and Simon (1976) have argued that this is the essential basis of human problem solving. Indeed, when a chess player examines the effects of different moves or a doctor considers a number of alternative diagnoses, they are searching among alternatives."
– from Section 1.2 of Chapter One of George F. Luger's textbook, Artificial Intelligence: Structures and Strategies for Complex Problem Solving, 5th Edition (Addison-Wesley; 2005).

News Overviews Instructional Materials AI-Alerts Classics

SDEs for Minimax Optimization

Compagnoni, Enea Monzio, Orvieto, Antonio, Kersting, Hans, Proske, Frank Norbert, Lucchi, Aurelien

arXiv.org Artificial IntelligenceFeb-19-2024

Minimax optimization problems have attracted a lot of attention over the past few years, with applications ranging from economics to machine learning. While advanced optimization methods exist for such problems, characterizing their dynamics in stochastic scenarios remains notably challenging. In this paper, we pioneer the use of stochastic differential equations (SDEs) to analyze and compare Minimax optimizers. Our SDE models for Stochastic Gradient Descent-Ascent, Stochastic Extragradient, and Stochastic Hamiltonian Gradient Descent are provable approximations of their algorithmic counterparts, clearly showcasing the interplay between hyperparameters, implicit regularization, and implicit curvature-induced noise. This perspective also allows for a unified and simplified analysis strategy based on the principles of It\^o calculus. Finally, our approach facilitates the derivation of convergence conditions and closed-form solutions for the dynamics in simplified settings, unveiling further insights into the behavior of different optimizers.

seg, shgd, trajectory, (12 more...)

arXiv.org Artificial Intelligence

2402.12508

Country:

Asia > Middle East > Jordan (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
(2 more...)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.74)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Add feedback

Partial Search in a Frozen Network is Enough to Find a Strong Lottery Ticket

Otsuka, Hikari, Chijiwa, Daiki, García-Arias, Ángel López, Okoshi, Yasuyuki, Kawamura, Kazushi, Van Chu, Thiem, Fujiki, Daichi, Takeuchi, Susumu, Motomura, Masato

arXiv.org Machine LearningFeb-19-2024

Randomly initialized dense networks contain subnetworks that achieve high accuracy without weight learning -- strong lottery tickets (SLTs). Recently, Gadhikar et al. (2023) demonstrated theoretically and experimentally that SLTs can also be found within a randomly pruned source network, thus reducing the SLT search space. However, this limits the search to SLTs that are even sparser than the source, leading to worse accuracy due to unintentionally high sparsity. This paper proposes a method that reduces the SLT search space by an arbitrary ratio that is independent of the desired SLT sparsity. A random subset of the initial weights is excluded from the search space by freezing it -- i.e., by either permanently pruning them or locking them as a fixed part of the SLT. Indeed, the SLT existence in such a reduced search space is theoretically guaranteed by our subset-sum approximation with randomly frozen variables. In addition to reducing search space, the random freezing pattern can also be exploited to reduce model size in inference. Furthermore, experimental results show that the proposed method finds SLTs with better accuracy and model size trade-off than the SLTs obtained from dense or randomly pruned source networks. In particular, the SLT found in a frozen graph neural network achieves higher accuracy than its weight trained counterpart while reducing model size by $40.3\times$.

search space, slt, source network, (14 more...)

arXiv.org Machine Learning

2402.14029

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre:

Contests & Prizes (0.87)
Research Report > New Finding (0.48)

Industry: Leisure & Entertainment > Gambling (0.73)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)

Add feedback

Linear bandits with polylogarithmic minimax regret

Lumbreras, Josep, Tomamichel, Marco

arXiv.org Machine LearningFeb-19-2024

We study a noise model for linear stochastic bandits for which the subgaussian noise parameter vanishes linearly as we select actions on the unit sphere closer and closer to the unknown vector. We introduce an algorithm for this problem that exhibits a minimax regret scaling as $\log^3(T)$ in the time horizon $T$, in stark contrast the square root scaling of this regret for typical bandit algorithms. Our strategy, based on weighted least-squares estimation, achieves the eigenvalue relation $\lambda_{\min} ( V_t ) = \Omega (\sqrt{\lambda_{\max}(V_t ) })$ for the design matrix $V_t$ at each time step $t$ through geometrical arguments that are independent of the noise model and might be of independent interest. This allows us to tightly control the expected regret in each time step to be of the order $O(\frac1{t})$, leading to the logarithmic scaling of the cumulative regret.

bandit, eigenvalue, inequality, (15 more...)

arXiv.org Machine Learning

2402.12042

Country:

Asia > Singapore > Central Region > Singapore (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.62)

Add feedback

Theory and Approximate Solvers for Branched Optimal Transport with Multiple Sources

Neural Information Processing SystemsFeb-18-2024, 05:19:45 GMT

Branched optimal transport (BOT) is a generalization of optimal transport in which transportation costs along an edge are subadditive. This subadditivity models an increase in transport efficiency when shipping mass along the same route, favoring branched transportation networks.

artificial intelligence, machine learning, optimization, (19 more...)

Neural Information Processing Systems

Industry: Energy > Oil & Gas (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)
Information Technology > Communications > Networks (0.68)

Add feedback

NAS-Bench-Graph: Benchmarking Graph Neural Architecture Search ∗, Zeyang Zhang, Wenwu Zhu

Neural Information Processing SystemsFeb-18-2024, 04:14:57 GMT

Graph neural architecture search (GraphNAS) has recently aroused considerable attention in both academia and industry. However, two key challenges seriously hinder the further research of GraphNAS. First, since there is no consensus for the experimental setting, the empirical results in different research papers are often not comparable and even not reproducible, leading to unfair comparisons. Secondly, GraphNAS often needs extensive computations, which makes it highly inefficient and inaccessible to researchers without access to large-scale computation. To solve these challenges, we propose NAS-Bench-Graph, a tailored benchmark that supports unified, reproducible, and efficient evaluations for GraphNAS.

architecture, architecture search, international conference, (15 more...)

Neural Information Processing Systems

Country: Asia > China > Beijing > Beijing (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Add feedback

The Epoch-Greedy Algorithm for Multi-armed Bandits with Side Information

Neural Information Processing SystemsFeb-16-2024, 13:37:47 GMT

Epoch-Greedy has the following properties: No knowledge of a time horizon T is necessary. The regret incurred by Epoch-Greedy is controlled by a sample complexity bound for a hypothesis class. Here S is the complexity term in a sample complexity bound for standard supervised learning.

epoch-greedy algorithm, multi-armed bandit, side information, (1 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.55)
Information Technology > Data Science > Data Mining > Big Data (0.51)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.40)

Add feedback

Lower bounds on minimax rates for nonparametric regression with additive sparsity and smoothness

Neural Information Processing SystemsFeb-16-2024, 11:49:55 GMT

This paper uses information-theoretic techniques to determine minimax rates for estimating nonparametric sparse additive regression models under high-dimensional scaling. The first term reflects the difficulty of performing \emph{subset selection} and is independent of the Hilbert space \Hilb; the second term \LowerRateSq is an \emph{\s-dimensional estimation} term, depending only on the low dimension \s but not the ambient dimension \pdim, that captures the difficulty of estimating a sum of \s univariate functions in the Hilbert space \Hilb . The minimax rates are compared with rates achieved by an \ell_1 -penalty based approach, it can be shown that a certain \ell_1 -based approach achieves the minimax optimal rate.

additive sparsity and smoothness, minimax rate, nonparametric regression, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Add feedback

Efficient Learning of Generalized Linear and Single Index Models with Isotonic Regression

Neural Information Processing SystemsFeb-16-2024, 08:08:03 GMT

Generalized Linear Models (GLMs) and Single Index Models (SIMs) provide powerful generalizations of linear regression, where the target variable is assumed to be a (possibly unknown) 1-dimensional function of a linear predictor. In general, these problems entail non-convex estimation procedures, and, in practice, iterative local search heuristics are often used. Kalai and Sastry (2009) provided the first provably efficient method, the \emph{Isotron} algorithm, for learning SIMs and GLMs, under the assumption that the data is in fact generated under a GLM and under certain monotonicity and Lipschitz (bounded slope) constraints. However, to obtain provable performance, the method requires a fresh sample every iteration. In this paper, we provide algorithms for learning GLMs and SIMs, which are both computationally and statistically efficient.

algorithm, isotonic regression, linear and single index model, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.63)

Add feedback

Selecting Diverse Features via Spectral Regularization

Neural Information Processing SystemsFeb-16-2024, 07:51:08 GMT

We study the problem of diverse feature selection in linear regression: selecting a small subset of diverse features that can predict a given objective. Diversity is useful for several reasons such as interpretability, robustness to noise, etc. We propose several spectral regularizers that capture a notion of diversity of features and show that these are all submodular set functions. These regularizers, when added to the objective function for linear regression, result in approximately submodular functions, which can then be maximized approximately by efficient greedy and local search algorithms, with provable guarantees. We compare our algorithms to traditional greedy and \ell_1 -regularization schemes and show that we obtain a more diverse set of features that result in the regression problem being stable under perturbations.

linear regression, selecting diverse feature, spectral regularization, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)

Add feedback

IntSat: Integer Linear Programming by Conflict-Driven Constraint-Learning

Nieuwenhuis, Robert, Oliveras, Albert, Rodriguez-Carbonell, Enric

arXiv.org Artificial IntelligenceFeb-16-2024

State-of-the-art SAT solvers are nowadays able to handle huge real-world instances. The key to this success is the so-called Conflict-Driven Clause-Learning (CDCL) scheme, which encompasses a number of techniques that exploit the conflicts that are encountered during the search for a solution. In this article we extend these techniques to Integer Linear Programming (ILP), where variables may take general integer values instead of purely binary ones, constraints are more expressive than just propositional clauses, and there may be an objective function to optimise. We explain how these methods can be implemented efficiently, and discuss possible improvements. Our work is backed with a basic implementation that shows that, even in this far less mature stage, our techniques are already a useful complement to the state of the art in ILP solving.

conflict analysis, constraint, solver, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1080/10556788.2023.2246167

2402.15522

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
(5 more...)

Genre: Research Report > New Finding (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Add feedback