AITopics

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.94)

Neural Information Processing SystemsJan-17-2025, 17:14:53 GMT

Zeroth-Order Hard-Thresholding: Gradient Error vs. Expansivity

Hard-thresholding gradient descent is a dominant technique to solve this problem. However, first-order gradients of the objective function may be either unavailable or expensive to calculate in a lot of real-world problems, where zeroth-order (ZO) gradients could be a good surrogate. Unfortunately, whether ZO gradients can work with the hard-thresholding operator is still an unsolved problem.To solve this puzzle, in this paper, we focus on the \ell_0 constrained black-box stochastic optimization problems, and propose a new stochastic zeroth-order gradient hard-thresholding (SZOHT) algorithm with a general ZO gradient estimator powered by a novel random support sampling. We provide the convergence analysis of SZOHT under standard assumptions. Importantly, we reveal a conflict between the deviation of ZO estimators and the expansivity of the hard-thresholding operator, and provide a theoretical minimal value of the number of random directions in ZO gradients.

expansivity, gradient, zeroth-order hard-thresholding, (4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.62)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Neural Information Processing SystemsJan-17-2025, 14:19:34 GMT

TA-MoE: Topology-Aware Large Scale Mixture-of-Expert Training

Sparsely gated Mixture-of-Expert (MoE) has demonstrated its effectiveness in scaling up deep neural networks to an extreme scale. Despite that numerous efforts have been made to improve the performance of MoE from the model design or system optimization perspective, existing MoE dispatch patterns are still not able to fully exploit the underlying heterogeneous network environments. In this paper, we propose TA-MoE, a topology-aware routing strategy for large-scale MoE trainging, from a model-system co-design perspective, which can dynamically adjust the MoE dispatch pattern according to the network topology. Based on communication modeling, we abstract the dispatch problem into an optimization objective and obtain the approximate dispatch pattern under different topologies. On top of that, we design a topology-aware auxiliary loss, which can adaptively route the data to fit in the underlying topology without sacrificing the model accuracy.

dispatch pattern, scale mixture-of-expert training, ta-moe, (2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.63)

Neural Information Processing SystemsJan-17-2025, 14:18:03 GMT

Coordinate Linear Variance Reduction for Generalized Linear Programming

We study a class of generalized linear programs (GLP) in a large-scale setting, which includes simple, possibly nonsmooth convex regularizer and simple convex set constraints. By reformulating (GLP) as an equivalent convex-concave min-max problem, we show that the linear structure in the problem can be used to design an efficient, scalable first-order algorithm, to which we give the name Coordinate Linear Variance Reduction (CLVR; pronounced clever''). CLVR yields improved complexity results for (GLP) that depend on the max row norm of the linear constraint matrix in (GLP) rather than the spectral norm. When the regularization terms and constraints are separable, CLVR admits an efficient lazy update strategy that makes its complexity bounds scale with the number of nonzero elements of the linear constraint matrix in (GLP) rather than the matrix dimensions. On the other hand, for the special case of linear programs, by exploiting sharpness, we propose a restart scheme for CLVR to obtain empirical linear convergence.

coordinate linear variance reduction, generalized linear programming, glp, (3 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.40)

Neural Information Processing SystemsJan-17-2025, 12:53:36 GMT

Towards Sample-Optimal Compressive Phase Retrieval with Sparse and Generative Priors

Compressive phase retrieval is a popular variant of the standard compressive sensing problem in which the measurements only contain magnitude information. In this paper, motivated by recent advances in deep generative models, we provide recovery guarantees with near-optimal sample complexity for phase retrieval with generative priors. Gaussian measurements and an L -Lipschitz continuous generative model with bounded k -dimensional inputs, roughly O(k \log L) samples suffice to guarantee that any signal minimizing an amplitude-based empirical loss function is close to the true signal. Attaining this sample complexity with a practical algorithm remains a difficult challenge, and finding a good initialization for gradient-based methods has been observed to pose a major bottleneck. To partially address this, we further show that roughly O(k \log L) samples ensure sufficient closeness between the underlying signal and any {\em globally optimal} solution to an optimization problem designed for spectral initialization (though finding such a solution may still be challenging).

phase retrieval, sample-optimal compressive phase retrieval, spectral initialization method, (4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.79)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.61)

Neural Information Processing SystemsJan-17-2025, 03:32:09 GMT

Sample Efficiency Matters: A Benchmark for Practical Molecular Optimization

Molecular optimization is a fundamental goal in the chemical sciences and is of central interest to drug and material design. In recent years, significant progress has been made in solving challenging problems across various aspects of computational molecular optimizations, emphasizing high validity, diversity, and, most recently, synthesizability. Despite this progress, many papers report results on trivial or self-designed tasks, bringing additional challenges to directly assessing the performance of new methods. Moreover, the sample efficiency of the optimization---the number of molecules evaluated by the oracle---is rarely discussed, despite being an essential consideration for realistic discovery applications.To fill this gap, we have created an open-source benchmark for practical molecular optimization, PMO, to facilitate the transparent and reproducible evaluation of algorithmic advances in molecular optimization. This paper thoroughly investigates the performance of 25 molecular design algorithms on 23 single-objective (scalar) optimization tasks with a particular focus on sample efficiency. Our results show that most state-of-the-art'' methods fail to outperform their predecessors under a limited oracle budget allowing 10K queries and that no existing algorithm can efficiently solve certain molecular optimization problems in this setting.

optimization, practical molecular optimization, sample efficiency matter, (1 more...)

Genre: Research Report > New Finding (0.61)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.58)

Dwivedula, Rohit, Modak, Sadanand, Akella, Aditya, Biswas, Joydeep, Kim, Daehyeok, Rossbach, Christopher J.

ConfigBot: Adaptive Resource Allocation for Robot Applications in Dynamic Environments

The growing use of autonomous mobile service robots (AMSRs) in dynamic environments requires flexible management of compute resources to optimize the performance of diverse tasks such as navigation, localization, perception, and so on. Current robot deployments, which oftentimes rely on static configurations (of the OS, applications, etc.) and system over-provisioning, fall short since they do not account for the tasks' performance variations resulting in poor system-wide behavior such as robot instability and/or inefficient resource use. This paper presents ConfigBot, a system designed to adaptively reconfigure AMSR applications to meet a predefined performance specification by leveraging runtime profiling and automated configuration tuning. Through experiments on a Boston Dynamics Spot robot equipped with NVIDIA AGX Orin, we demonstrate ConfigBot's efficacy in maintaining system stability and optimizing resource allocation across diverse scenarios. Our findings highlight the promise of tailored and dynamic configurations for robot deployments.

artificial intelligence, configuration, optimization problem, (17 more...)

2501.10513

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (1.00)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Efficient and Safe Trajectory Planning for Autonomous Agricultural Vehicle Headland Turning in Cluttered Orchard Environments

Wei, Peng, Peng, Chen, Lu, Wenwu, Zhu, Yuankai, Vougioukas, Stavros, Fei, Zhenghao, Ge, Zhikang

Autonomous agricultural vehicles (AAVs), including field robots and autonomous tractors, are becoming essential in modern farming by improving efficiency and reducing labor costs. A critical task in AAV operations is headland turning between crop rows. This task is challenging in orchards with limited headland space, irregular boundaries, operational constraints, and static obstacles. While traditional trajectory planning methods work well in arable farming, they often fail in cluttered orchard environments. This letter presents a novel trajectory planner that enhances the safety and efficiency of AAV headland maneuvers, leveraging advancements in autonomous driving. Our approach includes an efficient front-end algorithm and a high-performance back-end optimization. Applied to vehicles with various implements, it outperforms state-of-the-art methods in both standard and challenging orchard fields. This work bridges agricultural and autonomous driving technologies, facilitating a broader adoption of AAVs in complex orchards.

artificial intelligence, optimization problem, trajectory, (17 more...)

doi: 10.1109/LRA.2025.3534056

2501.10636

Country:

Asia > China > Zhejiang Province > Hangzhou (0.04)
North America > United States > California > Yolo County > Davis (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Transportation > Ground > Road (0.69)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Topology-Driven Attribute Recovery for Attribute Missing Graph Learning in Social Internet of Things

Li, Mengran, Chen, Junzhou, Yu, Chenyun, Jiang, Guanying, Zhang, Ronghui, Shen, Yanming, Song, Houbing Herbert

With the advancement of information technology, the Social Internet of Things (SIoT) has fostered the integration of physical devices and social networks, deepening the study of complex interaction patterns. Text Attribute Graphs (TAGs) capture both topological structures and semantic attributes, enhancing the analysis of complex interactions within the SIoT. However, existing graph learning methods are typically designed for complete attributed graphs, and the common issue of missing attributes in Attribute Missing Graphs (AMGs) increases the difficulty of analysis tasks. To address this, we propose the Topology-Driven Attribute Recovery (TDAR) framework, which leverages topological data for AMG learning. TDAR introduces an improved pre-filling method for initial attribute recovery using native graph topology. Additionally, it dynamically adjusts propagation weights and incorporates homogeneity strategies within the embedding space to suit AMGs' unique topological structures, effectively reducing noise during information propagation. Extensive experiments on public datasets demonstrate that TDAR significantly outperforms state-of-the-art methods in attribute reconstruction and downstream tasks, offering a robust solution to the challenges posed by AMGs. The code is available at https://github.com/limengran98/TDAR.

artificial intelligence, machine learning, node, (19 more...)

2501.10151

Country:

North America > United States > Maryland > Baltimore (0.14)
North America > United States > Maryland > Baltimore County (0.14)
North America > United States > Virginia > Albemarle County > Charlottesville (0.14)
(8 more...)

Genre: Research Report > New Finding (0.67)

Industry:

Government > Regional Government (0.69)
Information Technology > Smart Houses & Appliances (0.62)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)

Moshtaghifar, Mohammad, Rodomanov, Anton, Vankov, Daniil, Stich, Sebastian

DADA: Dual Averaging with Distance Adaptation

We present a novel universal gradient method for solving convex optimization problems. Our algorithm -- Dual Averaging with Distance Adaptation (DADA) -- is based on the classical scheme of dual averaging and dynamically adjusts its coefficients based on observed gradients and the distance between iterates and the starting point, eliminating the need for problem-specific parameters. DADA is a universal algorithm that simultaneously works for a broad spectrum of problem classes, provided the local growth of the objective function around its minimizer can be bounded. Particular examples of such problem classes are nonsmooth Lipschitz functions, Lipschitz-smooth functions, H\"older-smooth functions, functions with high-order Lipschitz derivative, quasi-self-concordant functions, and $(L_0,L_1)$-smooth functions. Crucially, DADA is applicable to both unconstrained and constrained problems, even when the domain is unbounded, without requiring prior knowledge of the number of iterations or desired accuracy.

artificial intelligence, machine learning, theorem 2, (17 more...)

2501.10258

Country: North America > United States > Arizona (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)