AITopics

doi: 10.1109/CVPR52769.2024.01476

2506.14263

Country: North America > United States (0.27)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)

arXiv.org Artificial IntelligenceJun-18-2025

Discovering Temporal Structure: An Overview of Hierarchical Reinforcement Learning

Klissarov, Martin, Bagaria, Akhil, Luo, Ziyan, Konidaris, George, Precup, Doina, Machado, Marlos C.

Developing agents capable of exploring, planning and learning in complex open-ended environments is a grand challenge in artificial intelligence (AI). Hierarchical reinforcement learning (HRL) offers a promising solution to this challenge by discovering and exploiting the temporal structure within a stream of experience. The strong appeal of the HRL framework has led to a rich and diverse body of literature attempting to discover a useful structure. However, it is still not clear how one might define what constitutes good structure in the first place, or the kind of problems in which identifying it may be helpful. This work aims to identify the benefits of HRL from the perspective of the fundamental challenges in decision-making, as well as highlight its impact on the performance trade-offs of AI agents. Through these benefits, we then cover the families of methods that discover temporal structure in HRL, ranging from learning directly from online experience to offline datasets, to leveraging large language models (LLMs). Finally, we highlight the challenges of temporal structure discovery and the domains that are particularly well-suited for such endeavours.

large language model, machine learning, reinforcement learning, (23 more...)

2506.14045

Country:

North America > Canada (0.92)
North America > United States (0.92)

Genre:

Research Report > New Finding (0.67)
Research Report > Promising Solution (0.47)

Industry:

Health & Medicine (1.00)
Education (1.00)
Leisure & Entertainment > Games (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
(8 more...)

CALM: Consensus-Aware Localized Merging for Multi-Task Learning

Yan, Kunda, Zhang, Min, Cui, Sen, Qu, Zikun, Jiang, Bo, Liu, Feng, Zhang, Changshui

Model merging aims to integrate the strengths of multiple fine-tuned models into a unified model while preserving task-specific capabilities. Existing methods, represented by task arithmetic, are typically classified into global- and local-aware methods. However, global-aware methods inevitably cause parameter interference, while local-aware methods struggle to maintain the effectiveness of task-specific details in the merged model. To address these limitations, we propose a Consensus-Aware Localized Merging (CALM) method which incorporates localized information aligned with global task consensus, ensuring its effectiveness post-merging. CALM consists of three key components: (1) class-balanced entropy minimization sampling, providing a more flexible and reliable way to leverage unsupervised data; (2) an efficient-aware framework, selecting a small set of tasks for sequential merging with high scalability; (3) a consensus-aware mask optimization, aligning localized binary masks with global task consensus and merging them conflict-free. Experiments demonstrate the superiority and robustness of our CALM, significantly outperforming existing methods and achieving performance close to traditional MTL.

artificial intelligence, information, machine learning, (12 more...)

2506.13406

Country:

Asia > China > Beijing > Beijing (0.04)
North America > Canada (0.04)
Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Möllenhoff, Thomas, Swaroop, Siddharth, Doshi-Velez, Finale, Khan, Mohammad Emtiyaz

Federated ADMM from Bayesian Duality

arXiv.org Machine LearningJun-17-2025

ADMM is a popular method for federated deep learning which originated in the 1970s and, even though many new variants of it have been proposed since then, its core algorithmic structure has remained unchanged. Here, we take a major departure from the old structure and present a fundamentally new way to derive and extend federated ADMM. We propose to use a structure called Bayesian Duality which exploits a duality of the posterior distributions obtained by solving a variational-Bayesian reformulation of the original problem. We show that this naturally recovers the original ADMM when isotropic Gaussian posteriors are used, and yields non-trivial extensions for other posterior forms. For instance, full-covariance Gaussians lead to Newton-like variants of ADMM, while diagonal covariances result in a cheap Adam-like variant. This is especially useful to handle heterogeneity in federated deep learning, giving up to 7% accuracy improvements over recent baselines. Our work opens a new Bayesian path to improve primal-dual methods.

artificial intelligence, bayesadmm, machine learning, (17 more...)

arXiv.org Machine Learning

2506.1315

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States (0.14)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Sweet, David, Jadhav, Siddhant anand

Taking the GP Out of the Loop

arXiv.org Machine LearningJun-17-2025

Bayesian optimization (BO) has traditionally solved black box problems where evaluation is expensive and, therefore, design-evaluation pairs (i.e., observations) are few. Recently, there has been growing interest in applying BO to problems where evaluation is cheaper and, thus, observations are more plentiful. An impediment to scaling BO to many observations, $N$, is the $O(N^3)$ scaling of a na{ï}ve query of the Gaussian process (GP) surrogate. Modern implementations reduce this to $O(N^2)$, but the GP remains a bottleneck. We propose Epistemic Nearest Neighbors (ENN), a surrogate that estimates function values and epistemic uncertainty from $K$ nearest-neighbor observations. ENN has $O(N)$ query time and omits hyperparameter fitting, leaving uncertainty uncalibrated. To accommodate the lack of calibration, we employ an acquisition method based on Pareto-optimal tradeoffs between predicted value and uncertainty. Our proposed method, TuRBO-ENN, replaces the GP surrogate in TuRBO with ENN and its Thompson sampling acquisition method with our Pareto-based alternative. We demonstrate numerically that TuRBO-ENN can reduce the time to generate proposals by one to two orders of magnitude compared to TuRBO and scales to thousands of observations.

artificial intelligence, machine learning, optimization, (19 more...)

arXiv.org Machine Learning

2506.12818

Country: North America > United States (0.04)

Genre: Research Report (0.64)

Industry: Transportation > Air (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)

arXiv.org Machine LearningJun-17-2025

Learning Causality for Modern Machine Learning

Chen, Yongqiang

In the past decades, machine learning with Empirical Risk Minimization (ERM) has demonstrated great capability in learning and exploiting the statistical patterns from data, or even surpassing humans. Despite the success, ERM avoids the modeling of causality the way of understanding and handling changes, which is fundamental to human intelligence. When deploying models beyond the training environment, distribution shifts are everywhere. For example, an autopilot system often needs to deal with new weather conditions that have not been seen during training, An Al-aided drug discovery system needs to predict the biochemical properties of molecules with respect to new viruses such as COVID-19. It renders the problem of Out-of-Distribution (OOD) generalization challenging to conventional machine learning. In this thesis, we investigate how to incorporate and realize the causality for broader tasks in modern machine learning. In particular, we exploit the invariance implied by the principle of independent causal mechanisms (ICM), that is, the causal mechanisms generating the effects from causes do not inform or influence each other. Therefore, the conditional distribution between the target variable given its causes is invariant under distribution shifts. With the causal invariance principle, we first instantiate it to graphs -- a general data structure ubiquitous in many real-world industry and scientific applications, such as financial networks and molecules. Then, we shall see how learning the causality benefits many of the desirable properties of modern machine learning, in terms of (i) OOD generalization capability; (ii) interpretability; and (iii) robustness to adversarial attacks. Realizing the causality in machine learning, on the other hand, raises a dilemma for optimization in conventional machine learning, as it often contradicts the objective of ERM...

artificial intelligence, machine learning, reinforcement learning, (19 more...)

arXiv.org Machine Learning

2506.12226

Country:

Africa (0.13)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Oceania (0.04)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Information Technology > Security & Privacy (0.87)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.65)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(3 more...)

Semenov, Andrei, Jaggi, Martin, Doikov, Nikita

Gradient-Normalized Smoothness for Optimization with Approximate Hessians

In this work, we develop new optimization algorithms that use approximate second-order information combined with the gradient regularization technique to achieve fast global convergence rates for both convex and non-convex objectives. The key innovation of our analysis is a novel notion called Gradient-Normalized Smoothness, which characterizes the maximum radius of a ball around the current point that yields a good relative approximation of the gradient field. Our theory establishes a natural intrinsic connection between Hessian approximation and the linearization of the gradient. Importantly, Gradient-Normalized Smoothness does not depend on the specific problem class of the objective functions, while effectively translating local information about the gradient field and Hessian approximation into the global behavior of the method. This new concept equips approximate second-order algorithms with universal global convergence guarantees, recovering state-of-the-art rates for functions with Hölder-continuous Hessians and third derivatives, quasi-self-concordant functions, as well as smooth classes in first-order optimization. These rates are achieved automatically and extend to broader classes, such as generalized self-concordant functions. We demonstrate direct applications of our results for global linear rates in logistic regression and softmax problems with approximate Hessians, as well as in non-convex optimization using Fisher and Gauss-Newton approximations.

artificial intelligence, hessian, machine learning, (15 more...)

2506.1371

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)

Gulisano, Martino, Masoni, Matteo, Gabiccini, Marco, Guiggiani, Massimo

Disturbance-aware minimum-time planning strategies for motorsport vehicles with probabilistic safety certificates

This paper presents a disturbance-aware framework that embeds robustness into minimum-lap-time trajectory optimization for motorsport. Two formulations are introduced. (i) Open-loop, horizon-based covariance propagation uses worst-case uncertainty growth over a finite window to tighten tire-friction and track-limit constraints. (ii) Closed-loop, covariance-aware planning incorporates a time-varying LQR feedback law in the optimizer, providing a feedback-consistent estimate of disturbance attenuation and enabling sharper yet reliable constraint tightening. Both methods yield reference trajectories for human or artificial drivers: in autonomous applications the modelled controller can replicate the on-board implementation, while for human driving accuracy increases with the extent to which the driver can be approximated by the assumed time-varying LQR policy. Computational tests on a representative Barcelona-Catalunya sector show that both schemes meet the prescribed safety probability, yet the closed-loop variant incurs smaller lap-time penalties than the more conservative open-loop solution, while the nominal (non-robust) trajectory remains infeasible under the same uncertainties. By accounting for uncertainty growth and feedback action during planning, the proposed framework delivers trajectories that are both performance-optimal and probabilistically safe, advancing minimum-time optimization toward real-world deployment in high-performance motorsport and autonomous racing.

artificial intelligence, constraint, optimization problem, (17 more...)

2506.13622

Country: Europe > Netherlands (0.28)

Genre:

Research Report (0.50)
Workflow (0.47)

Industry:

Leisure & Entertainment > Sports > Motorsports (1.00)
Energy (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

A Survey on Imitation Learning for Contact-Rich Tasks in Robotics

Tsuji, Toshiaki, Kato, Yasuhiro, Solak, Gokhan, Zhang, Heng, Petrič, Tadej, Nori, Francesco, Ajoudani, Arash

This paper comprehensively surveys research trends in imitation learning for contact-rich robotic tasks. Contact-rich tasks, which require complex physical interactions with the environment, represent a central challenge in robotics due to their nonlinear dynamics and sensitivity to small positional deviations. The paper examines demonstration collection methodologies, including teaching methods and sensory modalities crucial for capturing subtle interaction dynamics. We then analyze imitation learning approaches, highlighting their applications to contact-rich manipulation. Recent advances in multimodal learning and foundation models have significantly enhanced performance in complex contact tasks across industrial, household, and healthcare domains. Through systematic organization of current research and identification of challenges, this survey provides a foundation for future advancements in contact-rich robotic manipulation.

large language model, machine learning, reinforcement learning, (20 more...)

2506.13498

Country:

Europe (0.92)
Asia > Japan > Honshū > Kantō (0.28)

Genre:

Overview (1.00)
Research Report > New Finding (0.67)
Research Report > Promising Solution (0.67)
Research Report > Experimental Study (0.45)

Industry:

Information Technology (0.92)
Health & Medicine > Therapeutic Area > Neurology (0.67)
Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots > Manipulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(4 more...)

Nokhiz, Pegah, Ruwanpathirana, Aravinda Kanchana, Nissenbaum, Helen

Rethinking Optimization: A Systems-Based Approach to Social Externalities

Optimization is widely used for decision making across various domains, valued for its ability to improve efficiency. However, poor implementation practices can lead to unintended consequences, particularly in socioeconomic contexts where externalities (costs or benefits to third parties outside the optimization process) are significant. To propose solutions, it is crucial to first characterize involved stakeholders, their goals, and the types of subpar practices causing unforeseen outcomes. This task is complex because affected stakeholders often fall outside the direct focus of optimization processes. Also, incorporating these externalities into optimization requires going beyond traditional economic frameworks, which often focus on describing externalities but fail to address their normative implications or interconnected nature, and feedback loops. This paper suggests a framework that combines systems thinking with the economic concept of externalities to tackle these challenges. This approach aims to characterize what went wrong, who was affected, and how (or where) to include them in the optimization process. Economic externalities, along with their established quantification methods, assist in identifying "who was affected and how" through stakeholder characterization. Meanwhile, systems thinking (an analytical approach to comprehending relationships in complex systems) provides a holistic, normative perspective. Systems thinking contributes to an understanding of interconnections among externalities, feedback loops, and determining "when" to incorporate them in the optimization. Together, these approaches create a comprehensive framework for addressing optimization's unintended consequences, balancing descriptive accuracy with normative objectives. Using this, we examine three common types of subpar practices: ignorance, error, and prioritization of short-term goals.

artificial intelligence, externality, machine learning, (16 more...)

2506.12825

Country: North America > United States (0.67)

Genre: Research Report (0.84)

Industry:

Law (1.00)
Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Data Science (0.67)