AITopics

Policy Mirror Descent (PMD) has emerged as a unifying framework in reinforcement learning (RL) by linking policy gradient methods with a first-order optimization method known as mirror descent. At its core, PMD incorporates two key regularization components: (i) a distance term that enforces a trust region for stable policy updates and (ii) an MDP regularizer that augments the reward function to promote structure and robustness. While PMD has been extensively studied in theory, empirical investigations remain scarce. This work provides a large-scale empirical analysis of the interplay between these two regularization techniques, running over 500k training seeds on small RL environments. Our results demonstrate that, although the two regularizers can partially substitute each other, their precise combination is critical for achieving robust performance.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

2507.08718

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)

System-of-systems Modeling and Optimization: An Integrated Framework for Intermodal Mobility

Saves, Paul, Bussemaker, Jasper, Lafage, Rémi, Lefebvre, Thierry, Bartoli, Nathalie, Diouane, Youssef, Morlier, Joseph

For developing innovative systems architectures, modeling and optimization techniques have been central to frame the architecting process and define the optimization and modeling problems. In this context, for system-of-systems the use of efficient dedicated approaches (often physics-based simulations) is highly recommended to reduce the computational complexity of the targeted applications. However, exploring novel architectures using such dedicated approaches might pose challenges for optimization algorithms, including increased evaluation costs and potential failures. To address these challenges, surrogate-based optimization algorithms, such as Bayesian optimization utilizing Gaussian process models have emerged.

artificial intelligence, optimization, optimization problem, (13 more...)

2507.08715

Country:

Europe (0.70)
North America > United States (0.46)

Genre: Research Report (0.40)

Industry:

Transportation > Air (0.98)
Aerospace & Defense > Aircraft (0.71)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Deep Hashing with Semantic Hash Centers for Image Retrieval

Chen, Li, Liu, Rui, Zhou, Yuxiang, Ma, Xudong, Chen, Yong, Zhang, Dell

Deep hashing is an effective approach for large-scale image retrieval. Current methods are typically classified by their supervision types: point-wise, pair-wise, and list-wise. Recent point-wise techniques (e.g., CSQ, MDS) have improved retrieval performance by pre-assigning a hash center to each class, enhancing the discriminability of hash codes across various datasets. However, these methods rely on data-independent algorithms to generate hash centers, which neglect the semantic relationships between classes and may degrade retrieval performance. This paper introduces the concept of semantic hash centers, building on the idea of traditional hash centers. We hypothesize that hash centers of semantically related classes should have closer Hamming distances, while those of unrelated classes should be more distant. To this end, we propose a three-stage framework, SHC, to generate hash codes that preserve semantic structure. First, we develop a classification network to identify semantic similarities between classes using a data-dependent similarity calculation that adapts to varying data distributions. Second, we introduce an optimization algorithm to generate semantic hash centers, preserving semantic relatedness while enforcing a minimum distance between centers to avoid excessively similar hash codes. Finally, a deep hashing network is trained using these semantic centers to convert images into binary hash codes. Experimental results on large-scale retrieval tasks across several public datasets show that SHC significantly improves retrieval performance. Specifically, SHC achieves average improvements of +7.26%, +7.62%, and +11.71% in MAP@100, MAP@1000, and MAP@ALL metrics, respectively, over state-of-the-art methods.

artificial intelligence, machine learning, natural language, (20 more...)

2507.08404

Country: Asia > China (0.47)

Genre: Research Report > Promising Solution (0.66)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Towards AI-Native RAN: An Operator's Perspective of 6G Day 1 Standardization

Li, Nan, Sun, Qi, Wang, Lehan, Xu, Xiaofei, Huang, Jinri, Liu, Chunhui, Gao, Jing, Huang, Yuhong, I, Chih-Lin

Artificial Intelligence/Machine Learning (AI/ML) has become the most certain and prominent feature of 6G mobile networks. Unlike 5G, where AI/ML was not natively integrated but rather an add-on feature over existing architecture, 6G shall incorporate AI from the onset to address its complexity and support ubiquitous AI applications. Based on our extensive mobile network operation and standardization experience from 2G to 5G, this paper explores the design and standardization principles of AI-Native radio access networks (RAN) for 6G, with a particular focus on its critical Day 1 architecture, functionalities and capabilities. We investigate the framework of AI-Native RAN and present its three essential capabilities to shed some light on the standardization direction; namely, AI-driven RAN processing/optimization/automation, reliable AI lifecycle management (LCM), and AI-as-a-Service (AIaaS) provisioning. The standardization of AI-Native RAN, in particular the Day 1 features, including an AI-Native 6G RAN architecture, were proposed. For validation, a large-scale field trial with over 5000 5G-A base stations have been built and delivered significant improvements in average air interface latency, root cause identification, and network energy consumption with the proposed architecture and the supporting AI functions. This paper aims to provide a Day 1 framework for 6G AI-Native RAN standardization design, balancing technical innovation with practical deployment.

artificial intelligence, machine learning, ran, (17 more...)

2507.08403

Country: Asia > China (0.15)

Genre: Research Report (1.00)

Industry:

Energy (1.00)
Information Technology > Security & Privacy (0.93)
Telecommunications > Networks (0.68)
Information Technology > Networks (0.68)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

Kashyap, Samarth, Ramakrishnan, Rohit K, Jyoti, Kumari, Patel, Apoorva D

Advances in Machine Learning: Where Can Quantum Techniques Help?

Quantum Machine Learning (QML) represents a promising frontier at the intersection of quantum computing and artificial intelligence, aiming to leverage quantum computational advantages to enhance data-driven tasks. This review explores the potential of QML to address the computational bottlenecks of classical machine learning, particularly in processing complex datasets. We introduce the theoretical foundations of QML, including quantum data encoding, quantum learning theory and optimization techniques, while categorizing QML approaches based on data type and computational architecture. It is well-established that quantum computational advantages are problem-dependent, and so potentially useful directions for QML need to be systematically identified. Key developments, such as Quantum Principal Component Analysis, quantum-enhanced sensing and applications in material science, are critically evaluated for their theoretical speed-ups and practical limitations. The challenges posed by Noisy Intermediate-Scale Quantum (NISQ) devices, including hardware noise, scalability constraints and data encoding overheads, are discussed in detail. We also outline future directions, emphasizing the need for quantum-native algorithms, improved error correction, and realistic benchmarks to bridge the gap between theoretical promise and practical deployment. This comprehensive analysis underscores that while QML has significant potential for specific applications such as quantum chemistry and sensing, its broader utility in real-world scenarios remains contingent on overcoming technological and methodological hurdles.

algorithm, artificial intelligence, machine learning, (17 more...)

2507.08379

Country: Asia > India (0.28)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Joint Optimization-based Targetless Extrinsic Calibration for Multiple LiDARs and GNSS-Aided INS of Ground Vehicles

Wang, Junhui, Qiao, Yan, Gao, Chao, Wu, Naiqi

Accurate extrinsic calibration between multiple LiDAR sensors and a GNSS-aided inertial navigation system (GINS) is essential for achieving reliable sensor fusion in intelligent mining environments. Such calibration enables vehicle-road collaboration by aligning perception data from vehicle-mounted sensors to a unified global reference frame. However, existing methods often depend on artificial targets, overlapping fields of view, or precise trajectory estimation, which are assumptions that may not hold in practice. Moreover, the planar motion of mining vehicles leads to observability issues that degrade calibration performance. This paper presents a targetless extrinsic calibration method that aligns multiple onboard LiDAR sensors to the GINS coordinate system without requiring overlapping sensor views or external targets. The proposed approach introduces an observation model based on the known installation height of the GINS unit to constrain unobservable calibration parameters under planar motion. A joint optimization framework is developed to refine both the extrinsic parameters and GINS trajectory by integrating multiple constraints derived from geometric correspondences and motion consistency. The proposed method is applicable to heterogeneous LiDAR configurations, including both mechanical and solid-state sensors. Extensive experiments on simulated and real-world datasets demonstrate the accuracy, robustness, and practical applicability of the approach under diverse sensor setups.

artificial intelligence, extrinsic parameter, optimization problem, (16 more...)

2507.08349

Country:

Asia (0.68)
Europe (0.46)
North America > Canada (0.28)

Genre: Research Report (0.82)

Industry: Materials > Metals & Mining (1.00)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(2 more...)

Akhondzadeh, Mohammad Sadegh, Zargarbashi, Soroush H., Cao, Jimin, Bojchevski, Aleksandar

EvA: Evolutionary Attacks on Graphs

Even a slight perturbation in the graph structure can cause a significant drop in the accuracy of graph neural networks (GNNs). Most existing attacks leverage gradient information to perturb edges. This relaxes the attack's optimization problem from a discrete to a continuous space, resulting in solutions far from optimal. It also restricts the adaptability of the attack to non-differentiable objectives. Instead, we introduce a few simple yet effective enhancements of an evolutionary-based algorithm to solve the discrete optimization problem directly. Our Evolutionary Attack (EvA) works with any black-box model and objective, eliminating the need for a differentiable proxy loss. This allows us to design two novel attacks that reduce the effectiveness of robustness certificates and break conformal sets. The memory complexity of our attack is linear in the attack budget. Among our experiments, EvA shows $\sim$11\% additional drop in accuracy on average compared to the best previous attack, revealing significant untapped potential in designing attacks.

artificial intelligence, machine learning, perturbation, (19 more...)

2507.08212

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Mahdavinia, Pouria, Mahdavi, Mehrdad

Low-rank Momentum Factorization for Memory Efficient Training

Fine-tuning large foundation models presents significant memory challenges due to stateful optimizers like AdamW, often requiring several times more GPU memory than inference. While memory-efficient methods like parameter-efficient fine-tuning (e.g., LoRA) and optimizer state compression exist, recent approaches like GaLore bridge these by using low-rank gradient projections and subspace moment accumulation. However, such methods may struggle with fixed subspaces or computationally costly offline resampling (e.g., requiring full-matrix SVDs). We propose Momentum Factorized SGD (MoFaSGD), which maintains a dynamically updated low-rank SVD representation of the first-order momentum, closely approximating its full-rank counterpart throughout training. This factorization enables a memory-efficient fine-tuning method that adaptively updates the optimization subspace at each iteration. Crucially, MoFaSGD leverages the computed low-rank momentum factors to perform efficient spectrally normalized updates, offering an alternative to subspace moment accumulation. We establish theoretical convergence guarantees for MoFaSGD, proving it achieves an optimal rate for non-convex stochastic optimization under standard assumptions. Empirically, we demonstrate MoFaSGD's effectiveness on large language model alignment benchmarks, achieving a competitive trade-off between memory reduction (comparable to LoRA) and performance compared to state-of-the-art low-rank optimization methods.

large language model, machine learning, mofasgd, (20 more...)

2507.08091

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)

Rudakov, Evgenii, Shock, Jonathan, Lappi, Otto, Cowley, Benjamin Ultan

SSSUMO: Real-Time Semi-Supervised Submovement Decomposition

This paper introduces a SSSUMO, semi-supervised deep learning approach for submovement decomposition that achieves state-of-the-art accuracy and speed. While submovement analysis offers valuable insights into motor control, existing methods struggle with reconstruction accuracy, computational cost, and validation, due to the difficulty of obtaining hand-labeled data. We address these challenges using a semi-supervised learning framework. This framework learns from synthetic data, initially generated from minimum-jerk principles and then iteratively refined through adaptation to unlabeled human movement data. Our fully convolutional architecture with differentiable reconstruction significantly surpasses existing methods on both synthetic and diverse human motion datasets, demonstrating robustness even in high-noise conditions. Crucially, the model operates in real-time (less than a millisecond per input second), a substantial improvement over optimization-based techniques. This enhanced performance facilitates new applications in human-computer interaction, rehabilitation medicine, and motor control studies. We demonstrate the model's effectiveness across diverse human-performed tasks such as steering, rotation, pointing, object moving, handwriting, and mouse-controlled gaming, showing notable improvements particularly on challenging datasets where traditional methods largely fail. Training and benchmarking source code, along with pre-trained model weights, are made publicly available at https://github.com/dolphin-in-a-coma/sssumo.

artificial intelligence, machine learning, submovement, (21 more...)

2507.08028

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.67)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)

Maurer, Nathan, Kaushik, Harshal, Jacob, Roshni Anna, Zhang, Jie, Chowdhury, Souma

Learning-aided Bigraph Matching Approach to Multi-Crew Restoration of Damaged Power Networks Coupled with Road Transportation Networks

The resilience of critical infrastructure networks (CINs) after disruptions, such as those caused by natural hazards, depends on both the speed of restoration and the extent to which operational functionality can be regained. Allocating resources for restoration is a combinatorial optimal planning problem that involves determining which crews will repair specific network nodes and in what order. This paper presents a novel graph-based formulation that merges two interconnected graphs, representing crew and transportation nodes and power grid nodes, into a single heterogeneous graph. To enable efficient planning, graph reinforcement learning (GRL) is integrated with bigraph matching. GRL is utilized to design the incentive function for assigning crews to repair tasks based on the graph-abstracted state of the environment, ensuring generalization across damage scenarios. Two learning techniques are employed: a graph neural network trained using Proximal Policy Optimization and another trained via Neuroevolution. The learned incentive functions inform a bipartite graph that links crews to repair tasks, enabling weighted maximum matching for crew-to-task allocations. An efficient simulation environment that pre-computes optimal node-to-node path plans is used to train the proposed restoration planning methods. An IEEE 8500-bus power distribution test network coupled with a 21 square km transportation network is used as the case study, with scenarios varying in terms of numbers of damaged nodes, depots, and crews. Results demonstrate the approach's generalizability and scalability across scenarios, with learned policies providing 3-fold better performance than random policies, while also outperforming optimization-based solutions in both computation time (by several orders of magnitude) and power restored.

artificial intelligence, machine learning, node, (20 more...)

2506.19703

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.34)

Industry:

Energy > Power Industry (1.00)
Transportation > Infrastructure & Services (0.89)
Transportation > Ground > Road (0.64)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.86)