AITopics

Ensemble control aims to steer a population of dynamical systems using a shared control input. This paper introduces a constrained ensemble control framework for parameterized, heterogeneous robotic systems operating under state and environmental constraints, such as obstacle avoidance. We develop a moment kernel transform that maps the parameterized ensemble dynamics to the moment system in a kernel space, enabling the characterization of population-level behavior. The state-space constraints, such as polyhedral waypoints to be visited and obstacles to be avoided, are also transformed into the moment space, leading to a unified formulation for safe, large-scale ensemble control. Expressive signal temporal logic specifications are employed to encode complex visit-avoid tasks, which are achieved through a single shared controller synthesized from our constrained ensemble control formulation. Simulation and hardware experiments demonstrate the effectiveness of the proposed approach in safely and efficiently controlling robotic ensembles within constrained environments.

artificial intelligence, constraint, planning & scheduling, (16 more...)

2512.04502

Country: North America > United States (0.28)

Genre: Research Report (0.40)

Industry:

Energy (0.68)
Automobiles & Trucks (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.36)

Open-Ended Goal Inference through Actions and Language for Human-Robot Collaboration

Ghose, Debasmita, Gitelson, Oz, Vazquez, Marynel, Scassellati, Brian

To collaborate with humans, robots must infer goals that are often ambiguous, difficult to articulate, or not drawn from a fixed set. Prior approaches restrict inference to a predefined goal set, rely only on observed actions, or depend exclusively on explicit instructions, making them brittle in real-world interactions. We present BALI (Bidirectional Action-Language Inference) for goal prediction, a method that integrates natural language preferences with observed human actions in a receding-horizon planning tree. BALI combines language and action cues from the human, asks clarifying questions only when the expected information gain from the answer outweighs the cost of interruption, and selects supportive actions that align with inferred goals. We evaluate the approach in collaborative cooking tasks, where goals may be novel to the robot and unbounded. Compared to baselines, BALI yields more stable goal predictions and significantly fewer mistakes.
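The ask-or-act rule described in this abstract can be illustrated with a standard Bayesian expected-information-gain calculation. This is a minimal sketch, not the authors' implementation: the function names, the goal labels, the answer likelihoods, and the choice to measure the interruption cost in bits are all assumptions made for illustration.

```python
import math


def entropy(belief):
    """Shannon entropy (in bits) of a belief given as {goal: probability}."""
    return -sum(p * math.log2(p) for p in belief.values() if p > 0)


def expected_information_gain(belief, answer_likelihood):
    """Expected entropy reduction from asking one question.

    answer_likelihood[answer][goal] is P(answer | goal); hypothetical format.
    """
    gain = entropy(belief)
    for likelihood in answer_likelihood.values():
        p_answer = sum(likelihood[g] * belief[g] for g in belief)
        if p_answer == 0:
            continue  # this answer can never occur under the current belief
        posterior = {g: likelihood[g] * belief[g] / p_answer for g in belief}
        gain -= p_answer * entropy(posterior)
    return gain


def should_ask(belief, answer_likelihood, interruption_cost):
    """Ask only when the expected gain exceeds the (assumed) cost in bits."""
    return expected_information_gain(belief, answer_likelihood) > interruption_cost


# Hypothetical cooking goals and a question that separates them perfectly.
belief = {"salad": 0.5, "soup": 0.5}
decisive = {
    "yes": {"salad": 1.0, "soup": 0.0},
    "no": {"salad": 0.0, "soup": 1.0},
}
# A question whose answer does not depend on the goal carries no information.
useless = {
    "yes": {"salad": 0.5, "soup": 0.5},
    "no": {"salad": 0.5, "soup": 0.5},
}
```

With these inputs the decisive question is worth one bit and the useless one is worth nothing, so a robot following this rule would ask the first only if interrupting costs less than one bit, and would never ask the second.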

artificial intelligence, belief revision, robot, (17 more...)

2512.04453

Country:

Asia > Indonesia > Bali (0.69)
North America > United States (0.46)

Genre: Research Report (1.00)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.73)
Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.46)

Dual-Stream Spectral Decoupling Distillation for Remote Sensing Object Detection

Gao, Xiangyi, Zhao, Danpei, Yuan, Bo, Li, Wentao

Knowledge distillation is an effective and hardware-friendly method that plays a key role in making remote sensing object detectors lightweight. However, existing distillation methods often encounter the issue of mixed features in remote sensing images (RSIs) and neglect the discrepancies caused by subtle feature variations, leading to entangled knowledge confusion. To address these challenges, we propose an architecture-agnostic distillation method named Dual-Stream Spectral Decoupling Distillation (DS2D2) for universal remote sensing object detection tasks. Specifically, DS2D2 integrates explicit and implicit distillation grounded in spectral decomposition. First, the first-order wavelet transform is applied for spectral decomposition to preserve the critical spatial characteristics of RSIs. Leveraging this spatial preservation, a Density-Independent Scale Weight (DISW) is designed to address the challenges of dense and small object detection common in RSIs. Second, we show that implicit knowledge is hidden in subtle student-teacher feature discrepancies, which significantly influence predictions when activated by detection heads. This implicit knowledge is extracted via full-frequency and high-frequency amplifiers, which map feature differences to prediction deviations. Extensive experiments on the DIOR and DOTA datasets validate the effectiveness of the proposed method. Specifically, on the DIOR dataset, DS2D2 achieves improvements of 4.2% in AP50 for RetinaNet and 3.8% in AP50 for Faster R-CNN, outperforming existing distillation approaches. The source code will be available at https://github.com/PolarAid/DS2D2.

artificial intelligence, distillation, machine learning, (15 more...)

doi: 10.1109/TGRS.2025.3600098

2512.04413

Genre: Research Report (1.00)

Industry:

Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (1.00)
Education (0.94)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Development of a 15-Degree-of-Freedom Bionic Hand with Cable-Driven Transmission and Distributed Actuation

Han, Haoqi, Yang, Yi, Yu, Yifei, Zhou, Yixuan, Zhu, Xiaohan, Wang, Hesheng

In robotic hand research, minimizing the number of actuators while maintaining human-hand-consistent dimensions and degrees of freedom constitutes a fundamental challenge. Drawing bio-inspiration from human hand kinematic configurations and muscle distribution strategies, this work proposes a novel 15-DoF dexterous robotic hand, with detailed analysis of its mechanical architecture, electrical system, and control system. The bionic hand employs a new tendon-driven mechanism, significantly reducing the number of motors required by traditional tendon-driven systems while enhancing motion performance and simplifying the mechanical structure. This design integrates five motors in the forearm to provide strong gripping force, while ten small motors are installed in the palm to support fine manipulation tasks. Additionally, a corresponding joint sensing and motor driving electrical system was developed to ensure efficient control and feedback. The entire system weighs only 1.4 kg, combining low weight with high performance. Through experiments, the bionic hand exhibited exceptional dexterity and robust grasping capabilities, demonstrating significant potential for robotic manipulation tasks. The development of actuator systems with human-level dexterity presents significant challenges [1], [2], stemming from the bio-integrated nature of the human hand: it is not an isolated entity but a highly coupled system intricately connected through skeletal-muscular-neural networks to the forearm, forming a synergistic functional unit.

artificial intelligence, bionic hand, mechanism, (12 more...)

2512.04399

Country:

Asia > China (0.75)
North America > United States (0.68)

Genre: Research Report (0.64)

Industry: Energy (0.46)

Technology: Information Technology > Artificial Intelligence > Robots > Manipulation (0.88)

From FLOPs to Footprints: The Resource Cost of Artificial Intelligence

Falk, Sophia, Corrêa, Nicholas Kluge, Luccioni, Sasha, Biber-Freudenberger, Lisa, van Wynsberghe, Aimee

As computational demands continue to rise, assessing the environmental footprint of AI requires moving beyond energy and water consumption to include the material demands of specialized hardware. This study quantifies the material footprint of AI training by linking computational workloads to physical hardware needs. The elemental composition of the Nvidia A100 SXM 40 GB graphics processing unit (GPU) was analyzed using inductively coupled plasma optical emission spectroscopy, which identified 32 elements. The results show that AI hardware consists of about 90% heavy metals and only trace amounts of precious metals. The elements copper, iron, tin, silicon, and nickel dominate the GPU composition by mass. In a multi-step methodology, we integrate these measurements with computational throughput per GPU across varying lifespans, accounting for the computational requirements of training specific AI models at different training efficiency regimes. Scenario-based analyses reveal that, depending on Model FLOPs Utilization (MFU) and hardware lifespan, training GPT-4 requires between 1,174 and 8,800 A100 GPUs, corresponding to the extraction and eventual disposal of up to 7 tons of toxic elements. Combined software and hardware optimization strategies can reduce material demands: increasing MFU from 20% to 60% lowers GPU requirements by 67%, while extending lifespan from 1 to 3 years yields comparable savings; implementing both measures together reduces GPU needs by up to 93%. Our findings highlight that incremental performance gains, such as those observed between GPT-3.5 and GPT-4, come at disproportionately high material costs. The study underscores the necessity of incorporating material resource considerations into discussions of AI scalability, emphasizing that future progress in AI must align with principles of resource efficiency and environmental responsibility.
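The 67% figure quoted above follows from simple proportionality: for a fixed training workload, the number of GPUs consumed scales inversely with MFU and with hardware lifespan. The sketch below is only an illustration of that arithmetic under an assumed inverse-proportional model. The function names, the workload of 1e24 FLOPs, and the peak throughput of 312e12 FLOP/s are placeholders chosen for the example, not values taken from the study.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def gpus_needed(total_flops, peak_flops_per_s, mfu, lifespan_years):
    """GPUs used up to deliver total_flops, assuming each GPU runs at
    peak_flops_per_s * mfu for its whole lifespan (hypothetical model)."""
    flops_per_gpu = peak_flops_per_s * mfu * lifespan_years * SECONDS_PER_YEAR
    return total_flops / flops_per_gpu


def relative_saving(baseline, improved):
    """Fraction of GPUs saved when moving from baseline to improved."""
    return 1.0 - improved / baseline


# Placeholder workload and peak throughput, for illustration only.
WORKLOAD = 1e24
PEAK = 312e12

base = gpus_needed(WORKLOAD, PEAK, mfu=0.20, lifespan_years=1)
better_mfu = gpus_needed(WORKLOAD, PEAK, mfu=0.60, lifespan_years=1)
longer_life = gpus_needed(WORKLOAD, PEAK, mfu=0.20, lifespan_years=3)

mfu_saving = relative_saving(base, better_mfu)    # 1 - 0.2/0.6, about 67%
life_saving = relative_saving(base, longer_life)  # 1 - 1/3, about 67%
```

Tripling either MFU or lifespan cuts the GPU count by two thirds, which matches the "comparable savings" wording. The sketch does not attempt to reproduce the combined 93% figure, which depends on the study's own scenario ranges.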

accessed, large language model, machine learning, (19 more...)

2512.04142

Country: North America > United States (0.46)

Genre: Research Report > New Finding (1.00)

Industry:

Water & Waste Management > Water Management (1.00)
Materials > Metals & Mining (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Decoding Large Language Diffusion Models with Foreseeing Movement

Mo, Yichuan, Chen, Quan, Li, Mingjie, Wei, Zeming, Wang, Yisen

Large Language Diffusion Models (LLDMs) benefit from a flexible decoding mechanism that enables parallelized inference and more controllable generation than autoregressive models. Yet such flexibility introduces a critical challenge: inference performance becomes highly sensitive to the decoding order of tokens. Existing heuristic methods, however, focus mainly on local effects while overlooking long-term impacts. To address this limitation, we propose the Foreseeing Decoding Method (FDM), a novel approach that integrates both local and global considerations to unlock the full potential of LLDMs, employing a search-based strategy to enable effective optimization in discrete spaces. Furthermore, by analyzing the consistency of chosen tokens in the full decoding process, we develop a variant, FDM with Acceleration (FDM-A), which restricts deep exploration to critical steps identified as the exploration and balance circumstances. Extensive experiments across diverse benchmarks and model architectures validate the scalability of FDM and demonstrate the superior efficiency-performance trade-off achieved by FDM-A. Our work may provide a principled step toward more powerful decoding methods for LLDMs.

arxiv preprint arxiv, large language model, machine learning, (16 more...)

2512.04135

Genre: Research Report > Promising Solution (0.34)

Industry: Energy (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

The changing surface of the world's roads

Randhawa, Sukanya, Randhawa, Guntaj, Langer, Clemens, Andorful, Francis, Herfort, Benjamin, Kwakye, Daniel, Olchik, Omer, Lautenbach, Sven, Zipf, Alexander

Resilient road infrastructure is a cornerstone of the UN Sustainable Development Goals. Yet a primary indicator of network functionality and resilience is critically lacking: a comprehensive global baseline of road surface information. Here, we overcome this gap by applying a deep learning framework to a global mosaic of Planetscope satellite imagery from 2020 and 2024. The result is the first global multi-temporal dataset of road pavedness and width for 9.2 million km of critical arterial roads, achieving 95.5% coverage where nearly half the network was previously unclassified. This dataset reveals a powerful multi-scale geography of human development. At the planetary scale, we show that the rate of change in pavedness is a robust proxy for a country's development trajectory (correlation with HDI = 0.65). At the national scale, we quantify how unpaved roads constitute a fragile backbone for economic connectivity. We further synthesize our data into a global Humanitarian Passability Matrix with direct implications for humanitarian logistics. At the local scale, case studies demonstrate the framework's versatility: in Ghana, road quality disparities expose the spatial outcomes of governance; in Pakistan, the data identifies infrastructure vulnerabilities to inform climate resilience planning. Together, this work delivers both a foundational dataset and a multi-scale analytical framework for monitoring global infrastructure, from the dynamics of national development to the realities of local governance, climate adaptation, and equity. Unlike traditional proxies such as nighttime lights, which reflect economic activity, road surface data directly measures the physical infrastructure that underpins prosperity and resilience - at higher spatial resolution.

artificial intelligence, deep learning, machine learning, (17 more...)

2512.04092

Country:

Africa > Ghana (0.50)
North America > United States (0.46)
Asia > Pakistan (0.34)
Europe > Germany > Baden-Württemberg (0.14)

Genre: Research Report > New Finding (0.68)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)
Banking & Finance > Economy (0.88)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.36)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

A Theoretical Framework for Auxiliary-Loss-Free Load Balancing of Sparse Mixture-of-Experts in Large-Scale AI Models

Han, X. Y., Zhong, Yuan

In large-scale AI training, Sparse Mixture-of-Experts (s-MoE) layers enable scaling by activating only a small subset of experts per token. An operational challenge in this design is load balancing: routing tokens to minimize the number of idle experts, which is important for the efficient utilization of (costly) GPUs. We provide a theoretical framework for analyzing the Auxiliary-Loss-Free Load Balancing (ALF-LB) procedure -- proposed by DeepSeek's Wang et al. (2024) -- by casting it as a one-step-per-iteration primal-dual method for an assignment problem. First, in a stylized deterministic setting, our framework yields several insightful structural properties: (i) a monotonic improvement of a Lagrangian objective, (ii) a preference rule that moves tokens from overloaded to underloaded experts, and (iii) an approximate-balancing guarantee. Then, we incorporate the stochastic and dynamic nature of AI training using a generalized online optimization formulation. In the online setting, we derive a strong convexity property of the objective that leads to a logarithmic expected regret bound under certain step-size choices. Additionally, we present real experiments on 1B-parameter DeepSeekMoE models to complement our theoretical findings. Together, these results build a principled framework for analyzing the Auxiliary-Loss-Free Load Balancing of s-MoE in AI models.
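To make the procedure under analysis concrete, the sketch below simulates a bias-based balancing loop: each expert carries a bias that is added to its routing score for top-k selection only, and after every batch the bias is nudged up for underloaded experts and down for overloaded ones. This is a toy illustration under stated assumptions, not the paper's code: the sign-based update, the function names, the synthetic skewed scores, and the step size are all hypothetical choices.

```python
import random


def route(scores, bias, k):
    """For each token, pick the k experts with the highest biased score."""
    experts = range(len(bias))
    return [
        sorted(experts, key=lambda e: token[e] + bias[e], reverse=True)[:k]
        for token in scores
    ]


def expert_loads(assignments, num_experts):
    """Count how many tokens each expert received."""
    loads = [0] * num_experts
    for chosen in assignments:
        for e in chosen:
            loads[e] += 1
    return loads


def update_bias(bias, loads, step):
    """One step per iteration: raise bias of underloaded experts, lower overloaded."""
    mean_load = sum(loads) / len(loads)
    updated = []
    for b, load in zip(bias, loads):
        error = mean_load - load
        sign = (error > 0) - (error < 0)
        updated.append(b + step * sign)
    return updated


def simulate(num_tokens=512, num_experts=8, k=2, step=0.01, iters=200, seed=0):
    """Run the loop on synthetic scores where expert 0 is systematically favored."""
    rng = random.Random(seed)
    skew = [0.5 if e == 0 else 0.0 for e in range(num_experts)]
    bias = [0.0] * num_experts
    imbalance = []  # max load minus min load, per iteration
    for _ in range(iters):
        scores = [
            [rng.random() + skew[e] for e in range(num_experts)]
            for _ in range(num_tokens)
        ]
        loads = expert_loads(route(scores, bias, k), num_experts)
        imbalance.append(max(loads) - min(loads))
        bias = update_bias(bias, loads, step)
    return bias, imbalance


final_bias, imbalance = simulate()
```

In this toy setting the favored expert ends up with the lowest bias and the spread between the busiest and idlest expert shrinks over the run, which is the qualitative behavior the preference rule in (ii) describes.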

arxiv preprint arxiv, large language model, machine learning, (18 more...)

2512.03915

Genre: Research Report (0.50)

Industry: Energy > Power Industry (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.90)

EVER: Edge-Assisted Auto-Verification for Mobile MR-Aided Operation

Chen, Jiangong, Zhu, Mingyu, Li, Bin

Mixed Reality (MR)-aided operation overlays digital objects on the physical world to provide a more immersive and intuitive operation process. A primary challenge is the precise and fast auto-verification of whether the user follows MR guidance by comparing frames before and after each operation. The pre-operation frame includes virtual guiding objects, while the post-operation frame contains physical counterparts. Existing approaches fail to account for the discrepancies between physical and virtual objects caused by imperfect 3D modeling or lighting estimation. In this paper, we propose EVER: an edge-assisted auto-verification system for mobile MR-aided operations. Unlike traditional frame-based similarity comparisons, EVER leverages a segmentation model and a rendering pipeline adapted to the unique attributes of frames with physical pieces and those with their virtual counterparts; it adopts a threshold-based strategy using Intersection over Union (IoU) metrics for accurate auto-verification. To ensure fast auto-verification and low energy consumption, EVER offloads compute-intensive tasks to an edge server. Through comprehensive evaluations on public and custom datasets with a practical implementation, EVER achieves over 90% verification accuracy within 100 milliseconds (significantly faster than the average human reaction time of approximately 273 milliseconds), while consuming only minimal additional computational resources and energy compared to a system without auto-verification.
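The threshold-based IoU check mentioned in this abstract can be sketched in a few lines. This is an illustration only, assuming binary masks stored as nested lists; the function names and the 0.5 threshold are hypothetical and are not taken from the EVER system.

```python
def iou(mask_a, mask_b):
    """Intersection over Union of two equally sized binary masks (lists of rows)."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection += 1
            if a or b:
                union += 1
    return intersection / union if union else 0.0


def verify(expected_mask, observed_mask, threshold=0.5):
    """Accept the operation when the masks overlap enough (assumed threshold)."""
    return iou(expected_mask, observed_mask) >= threshold


# Where the virtual guide says the piece should be (2x2 block, top-left).
expected = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
# Where the physical piece was segmented (same block shifted one column right).
observed = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

score = iou(expected, observed)  # 2 shared pixels out of 6 covered pixels
```

Here the masks share 2 of 6 covered pixels, so the IoU is 1/3 and the check fails at the assumed 0.5 threshold; a perfectly placed piece would score 1.0 and pass.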

artificial intelligence, machine learning, target frame, (18 more...)

doi: 10.1109/ISMAR67309.2025.00148

2510.18224

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology (1.00)
Energy (1.00)

Technology:

Information Technology > Software (1.00)
Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)

Locality-Sensitive Hashing-Based Efficient Point Transformer for Charged Particle Reconstruction

Govil, Shitij, Rodgers, Jack P., Chou, Yuan-Tang, Miao, Siqi, Saha, Amit, Anand, Advaith, Lieret, Kilian, DeZoort, Gage, Liu, Mia, Duarte, Javier, Li, Pan, Hsu, Shih-Chieh

Charged particle track reconstruction is a foundational task in collider experiments and the main computational bottleneck in particle reconstruction. Graph neural networks (GNNs) have shown strong performance for this problem, but costly graph construction, irregular computations, and random memory access patterns substantially limit their throughput. The recently proposed Hashing-based Efficient Point Transformer (HEPT) offers a theoretically guaranteed near-linear complexity for large point cloud processing via locality-sensitive hashing (LSH) in attention computations; however, its evaluations have largely focused on embedding quality, and the object condensation pipeline on which HEPT relies requires a post-hoc clustering step (e.g., DBScan) that can dominate runtime. In this work, we make two contributions. First, we present a unified, fair evaluation of physics tracking performance for HEPT and a representative GNN-based pipeline under the same dataset and metrics. Second, we introduce HEPTv2 by extending HEPT with a lightweight decoder that eliminates the clustering stage and directly predicts track assignments. This modification preserves HEPT's regular, hardware-friendly computations while enabling ultra-fast end-to-end inference. On the TrackML dataset, optimized HEPTv2 achieves approximately 28 ms per event on an A100 while maintaining competitive tracking efficiency. These results position HEPTv2 as a practical, scalable alternative to GNN-based pipelines for fast tracking.
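The idea of using locality-sensitive hashing to restrict attention to nearby points can be illustrated with a generic random-projection hash, where points that land in the same bucket become candidates for attending to each other. This is a minimal sketch of the general technique, not HEPT's actual hashing scheme: the function names, the number of projections, and the bucket width are assumptions made for the example.

```python
import math
import random
from collections import defaultdict


def make_hash(dim, num_projections, bucket_width, seed=0):
    """Build a hash that quantizes several random 1-D projections of a point."""
    rng = random.Random(seed)
    projections = [
        ([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, bucket_width))
        for _ in range(num_projections)
    ]

    def hash_point(point):
        return tuple(
            math.floor(
                (sum(a * x for a, x in zip(direction, point)) + offset) / bucket_width
            )
            for direction, offset in projections
        )

    return hash_point


def bucketize(points, hash_point):
    """Group point indices by hash code; each group is one attention block."""
    buckets = defaultdict(list)
    for index, point in enumerate(points):
        buckets[hash_point(point)].append(index)
    return dict(buckets)


# Hypothetical 2-D hits: two coincident points and one far away.
points = [(0.0, 0.0), (0.0, 0.0), (100.0, 100.0)]
hash_point = make_hash(dim=2, num_projections=4, bucket_width=1.0)
buckets = bucketize(points, hash_point)
```

Because each point is hashed once and attention is computed only inside buckets, the cost grows roughly with the number of points times the bucket size rather than with the square of the number of points, which is the intuition behind the near-linear complexity mentioned above.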

artificial intelligence, machine learning, particle, (12 more...)

2510.07594

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Energy (0.69)
Government > Regional Government (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)