Energy
Multifunctional physical reservoir computing in soft tensegrity robots
Terajima, Ryo, Inoue, Katsuma, Nakajima, Kohei, Kuniyoshi, Yasuo
Recent studies have demonstrated that the dynamics of physical systems can be utilized for the desired information processing under the framework of physical reservoir computing (PRC). Robots with soft bodies are examples of such physical systems, and their nonlinear body-environment dynamics can be used to compute and generate the motor signals necessary for the control of their own behavior. In this simulation study, we extend this approach to control and embed not only one but also multiple behaviors into a type of soft robot called a tensegrity robot. The resulting system, consisting of the robot and the environment, is a multistable dynamical system that converges to different attractors from varying initial conditions. Furthermore, attractor analysis reveals that there exist "untrained attractors" in the state space of the system outside the training data. These untrained attractors reflect the intrinsic properties and structures of the tensegrity robot and its interactions with the environment. The impacts of these recent findings in PRC remain unexplored in embodied AI research. We here illustrate their potential to understand various features of embodied cognition that have not been fully addressed to date.
Sync-TVA: A Graph-Attention Framework for Multimodal Emotion Recognition with Cross-Modal Fusion
Deng, Zeyu, Lu, Yanhui, Liao, Jiashu, Wu, Shuang, Wei, Chongfeng
--Multimodal emotion recognition (MER) is crucial for enabling emotionally intelligent systems that perceive and respond to human emotions. However, existing methods suffer from limited cross-modal interaction and imbalanced contributions across modalities. T o address these issues, we propose Sync-TV A, an end-to-end graph-attention framework featuring modality-specific dynamic enhancement and structured cross-modal fusion. Our design incorporates a dynamic enhancement module for each modality and constructs heterogeneous cross-modal graphs to model semantic relations across text, audio, and visual features. Experiments on MELD and IEMOCAP demonstrate consistent improvements over state-of-the-art models in both accuracy and weighted F1 score, especially under class-imbalanced conditions. ITH the rapid development of artificial intelligence (AI) and robotics, robots are taking on increasingly prominent roles in various aspects of daily life, and the demand for emotional intelligence has become more critical than ever [1]. Modern computer systems can already capable of exhibiting a degree of empathy and perform sophisticated emotional analysis [2]. For domestic robots, accurately recognising and responding to human emotions is vital for building and sustaining harmonious, long-term relationships--particularly after the initial novelty of interaction fades and the need for continuous re-engagement arises [3]. However, the integration between emotion recognition and facial recognition systems remains insufficient and has yet to result in a unified framework. This is mainly because these two research fields have historically evolved independently, rather than being systematically studied and integrated into a collaborative recognition system [4].
Load Balancing for AI Training Workloads
McClure, Sarah, Ratnasamy, Sylvia, Shenker, Scott
We investigate the performance of various load balancing algorithms for large-scale AI training workloads that are running on dedicated infrastructure. The performance of load balancing depends on both the congestion control and loss recovery algorithms, so our evaluation also sheds light on the appropriate choices for those designs as well.
DEM-NeRF: A Neuro-Symbolic Method for Scientific Discovery through Physics-Informed Simulation
Tan, Wenkai, Velasquez, Alvaro, Song, Houbing
Neural networks have emerged as a powerful tool for modeling physical systems, offering the ability to learn complex representations from limited data while integrating foundational scientific knowledge. In particular, neuro-symbolic approaches that combine data-driven learning, the neuro, with symbolic equations and rules, the symbolic, address the tension between methods that are purely empirical, which risk straying from established physical principles, and traditional numerical solvers that demand complete geometric knowledge and can be prohibitively expensive for high-fidelity simulations. In this work, we present a novel neuro-symbolic framework for reconstructing and simulating elastic objects directly from sparse multi-view image sequences, without requiring explicit geometric information. Specifically, we integrate a neural radiance field (NeRF) for object reconstruction with physics-informed neural networks (PINN) that incorporate the governing partial differential equations of elasticity. In doing so, our method learns a spatiotemporal representation of deforming objects that leverages both image supervision and symbolic physical constraints. To handle complex boundary and initial conditions, which are traditionally confronted using finite element methods, boundary element methods, or sensor-based measurements, we employ an energy-constrained Physics-Informed Neural Network architecture. This design enhances both simulation accuracy and the explainability of results.
Autonomous Exploration with Terrestrial-Aerial Bimodal Vehicles
Gao, Yuman, Zhang, Ruibin, Lai, Tiancheng, Cao, Yanjun, Xu, Chao, Gao, Fei
Terrestrial-aerial bimodal vehicles, which integrate the high mobility of aerial robots with the long endurance of ground robots, offer significant potential for autonomous exploration. Given the inherent energy and time constraints in practical exploration tasks, we present a hierarchical framework for the bimodal vehicle to utilize its flexible locomotion modalities for exploration. Beginning with extracting environmental information to identify informative regions, we generate a set of potential bimodal viewpoints. To adaptively manage energy and time constraints, we introduce an extended Monte Carlo Tree Search approach that strategically optimizes both modality selection and viewpoint sequencing. Combined with an improved bimodal vehicle motion planner, we present a complete bimodal energy- and time-aware exploration system. Extensive simulations and deployment on a customized real-world platform demonstrate the effectiveness of our system.
Agentic Web: Weaving the Next Web with AI Agents
Yang, Yingxuan, Ma, Mulei, Huang, Yuxuan, Chai, Huacan, Gong, Chenyu, Geng, Haoran, Zhou, Yuanjian, Wen, Ying, Fang, Meng, Chen, Muhao, Gu, Shangding, Jin, Ming, Spanos, Costas, Yang, Yang, Abbeel, Pieter, Song, Dawn, Zhang, Weinan, Wang, Jun
The emergence of AI agents powered by large language models (LLMs) marks a pivotal shift toward the Agentic Web, a new phase of the internet defined by autonomous, goal-driven interactions. In this paradigm, agents interact directly with one another to plan, coordinate, and execute complex tasks on behalf of users. This transition from human-driven to machine-to-machine interaction allows intent to be delegated, relieving users from routine digital operations and enabling a more interactive, automated web experience. In this paper, we present a structured framework for understanding and building the Agentic Web. We trace its evolution from the PC and Mobile Web eras and identify the core technological foundations that support this shift. Central to our framework is a conceptual model consisting of three key dimensions: intelligence, interaction, and economics. These dimensions collectively enable the capabilities of AI agents, such as retrieval, recommendation, planning, and collaboration. We analyze the architectural and infrastructural challenges involved in creating scalable agentic systems, including communication protocols, orchestration strategies, and emerging paradigms such as the Agent Attention Economy. We conclude by discussing the potential applications, societal risks, and governance issues posed by agentic systems, and outline research directions for developing open, secure, and intelligent ecosystems shaped by both human intent and autonomous agent behavior. A continuously updated collection of relevant studies for agentic web is available at: https://github.com/SafeRL-Lab/agentic-web.
Handling Out-of-Distribution Data: A Survey
Tamang, Lakpa, Bouadjenek, Mohamed Reda, Dazeley, Richard, Aryal, Sunil
In the field of Machine Learning (ML) and data-driven applications, one of the significant challenge is the change in data distribution between the training and deployment stages, commonly known as distribution shift. This paper outlines different mechanisms for handling two main types of distribution shifts: (i) Covariate shift: where the value of features or covariates change between train and test data, and (ii) Concept/Semantic-shift: where model experiences shift in the concept learned during training due to emergence of novel classes in the test phase. We sum up our contributions in three folds. First, we formalize distribution shifts, recite on how the conventional method fails to handle them adequately and urge for a model that can simultaneously perform better in all types of distribution shifts. Second, we discuss why handling distribution shifts is important and provide an extensive review of the methods and techniques that have been developed to detect, measure, and mitigate the effects of these shifts. Third, we discuss the current state of distribution shift handling mechanisms and propose future research directions in this area. Overall, we provide a retrospective synopsis of the literature in the distribution shift, focusing on OOD data that had been overlooked in the existing surveys.
Deep Reinforcement Learning for Real-Time Green Energy Integration in Data Centers
--This paper explores the implementation of a Deep Reinforcement Learning (DRL)-Optimized energy management system for e-commerce data centers, aimed at enhancing energy efficiency, cost-effectiveness, and environmental sustainability. The proposed system leverages DRL algorithms to dynamically manage the integration of renewable energy sources, energy storage, and grid power, adapting to fluctuating energy availability in real-time. The study demonstrates that the DRL-Optimized system achieves a 38% reduction in energy costs, significantly outperforming traditional Reinforcement Learning (RL) methods (28%) and heuristic approaches (22%). Additionally, it maintains a low SLA violation rate of 1.5%, compared to 3.0% for RL and 4.8% for heuristic methods. The DRL-Optimized approach also results in an 82% improvement in energy efficiency, surpassing other methods, and a 45% reduction in carbon emissions, making it the most environmentally friendly solution. The system's cumulative reward of 950 reflects its superior performance in balancing multiple objectives. As global e-commerce demand continues to surge, data centers have experienced a significant increase in energy consumption, making energy efficiency an ever more pressing issue. Data centers, the backbone of e-commerce operations, must function continuously to support this infrastructure, resulting in high energy costs and a considerable carbon footprint [1]-[4].
RATE: An LLM-Powered Retrieval Augmented Generation Technology-Extraction Pipeline
Mirhosseini, Karan, Aftab, Arya, Sheikh, Alireza
In an era of radical technology transformations, technology maps play a crucial role in enhancing decision making. These maps heavily rely on automated methods of technology extraction. This paper introduces Retrieval Augmented Technology Extraction (RATE), a Large Language Model (LLM) based pipeline for automated technology extraction from scientific literature. RATE combines Retrieval Augmented Generation (RAG) with multi-definition LLM-based validation. This hybrid method results in high recall in candidate generation alongside with high precision in candidate filtering. While the pipeline is designed to be general and widely applicable, we demonstrate its use on 678 research articles focused on Brain-Computer Interfaces (BCIs) and Extended Reality (XR) as a case study. Consequently, The validated technology terms by RATE were mapped into a co-occurrence network, revealing thematic clusters and structural features of the research landscape. For the purpose of evaluation, a gold standard dataset of technologies in 70 selected random articles had been curated by the experts. In addition, a technology extraction model based on Bidirectional Encoder Representations of Transformers (BERT) was used as a comparative method. RATE achieved F1-score of 91.27%, Significantly outperforming BERT with F1-score of 53.73%. Our findings highlight the promise of definition-driven LLM methods for technology extraction and mapping. They also offer new insights into emerging trends within the BCI-XR field. The source code is available https://github.com/AryaAftab/RATE
VizGenie: Toward Self-Refining, Domain-Aware Workflows for Next-Generation Scientific Visualization
Biswas, Ayan, Turton, Terece L., Ranasinghe, Nishath Rajiv, Jones, Shawn, Love, Bradley, Jones, William, Hagberg, Aric, Shen, Han-Wei, DeBardeleben, Nathan, Lawrence, Earl
We present VizGenie, a self-improving, agentic framework that advances scientific visualization through large language model (LLM) by orchestrating of a collection of domain-specific and dynamically generated modules. Users initially access core functionalities--such as threshold-based filtering, slice extraction, and statistical analysis--through pre-existing tools. For tasks beyond this baseline, VizGenie autonomously employs LLMs to generate new visualization scripts (e.g., VTK Python code), expanding its capabilities on-demand. Each generated script undergoes automated backend validation and is seamlessly integrated upon successful testing, continuously enhancing the system's adaptability and robustness. A distinctive feature of VizGenie is its intuitive natural language interface, allowing users to issue high-level feature-based queries (e.g., ``visualize the skull"). The system leverages image-based analysis and visual question answering (VQA) via fine-tuned vision models to interpret these queries precisely, bridging domain expertise and technical implementation. Additionally, users can interactively query generated visualizations through VQA, facilitating deeper exploration. Reliability and reproducibility are further strengthened by Retrieval-Augmented Generation (RAG), providing context-driven responses while maintaining comprehensive provenance records. Evaluations on complex volumetric datasets demonstrate significant reductions in cognitive overhead for iterative visualization tasks. By integrating curated domain-specific tools with LLM-driven flexibility, VizGenie not only accelerates insight generation but also establishes a sustainable, continuously evolving visualization practice. The resulting platform dynamically learns from user interactions, consistently enhancing support for feature-centric exploration and reproducible research in scientific visualization.