AITopics | Information Fusion

Collaborating Authors

Information Fusion

News Overviews Instructional Materials AI-Alerts Classics

PISA: A Pragmatic Psych-Inspired Unified Memory System for Enhanced AI Agency

Jia, Shian, Huang, Ziyang, Wang, Xinbo, Zhang, Haofei, Song, Mingli

arXiv.org Artificial IntelligenceOct-21-2025

Memory systems are fundamental to AI agents, yet existing work often lacks adaptability to diverse tasks and overlooks the constructive and task-oriented role of AI agent memory. Drawing from Piaget's theory of cognitive development, we propose PISA, a pragmatic, psych-inspired unified memory system that addresses these limitations by treating memory as a constructive and adaptive process. To enable continuous learning and adaptability, PISA introduces a trimodal adaptation mechanism (i.e., schema updation, schema evolution, and schema creation) that preserves coherent organization while supporting flexible memory updates. Building on these schema-grounded structures, we further design a hybrid memory access architecture that seamlessly integrates symbolic reasoning with neural retrieval, significantly improving retrieval accuracy and efficiency. Our empirical evaluation, conducted on the existing LOCOMO benchmark and our newly proposed AggQA benchmark for data analysis tasks, confirms that PISA sets a new state-of-the-art by significantly enhancing adaptability and long-term knowledge retention.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2510.15966

Country: Asia (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
(3 more...)

Add feedback

MILES: Modality-Informed Learning Rate Scheduler for Balancing Multimodal Learning

Guerra-Manzanares, Alejandro, Shamout, Farah E.

arXiv.org Artificial IntelligenceOct-21-2025

The aim of multimodal neural networks is to combine diverse data sources, referred to as modalities, to achieve enhanced performance compared to relying on a single modality. However, training of multimodal networks is typically hindered by modality overfitting, where the network relies excessively on one of the available modalities. This often yields sub-optimal performance, hindering the potential of multimodal learning and resulting in marginal improvements relative to unimodal models. In this work, we present the Modality-Informed Learning ratE Scheduler (MILES) for training multimodal joint fusion models in a balanced manner. MILES leverages the differences in modality-wise conditional utilization rates during training to effectively balance multimodal learning. The learning rate is dynamically adjusted during training to balance the speed of learning from each modality by the multimodal model, aiming for enhanced performance in both multimodal and unimodal predictions. We extensively evaluate MILES on four multimodal joint fusion tasks and compare its performance to seven state-of-the-art baselines. Our results show that MILES outperforms all baselines across all tasks and fusion methods considered in our study, effectively balancing modality usage during training. This results in improved multimodal performance and stronger modality encoders, which can be leveraged when dealing with unimodal samples or absent modalities. Overall, our work highlights the impact of balancing multimodal learning on improving model performance.

artificial intelligence, machine learning, modality, (16 more...)

arXiv.org Artificial Intelligence

2510.17394

Country:

Asia > Middle East > UAE (0.28)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Health Care Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.66)

Add feedback

Automated Video-EEG Analysis in Epilepsy Studies: Advances and Challenges

Zuev, Valerii A., Salmagambetova, Elena G., Djakov, Stepan N., Utkin, Lev V.

arXiv.org Artificial IntelligenceOct-21-2025

Epilepsy is typically diagnosed through electroencephalography (EEG) and long-term video-EEG (vEEG) monitoring. The manual analysis of vEEG recordings is time-consuming, necessitating automated tools for seizure detection. Recent advancements in machine learning have shown promise in real-time seizure detection and prediction using EEG and video data. However, diversity of seizure symptoms, markup ambiguities, and limited availability of multimodal datasets hinder progress. This paper reviews the latest developments in automated video-EEG analysis and discusses the integration of multimodal data. We also propose a novel pipeline for treatment effect estimation from vEEG data using concept-based learning, offering a pathway for future research in this domain.

data mining, machine learning, natural language, (11 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/s10916-025-02274-0

2503.19949

Country:

Europe (1.00)
Asia (1.00)
North America > United States (0.92)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Research Report > Experimental Study (0.67)

Industry:

Health & Medicine > Therapeutic Area > Neurology > Epilepsy (1.00)
Health & Medicine > Therapeutic Area > Genetic Disease (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(5 more...)

Add feedback

Dynamic Recalibration in LiDAR SLAM: Integrating AI and Geometric Methods with Real-Time Feedback Using INAF Fusion

Arjmandi, Zahra, Sohn, Gunho

arXiv.org Artificial IntelligenceOct-20-2025

This paper presents a novel fusion technique for LiDAR Simultaneous Localization and Mapping (SLAM), aimed at improving localization and 3D mapping using LiDAR sensor. Our approach centers on the Inferred Attention Fusion (INAF) module, which integrates AI with geometric odometry. Utilizing the KITTI dataset's LiDAR data, INAF dynamically adjusts attention weights based on environmental feedback, enhancing the system's adaptability and measurement accuracy. This method advances the precision of both localization and 3D mapping, demonstrating the potential of our fusion technique to enhance autonomous navigation systems in complex scenarios.

artificial intelligence, information fusion, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2510.15803

Country: North America > Canada (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
(3 more...)

Add feedback

LVI-Q: Robust LiDAR-Visual-Inertial-Kinematic Odometry for Quadruped Robots Using Tightly-Coupled and Efficient Alternating Optimization

Marsim, Kevin Christiansen, Oh, Minho, Yu, Byeongho, Lee, Seungjae, Nahrendra, I Made Aswin, Lim, Hyungtae, Myung, Hyun

arXiv.org Artificial IntelligenceOct-20-2025

Autonomous navigation for legged robots in complex and dynamic environments relies on robust simultaneous localization and mapping (SLAM) systems to accurately map surroundings and localize the robot, ensuring safe and efficient operation. While prior sensor fusion-based SLAM approaches have integrated various sensor modalities to improve their robustness, these algorithms are still susceptible to estimation drift in challenging environments due to their reliance on unsuitable fusion strategies. Therefore, we propose a robust LiDAR-visual-inertial-kinematic odometry system that integrates information from multiple sensors, such as a camera, LiDAR, inertial measurement unit (IMU), and joint encoders, for visual and LiDAR-based odometry estimation. Our system employs a fusion-based pose estimation approach that runs optimization-based visual-inertial-kinematic odometry (VIKO) and filter-based LiDAR-inertial-kinematic odometry (LIKO) based on measurement availability. In VIKO, we utilize the footpreintegration technique and robust LiDAR-visual depth consistency using superpixel clusters in a sliding window optimization. In LIKO, we incorporate foot kinematics and employ a point-toplane residual in an error-state iterative Kalman filter (ESIKF). Compared with other sensor fusion-based SLAM algorithms, our approach shows robust performance across public and longterm datasets.

artificial intelligence, estimation, information fusion, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/LRA.2025.3598661

2510.1522

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.74)
Information Technology > Artificial Intelligence > Robots > Locomotion (0.72)

Add feedback

deFOREST: Fusing Optical and Radar satellite data for Enhanced Sensing of Tree-loss

Castrillon-Candas, Julio Enrique, Gu, Hanfeng, Meredith, Caleb, Li, Yulin, Tang, Xiaojing, Olofsson, Pontus, Kon, Mark

arXiv.org Machine LearningOct-17-2025

In this paper we develop a deforestation detection pipeline that incorporates optical and Synthetic Aperture Radar (SAR) data. A crucial component of the pipeline is the construction of anomaly maps of the optical data, which is done using the residual space of a discrete Karhunen-Loève (KL) expansion. Anomalies are quantified using a concentration bound on the distribution of the residual components for the nominal state of the forest. This bound does not require prior knowledge on the distribution of the data. This is in contrast to statistical parametric methods that assume knowledge of the data distribution, an impractical assumption that is especially infeasible for high dimensional data such as ours. Once the optical anomaly maps are computed they are combined with SAR data, and the state of the forest is classified by using a Hidden Markov Model (HMM). We test our approach with Sentinel-1 (SAR) and Sentinel-2 (Optical) data on a $92.19\,km \times 91.80\,km$ region in the Amazon forest. The results show that both the hybrid optical-radar and optical only methods achieve high accuracy that is superior to the recent state-of-the-art hybrid method. Moreover, the hybrid method is significantly more robust in the case of sparse optical data that are common in highly cloudy regions.

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

2510.14092

Country:

Asia > Southeast Asia (0.04)
South America > Brazil (0.04)
North America > United States > Massachusetts > Middlesex County > Natick (0.04)
(8 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Energy (0.71)
Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.89)

Add feedback

Multimodal Fusion and Vision-Language Models: A Survey for Robot Vision

Han, Xiaofeng, Chen, Shunpeng, Fu, Zenghuang, Feng, Zhe, Fan, Lue, An, Dong, Wang, Changwei, Guo, Li, Meng, Weiliang, Zhang, Xiaopeng, Xu, Rongtao, Xu, Shibiao

arXiv.org Artificial IntelligenceOct-16-2025

Robot vision has greatly benefited from advancements in multimodal fusion techniques and vision-language models (VLMs). We adopt a task-oriented perspective to systematically review the applications and advancements of multimodal fusion methods and VLMs in the field of robot vision. For semantic scene understanding tasks, we categorize fusion approaches into encoder-decoder frameworks, attention-based architectures, and graph neural networks. Meanwhile, we also analyze the architectural characteristics and practical implementations of these fusion strategies in key tasks such as simultaneous localization and mapping (SLAM), 3D object detection, navigation, and manipulation. We compare the evolutionary paths and applicability of VLMs based on large language models (LLMs) with traditional multimodal fusion methods.Additionally, we conduct an in-depth analysis of commonly used datasets, evaluating their applicability and challenges in real-world robotic scenarios. Building on this analysis, we identify key challenges in current research, including cross-modal alignment, efficient fusion, real-time deployment, and domain adaptation. We propose future directions such as self-supervised learning for robust multimodal representations, structured spatial memory and environment modeling to enhance spatial intelligence, and the integration of adversarial robustness and human feedback mechanisms to enable ethically aligned system deployment. Through a comprehensive review, comparative analysis, and forward-looking discussion, we provide a valuable reference for advancing multimodal perception and interaction in robotic vision. A comprehensive list of studies in this survey is available at https://github.com/Xiaofeng-Han-Res/MF-RV.

arxiv preprint arxiv, large language model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.inffus.2025.103652

2504.02477

Country:

Europe (0.67)
North America > United States > Minnesota (0.27)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.67)
Research Report > New Finding (0.45)

Industry:

Energy (1.00)
Automobiles & Trucks (0.93)
Health & Medicine > Therapeutic Area > Neurology (0.92)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
(3 more...)

Add feedback

Classifier-Augmented Generation for Structured Workflow Prediction

Gschwind, Thomas, Chakraborty, Shramona, Gupta, Nitin, Mehta, Sameep

arXiv.org Artificial IntelligenceOct-16-2025

ETL (Extract, Transform, Load) tools such as IBM DataStage allow users to visually assemble complex data workflows, but configuring stages and their properties remains time consuming and requires deep tool knowledge. We propose a system that translates natural language descriptions into executable workflows, automatically predicting both the structure and detailed configuration of the flow. At its core lies a Classifier-Augmented Generation (CAG) approach that combines utterance decomposition with a classifier and stage-specific few-shot prompting to produce accurate stage predictions. These stages are then connected into non-linear workflows using edge prediction, and stage properties are inferred from sub-utterance context. We compare CAG against strong single-prompt and agentic baselines, showing improved accuracy and efficiency, while substantially reducing token usage. Our architecture is modular, interpretable, and capable of end-to-end workflow generation, including robust validation steps. To our knowledge, this is the first system with a detailed evaluation across stage prediction, edge layout, and property generation for natural-language-driven ETL authoring.

large language model, machine learning, utterance, (20 more...)

arXiv.org Artificial Intelligence

2510.12825

Genre: Workflow (1.00)

Industry: Information Technology (0.37)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.88)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Knowledge Fusion via Bidirectional Information Aggregation

Zhai, Songlin, Qi, Guilin, Wang, Yue, Meng, Yuan

arXiv.org Artificial IntelligenceOct-15-2025

Knowledge graphs (KGs) are the cornerstone of the semantic web, offering up-to-date representations of real-world entities and relations. Yet large language models (LLMs) remain largely static after pre-training, causing their internal knowledge to become outdated and limiting their utility in time-sensitive web applications. To bridge this gap between dynamic knowledge and static models, a prevalent approach is to enhance LLMs with KGs. However, prevailing methods typically rely on parameter-invasive fine-tuning, which risks catastrophic forgetting and often degrades LLMs' general capabilities. Moreover, their static integration frameworks cannot keep pace with the continuous evolution of real-world KGs, hindering their deployment in dynamic web environments. To bridge this gap, we introduce KGA (\textit{\underline{K}nowledge \underline{G}raph-guided \underline{A}ttention}), a novel framework that dynamically integrates external KGs into LLMs exclusively at inference-time without any parameter modification. Inspired by research on neuroscience, we rewire the self-attention module by innovatively introducing two synergistic pathways: a \textit{bottom-up knowledge fusion} pathway and a \textit{top-down attention guidance} pathway. The \textit{bottom-up pathway} dynamically integrates external knowledge into input representations via input-driven KG fusion, which is akin to the \textit{stimulus-driven attention process} in the human brain. Complementarily, the \textit{top-down pathway} aims to assess the contextual relevance of each triple through a \textit{goal-directed verification process}, thereby suppressing task-irrelevant signals and amplifying knowledge-relevant patterns. By synergistically combining these two pathways, our method supports real-time knowledge fusion. Extensive experiments on four benchmarks verify KGA's strong fusion performance and efficiency.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2507.08704

Country:

Europe (1.00)
Asia > Middle East > UAE (0.28)
North America > United States > Texas (0.28)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Galilean Symmetry in Robotics

Mahony, Robert, Kelly, Jonathan, Weiss, Stephan

arXiv.org Artificial IntelligenceOct-14-2025

Galilean symmetry is the natural symmetry of inertial motion that underpins Newtonian physics. Although rigid-body symmetry is one of the most established and fundamental tools in robotics, there appears to be no comparable treatment of Galilean symmetry for a robotics audience. In this paper, we present a robotics-tailored exposition of Galilean symmetry that leverages the community's familiarity with and understanding of rigid-body transformations and pose representations. Our approach contrasts with common treatments in the physics literature that introduce Galilean symmetry as a stepping stone to Einstein's relativity. A key insight is that the Galilean matrix Lie group can be used to describe two different pose representations, Galilean frames, that use inertial velocity in the state definition, and extended poses, that use coordinate velocity. We provide three examples where applying the Galilean matrix Lie-group algebra to robotics problems is straightforward and yields significant insights: inertial navigation above the rotating Earth, manipulator kinematics, and sensor data fusion under temporal uncertainty. We believe that the time is right for the robotics community to benefit from rediscovering and extending this classical material and applying it to modern problems.

artificial intelligence, information fusion, reference frame, (15 more...)

arXiv.org Artificial Intelligence

2510.10468

Country: North America > Canada > Ontario (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)

Add feedback