AITopics | Information Fusion

Collaborating Authors

Information Fusion

News Overviews Instructional Materials AI-Alerts Classics

Blockbuster, Part 1: Block-level AI Operator Fusion

arXiv.org Artificial IntelligenceMay-14-2025

Blockbuster is a framework for AI operator fusion in inference programs. The Blockbuster framework is compatible with any multiprocessor architecture that has a tiered memory hierarchy, including GPUs, multi-core CPUs, and some AI accelerator chips. It includes a graph-based representation for AI workloads, called a block program, which explicitly models how blocks of data move between the memory tiers. It also includes an operator fusion procedure, which is made up of a candidate selection algorithm and a fusion algorithm that fuses each individual candidate - this two-algorithm structure makes Blockbuster especially suitable for large AI programs. The current paper focuses on the fusion algorithm, which is a rule-based technique. While the literature is full of previous rule-based fusion algorithms, what sets our algorithm apart is its direct modeling of data movement between memory tiers, resulting in uniquely powerful fusion results. As a first sanity check, we demonstrate how our algorithm automatically rediscovers the well-known Flash Attention kernel. Then, we demonstrate the real power of our approach by fusing LayerNorm with matrix multiplication and RMSNorm with FNN-SwiGLU - the latter involves fusing three matrix multiplications, a Hadamard product, a reduction, and a few elementwise operations into a single mega-kernel.

artificial intelligence, information fusion, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2505.07829

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

TUM2TWIN: Introducing the Large-Scale Multimodal Urban Digital Twin Benchmark Dataset

Wysocki, Olaf, Schwab, Benedikt, Biswanath, Manoj Kumar, Greza, Michael, Zhang, Qilin, Zhu, Jingwei, Froech, Thomas, Heeramaglore, Medhini, Hijazi, Ihab, Kanna, Khaoula, Pechinger, Mathias, Chen, Zhaiyu, Sun, Yao, Segura, Alejandro Rueda, Xu, Ziyang, AbdelGafar, Omar, Mehranfar, Mansour, Yeshwanth, Chandan, Liu, Yueh-Cheng, Yazdi, Hadi, Wang, Jiapan, Auer, Stefan, Anders, Katharina, Bogenberger, Klaus, Borrmann, Andre, Dai, Angela, Hoegner, Ludwig, Holst, Christoph, Kolbe, Thomas H., Ludwig, Ferdinand, Nießner, Matthias, Petzold, Frank, Zhu, Xiao Xiang, Jutzi, Boris

arXiv.org Artificial IntelligenceMay-14-2025

Urban Digital Twins (UDTs) have become essential for managing cities and integrating complex, heterogeneous data from diverse sources. Creating UDTs involves challenges at multiple process stages, including acquiring accurate 3D source data, reconstructing high-fidelity 3D models, maintaining models' updates, and ensuring seamless interoperability to downstream tasks. Current datasets are usually limited to one part of the processing chain, hampering comprehensive UDTs validation. To address these challenges, we introduce the first comprehensive multimodal Urban Digital Twin benchmark dataset: TUM2TWIN. This dataset includes georeferenced, semantically aligned 3D models and networks along with various terrestrial, mobile, aerial, and satellite observations boasting 32 data subsets over roughly 100,000 $m^2$ and currently 767 GB of data. By ensuring georeferenced indoor-outdoor acquisition, high accuracy, and multimodal data integration, the benchmark supports robust analysis of sensors and the development of advanced reconstruction methods. Additionally, we explore downstream tasks demonstrating the potential of TUM2TWIN, including novel view synthesis of NeRF and Gaussian Splatting, solar potential analysis, point cloud semantic segmentation, and LoD3 building reconstruction. We are convinced this contribution lays a foundation for overcoming current limitations in UDT creation, fostering new research directions and practical solutions for smarter, data-driven urban environments. The project is available under: https://tum2t.win

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2505.07396

Country:

North America > United States (0.28)
Europe > Germany > Bavaria (0.14)

Genre: Research Report (0.82)

Industry:

Transportation > Ground > Road (1.00)
Energy > Renewable (1.00)
Government (0.93)
Information Technology (0.93)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(4 more...)

Add feedback

Modular Federated Learning: A Meta-Framework Perspective

Vicente, Frederico, Soares, Cláudia, Jakovetić, Dušan

arXiv.org Artificial IntelligenceMay-14-2025

Federated Learning (FL) enables distributed machine learning training while preserving privacy, representing a paradigm shift for data-sensitive and decentralized environments. Despite its rapid advancements, FL remains a complex and multifaceted field, requiring a structured understanding of its methodologies, challenges, and applications. In this survey, we introduce a meta-framework perspective, conceptualising FL as a composition of modular components that systematically address core aspects such as communication, optimisation, security, and privacy. We provide a historical contextualisation of FL, tracing its evolution from distributed optimisation to modern distributed learning paradigms. Additionally, we propose a novel taxonomy distinguishing Aggregation from Alignment, introducing the concept of alignment as a fundamental operator alongside aggregation. To bridge theory with practice, we explore available FL frameworks in Python, facilitating real-world implementation. Finally, we systematise key challenges across FL sub-fields, providing insights into open research questions throughout the meta-framework modules. By structuring FL within a meta-framework of modular components and emphasising the dual role of Aggregation and Alignment, this survey provides a holistic and adaptable foundation for understanding and advancing FL research and deployment.

evolutionary algorithm, machine learning, neural information processing system, (16 more...)

arXiv.org Artificial Intelligence

2505.08646

Country:

Europe (1.00)
North America > United States (0.68)
Asia (0.67)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.33)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(6 more...)

Add feedback

A Pain Assessment Framework based on multimodal data and Deep Machine Learning methods

Gkikas, Stefanos

arXiv.org Artificial IntelligenceMay-13-2025

From the original abstract: This thesis initially aims to study the pain assessment process from a clinical-theoretical perspective while exploring and examining existing automatic approaches. Building on this foundation, the primary objective of this Ph.D. project is to develop innovative computational methods for automatic pain assessment that achieve high performance and are applicable in real clinical settings. A primary goal is to thoroughly investigate and assess significant factors, including demographic elements that impact pain perception, as recognized in pain research, through a computational standpoint. Within the limits of the available data in this research area, our goal was to design, develop, propose, and offer automatic pain assessment pipelines for unimodal and multimodal configurations that are applicable to the specific requirements of different scenarios. The studies published in this Ph.D. thesis showcased the effectiveness of the proposed methods, achieving state-of-the-art results. Additionally, they paved the way for exploring new approaches in artificial intelligence, foundation models, and generative artificial intelligence.

data mining, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2505.05396

Country:

Europe > Switzerland (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Asia > Middle East > Jordan (0.04)
(12 more...)

Genre:

Summary/Review (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Information Technology (1.00)
Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
(9 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
(6 more...)

Add feedback

NSF-MAP: Neurosymbolic Multimodal Fusion for Robust and Interpretable Anomaly Prediction in Assembly Pipelines

Shyalika, Chathurangi, Prasad, Renjith, Kalach, Fadi El, Venkataramanan, Revathy, Zand, Ramtin, Harik, Ramy, Sheth, Amit

arXiv.org Artificial IntelligenceMay-13-2025

In modern assembly pipelines, identifying anomalies is crucial in ensuring product quality and operational efficiency. Conventional single-modality methods fail to capture the intricate relationships required for precise anomaly prediction in complex predictive environments with abundant data and multiple modalities. This paper proposes a neurosymbolic AI and fusion-based approach for multimodal anomaly prediction in assembly pipelines. We introduce a time series and image-based fusion model that leverages decision-level fusion techniques. Our research builds upon three primary novel approaches in multimodal learning: time series and image-based decision-level fusion modeling, transfer learning for fusion, and knowledge-infused learning. We evaluate the novel method using our derived and publicly available multimodal dataset and conduct comprehensive ablation studies to assess the impact of our preprocessing techniques and fusion model compared to traditional baselines. The results demonstrate that a neurosymbolic AI-based fusion approach that uses transfer learning can effectively harness the complementary strengths of time series and image data, offering a robust and interpretable approach for anomaly prediction in assembly pipelines with enhanced performance. \noindent The datasets, codes to reproduce the results, supplementary materials, and demo are available at https://github.com/ChathurangiShyalika/NSF-MAP.

artificial intelligence, machine learning, prediction, (18 more...)

arXiv.org Artificial Intelligence

2505.06333

Genre:

Research Report > Promising Solution (0.68)
Research Report > New Finding (0.48)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Nature's Insight: A Novel Framework and Comprehensive Analysis of Agentic Reasoning Through the Lens of Neuroscience

Liu, Zinan, Li, Haoran, Lu, Jingyi, Ma, Gaoyuan, Hong, Xu, Iacca, Giovanni, Kumar, Arvind, Tang, Shaojun, Wang, Lin

arXiv.org Artificial IntelligenceMay-12-2025

Autonomous AI is no longer a hard-to-reach concept, it enables the agents to move beyond executing tasks to independently addressing complex problems, adapting to change while handling the uncertainty of the environment. However, what makes the agents truly autonomous? It is agentic reasoning, that is crucial for foundation models to develop symbolic logic, statistical correlations, or large-scale pattern recognition to process information, draw inferences, and make decisions. However, it remains unclear why and how existing agentic reasoning approaches work, in comparison to biological reasoning, which instead is deeply rooted in neural mechanisms involving hierarchical cognition, multimodal integration, and dynamic interactions. In this work, we propose a novel neuroscience-inspired framework for agentic reasoning. Grounded in three neuroscience-based definitions and supported by mathematical and biological foundations, we propose a unified framework modeling reasoning from perception to action, encompassing four core types, perceptual, dimensional, logical, and interactive, inspired by distinct functional roles observed in the human brain. We apply this framework to systematically classify and analyze existing AI reasoning methods, evaluating their theoretical foundations, computational designs, and practical limitations. We also explore its implications for building more generalizable, cognitively aligned agents in physical and virtual environments. Finally, building on our framework, we outline future directions and propose new neural-inspired reasoning methods, analogous to chain-of-thought prompting. By bridging cognitive neuroscience and AI, this work offers a theoretical foundation and practical roadmap for advancing agentic reasoning in intelligent systems. The associated project can be found at: https://github.com/BioRAILab/Awesome-Neuroscience-Agent-Reasoning .

large language model, pattern recognition, simulation of human behavior, (25 more...)

arXiv.org Artificial Intelligence

2505.05515

Country:

North America > United States (0.45)
Asia > China (0.28)
Europe > United Kingdom > England (0.27)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.45)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
(12 more...)

Add feedback

Interpretable graph-based models on multimodal biomedical data integration: A technical review and benchmarking

Sadeghi, Alireza, Hajati, Farshid, Argha, Ahmadreza, Lovell, Nigel H, Yang, Min, Alinejad-Rokny, Hamid

arXiv.org Artificial IntelligenceMay-6-2025

Integrating heterogeneous biomedical data including imaging, omics, and clinical records supports accurate diagnosis and personalised care. Graph-based models fuse such non-Euclidean data by capturing spatial and relational structure, yet clinical uptake requires regulator-ready interpretability. We present the first technical survey of interpretable graph based models for multimodal biomedical data, covering 26 studies published between Jan 2019 and Sep 2024. Most target disease classification, notably cancer and rely on static graphs from simple similarity measures, while graph-native explainers are rare; post-hoc methods adapted from non-graph domains such as gradient saliency, and SHAP predominate. We group existing approaches into four interpretability families, outline trends such as graph-in-graph hierarchies, knowledge-graph edges, and dynamic topology learning, and perform a practical benchmark. Using an Alzheimer disease cohort, we compare Sensitivity Analysis, Gradient Saliency, SHAP and Graph Masking. SHAP and Sensitivity Analysis recover the broadest set of known AD pathways and Gene-Ontology terms, whereas Gradient Saliency and Graph Masking surface complementary metabolic and transport signatures. Permutation tests show all four beat random gene sets, but with distinct trade-offs: SHAP and Graph Masking offer deeper biology at higher compute cost, while Gradient Saliency and Sensitivity Analysis are quicker though coarser. We also provide a step-by-step flowchart covering graph construction, explainer choice and resource budgeting to help researchers balance transparency and performance. This review synthesises the state of interpretable graph learning for multimodal medicine, benchmarks leading techniques, and charts future directions, from advanced XAI tools to under-studied diseases, serving as a concise reference for method developers and translational scientists.

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2505.01696

Country:

Asia > China (0.28)
North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Research Report > Experimental Study (0.93)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Technology > Medical Record (1.00)
(2 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Multimodal and Multiview Deep Fusion for Autonomous Marine Navigation

Dagdilelis, Dimitrios, Grigoriadis, Panagiotis, Galeazzi, Roberto

arXiv.org Artificial IntelligenceMay-6-2025

We propose a cross attention transformer based method for multimodal sensor fusion to build a birds eye view of a vessels surroundings supporting safer autonomous marine navigation. The model deeply fuses multiview RGB and long wave infrared images with sparse LiDAR point clouds. Training also integrates X band radar and electronic chart data to inform predictions. The resulting view provides a detailed reliable scene representation improving navigational accuracy and robustness. Real world sea trials confirm the methods effectiveness even in adverse weather and complex maritime settings.

detection, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2505.01615

Genre: Research Report (1.00)

Industry: Transportation (0.94)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
(3 more...)

Add feedback

Analysis of the Unscented Transform for Cooperative Localization with Ranging-Only Information

Olawoye, Uthman, Kilic, Cagri, Gross, Jason N

arXiv.org Artificial IntelligenceMay-6-2025

--Cooperative localization in multi-agent robotic systems is challenging, especially when agents rely on limited information, such as only peer-to-peer range measurements. Two key challenges arise: utilizing this limited information to improve position estimation; handling uncertainties from sensor noise, nonlinearity, and unknown correlations between agents' measurements; and avoiding information reuse. This paper examines the use of the Unscented Transform (UT) for state estimation for a case in which range measurement between agents and covariance intersection (CI) is used to handle unknown correlations. This makes formulating a CI approach with ranging-only measurements a challenge. T o overcome this, UT is used to handle uncertainties and formulate a cooperative state update using range measurements and current cooperative state estimates. This introduces information reuse in the measurement update. Therefore, this work aims to evaluate the limitations and utility of this formulation when faced with various levels of state measurement uncertainty and errors. Cooperative localization has emerged as a viable strategy for increasing the accuracy and resilience of multi-robot systems' localization [1].

artificial intelligence, information fusion, localization, (15 more...)

arXiv.org Artificial Intelligence

2504.07242

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.93)

Add feedback

Fault-Tolerant Multi-Modal Localization of Multi-Robots on Matrix Lie Groups

Zarei, Mahboubeh, Chhabra, Robin

arXiv.org Artificial IntelligenceMay-5-2025

Consistent localization of cooperative multi-robot systems during navigation presents substantial challenges. This paper proposes a fault-tolerant, multi-modal localization framework for multi-robot systems on matrix Lie groups. We introduce novel stochastic operations to perform composition, differencing, inversion, averaging, and fusion of correlated and non-correlated estimates on Lie groups, enabling pseudo-pose construction for filter updates. The method integrates a combination of proprioceptive and exteroceptive measurements from inertial, velocity, and pose (pseudo-pose) sensors on each robot in an Extended Kalman Filter (EKF) framework. The prediction step is conducted on the Lie group $\mathbb{SE}_2(3) \times \mathbb{R}^3 \times \mathbb{R}^3$, where each robot's pose, velocity, and inertial measurement biases are propagated. The proposed framework uses body velocity, relative pose measurements from fiducial markers, and inter-robot communication to provide scalable EKF update across the network on the Lie group $\mathbb{SE}(3) \times \mathbb{R}^3$. A fault detection module is implemented, allowing the integration of only reliable pseudo-pose measurements from fiducial markers. We demonstrate the effectiveness of the method through experiments with a network of wheeled mobile robots equipped with inertial measurement units, wheel odometry, and ArUco markers. The comparison results highlight the proposed method's real-time performance, superior efficiency, reliability, and scalability in multi-robot localization, making it well-suited for large-scale robotic systems.

artificial intelligence, information fusion, robot, (15 more...)

arXiv.org Artificial Intelligence

2505.00842

Country: North America > Canada > Ontario (0.28)

Genre: Research Report > New Finding (0.92)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)

Add feedback