Information Fusion
Where is the Boundary: Multimodal Sensor Fusion Test Bench for Tissue Boundary Delineation
Chen, Zacharias, Cahilig, Alexa Cristelle, Dias, Sarah, Kolar, Prithu, Prakash, Ravi, Codd, Patrick J.
Robot-assisted neurological surgery is receiving growing interest due to the improved dexterity, precision, and control of surgical tools, which results in better patient outcomes. However, such systems often limit surgeons' natural sensory feedback, which is crucial in identifying tissues -- particularly in oncological procedures where distinguishing between healthy and tumorous tissue is vital. While imaging and force sensing have addressed the lack of sensory feedback, limited research has explored multimodal sensing options for accurate tissue boundary delineation. We present a user-friendly, modular test bench designed to evaluate and integrate complementary multimodal sensors for tissue identification. Our proposed system first uses vision-based guidance to estimate boundary locations with visual cues, which are then refined using data acquired by contact microphones and a force sensor. Real-time data acquisition and visualization are supported via an interactive graphical interface. Experimental results demonstrate that multimodal fusion significantly improves material classification accuracy. The platform provides a scalable hardware-software solution for exploring sensor fusion in surgical applications and demonstrates the potential of multimodal approaches in real-time tissue boundary delineation.
MDD-Net: Multimodal Depression Detection through Mutual Transformer
Haque, Md Rezwanul, Islam, Md. Milon, Raju, S M Taslim Uddin, Altaheri, Hamdi, Nassar, Lobna, Karray, Fakhri
Depression is a major mental health condition that severely impacts the emotional and physical well-being of individuals. The ease of data collection from social media platforms has attracted significant interest in using this information for mental health research. This work proposes a Multimodal Depression Detection Network (MDD-Net) that uses acoustic and visual data obtained from social media networks and exploits mutual transformers to efficiently extract and fuse multimodal features for depression detection. The MDD-Net consists of four core modules: an acoustic feature extraction module for retrieving relevant acoustic attributes, a visual feature extraction module for extracting significant high-level patterns, a mutual transformer for computing the correlations among the generated features and fusing these features from multiple modalities, and a detection layer for detecting depression using the fused feature representations. Extensive experiments are performed using the multimodal D-Vlog dataset, and the findings reveal that the developed multimodal depression detection network surpasses the state-of-the-art by up to 17.37% for F1-Score, demonstrating the superior performance of the proposed system. The source code is accessible at https://github.
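As a rough illustration of the fusion idea described in this abstract, the sketch below lets acoustic features attend over visual features and vice versa, then concatenates the pooled results. It is a minimal sketch under stated assumptions, not the MDD-Net architecture: there are no learned projections, both modalities are assumed to share the same feature dimension, and the names `cross_attend` and `mutual_fuse` are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys_values):
    """Scaled dot-product attention where keys_values serve as both keys and values.

    Assumption: no learned query/key/value projections, for brevity.
    """
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in keys_values
        ]
        weights = softmax(scores)
        out.append(
            [sum(w * v[j] for w, v in zip(weights, keys_values)) for j in range(d)]
        )
    return out


def mean_pool(seq):
    """Average a sequence of vectors into a single vector."""
    return [sum(col) / len(seq) for col in zip(*seq)]


def mutual_fuse(acoustic, visual):
    """Bidirectional cross-attention followed by mean pooling and concatenation."""
    a_to_v = cross_attend(acoustic, visual)  # acoustic queries, visual context
    v_to_a = cross_attend(visual, acoustic)  # visual queries, acoustic context
    return mean_pool(a_to_v) + mean_pool(v_to_a)


if __name__ == "__main__":
    acoustic = [[1.0, 0.0], [0.0, 1.0]]
    visual = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
    print(mutual_fuse(acoustic, visual))
```

The fused vector has twice the per-modality feature dimension and could be passed to any classifier; a real system would add learned projections, multiple heads, and normalization.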
FNBT: Full Negation Belief Transformation for Open-World Information Fusion Based on Dempster-Shafer Theory of Evidence
He, Meishen, Ma, Wenjun, Wang, Jiao, Yue, Huijun, Fan, Xiaoma
The Dempster-Shafer theory of evidence has been widely applied in the field of information fusion under uncertainty. Most existing research focuses on combining evidence within the same frame of discernment. However, in real-world scenarios, trained algorithms or data often originate from different regions or organizations, where data silos are prevalent. As a result, using different data sources or models to generate basic probability assignments may lead to heterogeneous frames, for which traditional fusion methods often yield unsatisfactory results. To address this challenge, this study proposes an open-world information fusion method, termed Full Negation Belief Transformation (FNBT), based on the Dempster-Shafer theory. More specifically, a criterion is first introduced to determine whether a given fusion task belongs to the open-world setting. Then, by extending the frames, the method can accommodate elements from heterogeneous frames. Finally, a full negation mechanism is employed to transform the mass functions, so that existing combination rules can be applied to the transformed mass functions for such information fusion. Theoretically, the proposed method satisfies three desirable properties, which are formally proven: mass function invariance, heritability, and essential conflict elimination. Empirically, FNBT demonstrates superior performance in pattern classification tasks on real-world datasets and successfully resolves Zadeh's counterexample, thereby validating its practical effectiveness.
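For context on the Zadeh's counterexample mentioned above, the following is a minimal sketch of the classical Dempster combination rule on a single frame of discernment. It does not implement FNBT; it only shows the baseline behavior that motivates such transformations. The function name `dempster_combine` and the dictionary representation of mass functions are assumptions made for illustration.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with the classical Dempster rule.

    Each mass function is a dict mapping frozenset (focal element) to mass.
    Returns (combined_mass_function, conflict), where conflict is the total
    mass assigned to empty intersections before normalization.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb
    norm = 1.0 - conflict
    if norm < 1e-12:
        raise ValueError("Sources are in total conflict; the rule is undefined.")
    return {k: v / norm for k, v in combined.items()}, conflict


if __name__ == "__main__":
    # Zadeh's counterexample: two sources strongly disagree (A vs. C)
    # and each gives only a small mass to B.
    A, B, C = frozenset("A"), frozenset("B"), frozenset("C")
    m1 = {A: 0.99, B: 0.01}
    m2 = {C: 0.99, B: 0.01}
    fused, conflict = dempster_combine(m1, m2)
    print(fused, conflict)
```

In this example nearly all of the joint mass is conflicting, and normalization pushes almost the entire belief onto B, the hypothesis both sources considered very unlikely. That counterintuitive outcome is the behavior the abstract refers to.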
Multimodal AI Systems for Enhanced Laying Hen Welfare Assessment and Productivity Optimization
Essien, Daniel, Neethirajan, Suresh
The future of poultry production depends on a paradigm shift replacing subjective, labor-intensive welfare checks with data-driven, intelligent monitoring ecosystems. Traditional welfare assessments, limited by human observation and single-sensor data, cannot fully capture the complex, multidimensional nature of laying hen welfare in modern farms. Multimodal Artificial Intelligence (AI) offers a breakthrough, integrating visual, acoustic, environmental, and physiological data streams to reveal deeper insights into avian welfare dynamics. This investigation highlights multimodal AI's transformative potential, showing that intermediate (feature-level) fusion strategies achieve the best balance between robustness and performance under real-world poultry conditions, and offer greater scalability than early or late fusion approaches. Key adoption barriers include sensor fragility in harsh farm environments, high deployment costs, inconsistent behavioral definitions, and limited cross-farm generalizability. To address these, we introduce two novel evaluation tools: the Domain Transfer Score (DTS) to measure model adaptability across diverse farm settings, and the Data Reliability Index (DRI) to assess sensor data quality under operational constraints. We also propose a modular, context-aware deployment framework designed for laying hen environments, enabling scalable and practical integration of multimodal sensing. This work lays the foundation for a transition from reactive, unimodal monitoring to proactive, precision-driven welfare systems that unite productivity with ethical, science-based animal care.
Intersectoral Knowledge in AI and Urban Studies: A Framework for Transdisciplinary Research
Transdisciplinary approaches are increasingly essential for addressing grand societal challenges, particularly in complex domains such as Artificial Intelligence (AI), urban planning, and social sciences. However, effectively validating and integrating knowledge across distinct epistemic and ontological perspectives poses significant difficulties. This article proposes a six-dimensional framework for assessing and strengthening transdisciplinary knowledge validity in AI and city studies, based on an extensive analysis of the most cited research (2014--2024). Specifically, the framework classifies research orientations according to ontological, epistemological, methodological, teleological, axiological, and valorization dimensions. Our findings show a predominance of perspectives aligned with critical realism (ontological), positivism (epistemological), analytical methods (methodological), consequentialism (teleological), epistemic values (axiological), and social/economic valorization. Less common stances, such as idealism, mixed methods, and cultural valorization, are also examined for their potential to enrich knowledge production. We highlight how early career researchers and transdisciplinary teams can leverage this framework to reconcile divergent disciplinary viewpoints and promote socially accountable outcomes.
Integrating Neurosymbolic AI in Advanced Air Mobility: A Comprehensive Survey
Acharya, Kamal, Sharifi, Iman, Lad, Mehul, Sun, Liang, Song, Houbing
Neurosymbolic AI combines neural network adaptability with symbolic reasoning, offering a promising approach to the complex regulatory, operational, and safety challenges in Advanced Air Mobility (AAM). This survey reviews its applications across key AAM domains such as demand forecasting, aircraft design, and real-time air traffic management. Our analysis reveals a fragmented research landscape where methodologies, including Neurosymbolic Reinforcement Learning, have shown potential for dynamic optimization but still face hurdles in scalability, robustness, and compliance with aviation standards. We classify current advancements, present relevant case studies, and outline future research directions aimed at integrating these approaches into reliable, transparent AAM systems. By linking advanced AI techniques with AAM's operational demands, this work provides a concise roadmap for researchers and practitioners developing next-generation air mobility solutions.
Fusing Cross-Domain Knowledge from Multimodal Data to Solve Problems in the Physical World
The proliferation of artificial intelligence has enabled a diversity of applications that bridge the gap between digital and physical worlds. As physical environments are too complex to model through a single information acquisition approach, it is crucial to fuse multimodal data generated by different sources, such as sensors, devices, systems, and people, to solve a problem in the real world. Unfortunately, it is neither practical nor sustainable to deploy new resources to collect original data from scratch for every problem. Thus, when data is inadequate in the domain of the problem, it is vital to fuse knowledge from multimodal data that is already available in other domains. We call this cross-domain knowledge fusion. Existing research focuses on fusing multimodal data in a single domain, supposing the knowledge from different datasets is intrinsically aligned; however, this assumption may not hold in the scenarios of cross-domain knowledge fusion. In this paper, we formally define the cross-domain multimodal data fusion problem, discussing its unique challenges, differences, and advantages beyond data fusion in a single domain. We propose a four-layer framework, consisting of Domains, Links, Models, and Data layers, answering three key questions: "what to fuse", "why it can be fused", and "how to fuse". The Domains Layer selects relevant data from different domains for a given problem. The Links Layer reveals the philosophy of knowledge alignment beyond specific model structures. The Models Layer provides two knowledge fusion paradigms based on the fundamental mechanisms for processing data. The Data Layer turns data of different structures, resolutions, scales, and distributions into a consistent representation that can be fed into an AI model. With this framework, we can design solutions that fuse cross-domain multimodal data effectively for solving real-world problems.
A Multi-view Landmark Representation Approach with Application to GNSS-Visual-Inertial Odometry
Hua, Tong, Han, Jiale, Ouyang, Wei
The Invariant Extended Kalman Filter (IEKF) has been a significant technique in vision-aided sensor fusion. However, it usually suffers from a high computational burden when jointly optimizing camera poses and landmarks. To improve its efficiency and applicability for multi-sensor fusion, we present a multi-view pose-only estimation approach and apply it to GNSS-Visual-Inertial Odometry (GVIO). Our main contribution is deriving a visual measurement model that directly associates the landmark representation with multiple camera poses and observations. Such a pose-only measurement is proven to be tightly coupled between landmarks and poses, and to maintain a perfect null space that is independent of the estimated poses. Finally, we apply the proposed approach to a filter-based GVIO with a novel feature management strategy. Both simulation tests and real-world experiments are conducted to demonstrate the superiority of the proposed method in terms of efficiency and accuracy.