AITopics | fusion framework

Collaborating Authors

fusion framework

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Cross-Domain Malware Detection via Probability-Level Fusion of Lightweight Gradient Boosting Models

Mohamed, Omar Khalid Ali

arXiv.org Artificial IntelligenceSep-3-2025

The escalating sophistication of malware necessitates robust detection mechanisms that generalize across diverse data sources. Traditional single-dataset models struggle with cross-domain generalization and often incur high computational costs. This paper presents a novel, lightweight framework for malware detection that employs probability-level fusion across three distinct datasets: EMBER (static features), API Call Sequences (behavioral features), and CIC Obfuscated Memory (memory patterns). Our method trains individual LightGBM classifiers on each dataset, selects top predictive features to ensure efficiency, and fuses their prediction probabilities using optimized weights determined via grid search. Extensive experiments demonstrate that our fusion approach achieves a macro F1-score of 0.823 on a cross-domain validation set, significantly outperforming individual models and providing superior generalization. The framework maintains low computational overhead, making it suitable for real-time deployment, and all code and data are provided for full reproducibility.

artificial intelligence, dataset, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2509.00476

Country: Asia > Middle East > Saudi Arabia (0.14)

Genre: Research Report (0.65)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.41)

Add feedback

Multi-Modal Data Fusion for Moisture Content Prediction in Apple Drying

Li, Shichen, Shao, Chenhui

arXiv.org Artificial IntelligenceApr-11-2025

Fruit drying is widely used in food manufacturing to reduce product moisture, ensure product safety, and extend product shelf life. Accurately predicting final moisture content (MC) is critically needed for quality control of drying processes. State-of-the-art methods can build deterministic relationships between process parameters and MC, but cannot adequately account for inherent process variabilities that are ubiquitous in fruit drying. To address this gap, this paper presents a novel multi-modal data fusion framework to effectively fuse two modalities of data: tabular data (process parameters) and high-dimensional image data (images of dried apple slices) to enable accurate MC prediction. The proposed modeling architecture permits flexible adjustment of information portion from tabular and image data modalities. Experimental validation shows that the multi-modal approach improves predictive accuracy substantially compared to state-of-the-art methods. The proposed method reduces root-mean-squared errors by 19.3%, 24.2%, and 15.2% over tabular-only, image-only, and standard tabular-image fusion models, respectively. Furthermore, it is demonstrated that our method is robust in varied tabular-image ratios and capable of effectively capturing inherent small-scale process variabilities. The proposed framework is extensible to a variety of other drying technologies.

artificial intelligence, image data, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2504.07465

Country: North America > United States > Michigan (0.28)

Genre:

Research Report > Promising Solution (0.54)
Research Report > New Finding (0.46)

Industry: Consumer Products & Services > Food, Beverage, Tobacco & Cannabis (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Add feedback

Edge-Enhanced Dilated Residual Attention Network for Multimodal Medical Image Fusion

Zhou, Meng, Zhang, Yuxuan, Xu, Xiaolan, Wang, Jiayi, Khalvati, Farzad

arXiv.org Artificial IntelligenceNov-18-2024

Multimodal medical image fusion is a crucial task that combines complementary information from different imaging modalities into a unified representation, thereby enhancing diagnostic accuracy and treatment planning. While deep learning methods, particularly Convolutional Neural Networks (CNNs) and Transformers, have significantly advanced fusion performance, some of the existing CNNbased methods fall short in capturing fine-grained multiscale and edge features, leading to suboptimal feature integration. Transformer-based models, on the other hand, are computationally intensive in both the training and fusion stages, making them impractical for real-time clinical use. Moreover, the clinical application of fused images remains unexplored. In this paper, we propose a novel CNN-based architecture that addresses these limitations by introducing a Dilated Residual Attention Network Module for effective multiscale feature extraction, coupled with a gradient operator to enhance edge detail learning. To ensure fast and efficient fusion, we present a parameter-free fusion strategy based on the weighted nuclear norm of softmax, which requires no additional computations during training or inference. Extensive experiments, including a downstream brain tumor classification task, demonstrate that our approach outperforms various baseline methods in terms of visual quality, texture preservation, and fusion speed, making it a possible practical solution for real-world clinical applications. Medical imaging plays an increasingly prominent role in clinical diagnosis, it aims to aggregate common and complementary information from different image modalities as well as integrate the information to generate more clearer images (Xie et al., 2023). Medical image fusion can enhance crucial details of anatomy and tissue information from different image modalities and hence helps physicians and radiologists in accurate diagnosis of diseases, e.g., precise localization of tumor boundaries and tissues (Chen et al., 2024) and effective radiotherapy treatments (Safari et al., 2023; Xie et al., 2023).

artificial intelligence, information, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2411.11799

Country:

North America > Canada > Quebec > Montreal (0.28)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Unleash LLMs Potential for Recommendation by Coordinating Twin-Tower Dynamic Semantic Token Generator

Yin, Jun, Zeng, Zhengxin, Li, Mingzheng, Yan, Hao, Li, Chaozhuo, Han, Weihao, Zhang, Jianjin, Liu, Ruochen, Sun, Allen, Deng, Denvy, Sun, Feng, Zhang, Qi, Pan, Shirui, Wang, Senzhang

arXiv.org Artificial IntelligenceSep-13-2024

Owing to the unprecedented capability in semantic understanding and logical reasoning, the pre-trained large language models (LLMs) have shown fantastic potential in developing the next-generation recommender systems (RSs). However, the static index paradigm adopted by current methods greatly restricts the utilization of LLMs capacity for recommendation, leading to not only the insufficient alignment between semantic and collaborative knowledge, but also the neglect of high-order user-item interaction patterns. In this paper, we propose Twin-Tower Dynamic Semantic Recommender (TTDS), the first generative RS which adopts dynamic semantic index paradigm, targeting at resolving the above problems simultaneously. To be more specific, we for the first time contrive a dynamic knowledge fusion framework which integrates a twin-tower semantic token generator into the LLM-based recommender, hierarchically allocating meaningful semantic index for items and users, and accordingly predicting the semantic index of target item. Furthermore, a dual-modality variational auto-encoder is proposed to facilitate multi-grained alignment between semantic and collaborative knowledge. Eventually, a series of novel tuning tasks specially customized for capturing high-order user-item interaction patterns are proposed to take advantages of user historical behavior. Extensive experiments across three public datasets demonstrate the superiority of the proposed methodology in developing LLM-based generative RSs. The proposed TTDS recommender achieves an average improvement of 19.41% in Hit-Rate and 20.84% in NDCG metric, compared with the leading baseline methods.

proceedings, recommendation, semantic index, (12 more...)

arXiv.org Artificial Intelligence

2409.09253

Country:

Asia > China > Beijing > Beijing (0.05)
Asia > China > Hunan Province (0.04)
Oceania > Australia > Queensland > Brisbane (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Heterogeneous Federated Learning with Convolutional and Spiking Neural Networks

Yu, Yingchao, Yan, Yuping, Cai, Jisong, Jin, Yaochu

arXiv.org Artificial IntelligenceJun-13-2024

Federated learning (FL) has emerged as a promising paradigm for training models on decentralized data while safeguarding data privacy. Most existing FL systems, however, assume that all machine learning models are of the same type, although it becomes more likely that different edge devices adopt different types of AI models, including both conventional analogue artificial neural networks (ANNs) and biologically more plausible spiking neural networks (SNNs). This diversity empowers the efficient handling of specific tasks and requirements, showcasing the adaptability and versatility of edge computing platforms. One main challenge of such heterogeneous FL system lies in effectively aggregating models from the local devices in a privacy-preserving manner. To address the above issue, this work benchmarks FL systems containing both convoluntional neural networks (CNNs) and SNNs by comparing various aggregation approaches, including federated CNNs, federated SNNs, federated CNNs for SNNs, federated SNNs for CNNs, and federated CNNs with SNN fusion. Experimental results demonstrate that the CNN-SNN fusion framework exhibits the best performance among the above settings on the MNIST dataset. Additionally, intriguing phenomena of competitive suppression are noted during the convergence process of multi-model FL.

cnn, neural network, snn, (15 more...)

arXiv.org Artificial Intelligence

2406.0968

Country: Asia > China > Hubei Province > Wuhan (0.04)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

On Spectrogram Analysis in a Multiple Classifier Fusion Framework for Power Grid Classification Using Electric Network Frequency

Tzolopoulos, Georgios, Korgialas, Christos, Kotropoulos, Constantine

arXiv.org Artificial IntelligenceMar-27-2024

The Electric Network Frequency (ENF) serves as a unique signature inherent to power distribution systems. Here, a novel approach for power grid classification is developed, leveraging ENF. Spectrograms are generated from audio and power recordings across different grids, revealing distinctive ENF patterns that aid in grid classification through a fusion of classifiers. Four traditional machine learning classifiers plus a Convolutional Neural Network (CNN), optimized using Neural Architecture Search, are developed for One-vs-All classification. This process generates numerous predictions per sample, which are then compiled and used to train a shallow multi-label neural network specifically designed to model the fusion process, ultimately leading to the conclusive class prediction for each sample. Experimental findings reveal that both validation and testing accuracy outperform those of current state-of-the-art classifiers, underlining the effectiveness and robustness of the proposed methodology.

accuracy, classification, classifier, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.5220/0012418400003654

2403.18402

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
Europe > Greece > Central Macedonia > Thessaloniki (0.04)
North America > United States > New York (0.04)
(4 more...)

Genre: Research Report > Promising Solution (0.34)

Industry: Energy > Power Industry (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Tightly-Coupled LiDAR-Visual SLAM Based on Geometric Features for Mobile Agents

Cao, Ke, Liu, Ruiping, Wang, Ze, Peng, Kunyu, Zhang, Jiaming, Zheng, Junwei, Teng, Zhifeng, Yang, Kailun, Stiefelhagen, Rainer

arXiv.org Artificial IntelligenceDec-25-2023

The mobile robot relies on SLAM (Simultaneous Localization and Mapping) to provide autonomous navigation and task execution in complex and unknown environments. However, it is hard to develop a dedicated algorithm for mobile robots due to dynamic and challenging situations, such as poor lighting conditions and motion blur. To tackle this issue, we propose a tightly-coupled LiDAR-visual SLAM based on geometric features, which includes two sub-systems (LiDAR and monocular visual SLAM) and a fusion framework. The fusion framework associates the depth and semantics of the multi-modal geometric features to complement the visual line landmarks and to add direction optimization in Bundle Adjustment (BA). This further constrains visual odometry. On the other hand, the entire line segment detected by the visual subsystem overcomes the limitation of the LiDAR subsystem, which can only perform the local calculation for geometric features. It adjusts the direction of linear feature points and filters out outliers, leading to a higher accurate odometry system. Finally, we employ a module to detect the subsystem's operation, providing the LiDAR subsystem's output as a complementary trajectory to our system while visual subsystem tracking fails. The evaluation results on the public dataset M2DGR, gathered from ground robots across various indoor and outdoor scenarios, show that our system achieves more accurate and robust pose estimation compared to current state-of-the-art multi-modal methods.

geometric feature, point cloud, subsystem, (16 more...)

arXiv.org Artificial Intelligence

2307.07763

Country:

Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.64)

Add feedback

To Fuse or Not to Fuse: Measuring Consistency in Multi-Sensor Fusion for Aerial Robots

Lanegger, Christian, Oleynikova, Helen, Pantic, Michael, Ott, Lionel, Siegwart, Roland

arXiv.org Artificial IntelligenceDec-22-2023

Aerial vehicles are no longer limited to flying in open space: recent work has focused on aerial manipulation and up-close inspection. Such applications place stringent requirements on state estimation: the robot must combine state information from many sources, including onboard odometry and global positioning sensors. However, flying close to or in contact with structures is a degenerate case for many sensing modalities, and the robot's state estimation framework must intelligently choose which sensors are currently trustworthy. We evaluate a number of metrics to judge the reliability of sensing modalities in a multi-sensor fusion framework, then introduce a consensus-finding scheme that uses this metric to choose which sensors to fuse or not to fuse. Finally, we show that such a fusion framework is more robust and accurate than fusing all sensors all the time and demonstrate how such metrics can be informative in real-world experiments in indoor-outdoor flight and bridge inspection.

local estimate, metric, sensor, (16 more...)

arXiv.org Artificial Intelligence

2312.1473

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > Costa Rica > Heredia Province > Heredia (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.71)

Add feedback

MaxCorrMGNN: A Multi-Graph Neural Network Framework for Generalized Multimodal Fusion of Medical Data for Outcome Prediction

D'Souza, Niharika S., Wang, Hongzhi, Giovannini, Andrea, Foncubierta-Rodriguez, Antonio, Beck, Kristen L., Boyko, Orest, Syeda-Mahmood, Tanveer

arXiv.org Artificial IntelligenceJul-13-2023

With the emergence of multimodal electronic health records, the evidence for an outcome may be captured across multiple modalities ranging from clinical to imaging and genomic data. Predicting outcomes effectively requires fusion frameworks capable of modeling fine-grained and multi-faceted complex interactions between modality features within and across patients. We develop an innovative fusion approach called MaxCorr MGNN that models non-linear modality correlations within and across patients through Hirschfeld-Gebelein-Renyi maximal correlation (MaxCorr) embeddings, resulting in a multi-layered graph that preserves the identities of the modalities and patients. We then design, for the first time, a generalized multi-layered graph neural network (MGNN) for task-informed reasoning in multi-layered graphs, that learns the parameters defining patient-modality graph connectivity and message passing in an end-to-end fashion. We evaluate our model an outcome prediction task on a Tuberculosis (TB) dataset consistently outperforming several state-of-the-art neural, graph-based and traditional fusion techniques.

artificial intelligence, machine learning, prediction, (20 more...)

arXiv.org Artificial Intelligence

2307.07093

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
South America > Peru > Lima Department > Lima Province > Lima (0.04)
North America > United States > Nevada (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Event Camera and LiDAR based Human Tracking for Adverse Lighting Conditions in Subterranean Environments

Saucedo, Mario A. V., Patel, Akash, Sawlekar, Rucha, Saradagi, Akshit, Kanellakis, Christoforos, Agha-Mohammadi, Ali-Akbar, Nikolakopoulos, George

arXiv.org Artificial IntelligenceApr-18-2023

In this article, we propose a novel LiDAR and event camera fusion modality for subterranean (SubT) environments for fast and precise object and human detection in a wide variety of adverse lighting conditions, such as low or no light, high-contrast zones and in the presence of blinding light sources. In the proposed approach, information from the event camera and LiDAR are fused to localize a human or an object-of-interest in a robot's local frame. The local detection is then transformed into the inertial frame and used to set references for a Nonlinear Model Predictive Controller (NMPC) for reactive tracking of humans or objects in SubT environments. The proposed novel fusion uses intensity filtering and K-means clustering on the LiDAR point cloud and frequency filtering and connectivity clustering on the events induced in an event camera by the returning LiDAR beams. The centroids of the clusters in the event camera and LiDAR streams are then paired to localize reflective markers present on safety vests and signs in SubT environments. The efficacy of the proposed scheme has been experimentally validated in a real SubT environment (a mine) with a Pioneer 3AT mobile robot. The experimental results show real-time performance for human detection and the NMPC-based controller allows for reactive tracking of a human or object of interest, even in complete darkness.

artificial intelligence, machine learning, subt environment, (17 more...)

arXiv.org Artificial Intelligence

2304.08908

Country: Europe > Sweden (0.47)

Genre: Research Report > New Finding (0.48)

Industry: Energy > Oil & Gas (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.55)

Add feedback