Cheng, Mingxi
Neuro-Inspired Information-Theoretic Hierarchical Perception for Multimodal Learning
Xiao, Xiongye, Liu, Gengshuo, Gupta, Gaurav, Cao, Defu, Li, Shixuan, Li, Yaxing, Fang, Tianqing, Cheng, Mingxi, Bogdan, Paul
Integrating and processing information from various sources or modalities is critical for obtaining a comprehensive and accurate perception of the real world in autonomous systems and cyber-physical systems. Drawing inspiration from neuroscience, we develop the Information-Theoretic Hierarchical Perception (ITHP) model, which utilizes the concept of the information bottleneck. Unlike most traditional fusion models that incorporate all modalities identically in neural networks, our model designates a prime modality and regards the remaining modalities as detectors in the information pathway, serving to distill the flow of information. Our proposed perception model focuses on constructing an effective and compact information flow by achieving a balance between the minimization of mutual information between the latent state and the input modal state, and the maximization of mutual information between the latent states and the remaining modal states. This approach leads to compact latent state representations that retain relevant information while minimizing redundancy, thereby substantially enhancing the performance of multimodal representation learning. Experimental evaluations on the MUStARD, CMU-MOSI, and CMU-MOSEI datasets demonstrate that our model consistently distills crucial information in multimodal learning scenarios, outperforming state-of-the-art benchmarks. Remarkably, on the CMU-MOSI dataset, ITHP surpasses human-level performance in the multimodal sentiment binary classification task across all evaluation metrics (i.e., Binary Accuracy, F1 Score, Mean Absolute Error, and Pearson Correlation).
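The trade-off described in this abstract, minimizing mutual information with the input modality while maximizing it with the remaining modalities, can be illustrated with a minimal sketch for discrete variables. The function names, the weight beta, and the use of explicit joint probability tables are assumptions made for illustration; the paper's actual model is a neural network and is not reproduced here.

```python
import math


def mutual_information(joint):
    """Mutual information I(A;B) in nats for a joint probability table joint[a][b]."""
    p_a = [sum(row) for row in joint]        # marginal over rows
    p_b = [sum(col) for col in zip(*joint)]  # marginal over columns
    mi = 0.0
    for a, row in enumerate(joint):
        for b, p in enumerate(row):
            if p > 0.0:  # zero-probability cells contribute nothing
                mi += p * math.log(p / (p_a[a] * p_b[b]))
    return mi


def ib_objective(joint_input_latent, joint_latent_detector, beta):
    """Hypothetical bottleneck-style objective: I(input; latent) - beta * I(latent; detector).

    Lower values mean a more compressed latent state that still
    predicts the detector modality well.
    """
    compression = mutual_information(joint_input_latent)
    relevance = mutual_information(joint_latent_detector)
    return compression - beta * relevance


# A latent state that copies a binary input exactly carries log(2) nats about it.
copied = [[0.5, 0.0], [0.0, 0.5]]
# A latent state that is independent of the detector modality carries none.
independent = [[0.25, 0.25], [0.25, 0.25]]
score = ib_objective(copied, independent, beta=1.0)
```

In this toy case the objective equals log(2): the latent state pays the full compression cost and gains no relevance, which is the kind of representation the balance is meant to penalize.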
Unlocking Deep Learning: A BP-Free Approach for Parallel Block-Wise Training of Neural Networks
Cheng, Anzhe, Wang, Zhenkun, Yin, Chenzhong, Cheng, Mingxi, Ping, Heng, Xiao, Xiongye, Nazarian, Shahin, Bogdan, Paul
Backpropagation (BP) has been a successful optimization technique for deep learning models. However, its limitations, such as backward- and update-locking, and its biological implausibility, hinder the concurrent updating of layers and do not mimic the local learning processes observed in the human brain. To address these issues, recent research has suggested using local error signals to asynchronously train network blocks. However, this approach often involves extensive trial-and-error iterations to determine the best configuration for local training. This includes decisions on how to decouple network blocks and which auxiliary networks to use for each block. In our work, we introduce a novel BP-free approach: a block-wise BP-free (BWBPF) neural network that leverages local error signals to optimize distinct sub-neural networks separately, where the global loss is only responsible for updating the output layer. The local error signals used in the BP-free model can be computed in parallel, enabling a potential speed-up in the weight update process through parallel implementation. Our experimental results consistently show that this approach can identify transferable decoupled architectures for VGG and ResNet variations, outperforming models trained with end-to-end backpropagation and other state-of-the-art block-wise learning techniques on datasets such as CIFAR-10 and Tiny-ImageNet. The code is released at https://github.com/Belis0811/BWBPF.
Discovering Malicious Signatures in Software from Structural Interactions
Yin, Chenzhong, Zhang, Hantang, Cheng, Mingxi, Xiao, Xiongye, Chen, Xinghe, Ren, Xin, Bogdan, Paul
Malware represents a significant security concern in today's digital landscape, as it can destroy or disable operating systems, steal sensitive user information, and occupy valuable disk space. However, current malware detection methods, such as static-based and dynamic-based approaches, struggle to identify newly developed ("zero-day") malware and are limited by customized virtual machine (VM) environments. To overcome these limitations, we propose a novel malware detection approach that leverages deep learning, mathematical techniques, and network science. Our approach focuses on static and dynamic analysis and utilizes the Low-Level Virtual Machine (LLVM) to profile applications within a complex network. The generated network topologies are input into the GraphSAGE architecture to efficiently distinguish between benign and malicious software applications, with the operation names denoted as node features. Importantly, the GraphSAGE models analyze the network's topological geometry to make predictions, enabling them to detect state-of-the-art malware and prevent potential damage during execution in a VM. To evaluate our approach, we conduct a study on a dataset comprising source code from 24,376 applications, specifically written in C/C++, sourced directly from widely recognized malware and various types of benign software. The results show a high detection performance with an Area Under the Receiver Operating Characteristic Curve (AUROC) of 99.85%. Our approach marks a substantial improvement in malware detection, providing a notably more accurate and efficient solution when compared to current state-of-the-art malware detection methods.
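For readers unfamiliar with the AUROC metric reported in this abstract, the following minimal sketch computes it as the probability that a randomly chosen malicious sample receives a higher score than a randomly chosen benign one. The function name and the toy scores are illustrative assumptions; this is not the paper's evaluation code, and the quadratic pairwise loop is chosen for clarity rather than speed.

```python
def auroc(scores, labels):
    """AUROC via pairwise comparison; label 1 = malicious, 0 = benign.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy detector scores (hypothetical): one malicious sample is ranked
# below one benign sample, so three of the four pairs are ordered correctly.
toy_scores = [0.9, 0.2, 0.8, 0.1]
toy_labels = [1, 1, 0, 0]
toy_auroc = auroc(toy_scores, toy_labels)
```

A value of 1.0 means every malicious sample outranks every benign one, and 0.5 corresponds to random ranking.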
Neuro-Inspired Hierarchical Multimodal Learning
Xiao, Xiongye, Liu, Gengshuo, Gupta, Gaurav, Cao, Defu, Li, Shixuan, Li, Yaxing, Fang, Tianqing, Cheng, Mingxi, Bogdan, Paul
Integrating and processing information from various sources or modalities is critical for obtaining a comprehensive and accurate perception of the real world. Drawing inspiration from neuroscience, we develop the Information-Theoretic Hierarchical Perception (ITHP) model, which utilizes the concept of the information bottleneck. Unlike most traditional fusion models that aim to incorporate all modalities as input, our model designates the prime modality as input, while the remaining modalities act as detectors in the information pathway. Our proposed perception model focuses on constructing an effective and compact information flow by achieving a balance between the minimization of mutual information between the latent state and the input modal state, and the maximization of mutual information between the latent states and the remaining modal states. This approach leads to compact latent state representations that retain relevant information while minimizing redundancy, thereby substantially enhancing the performance of downstream tasks. Experimental evaluations on both the MUStARD and CMU-MOSI datasets demonstrate that our model consistently distills crucial information in multimodal learning scenarios, outperforming state-of-the-art benchmarks. Remarkably, on the CMU-MOSI dataset, ITHP-DeBERTa surpasses human-level performance in the multimodal sentiment binary classification task across all evaluation metrics (i.e., Binary Accuracy, F1 Score, Mean Absolute Error, and Pearson Correlation).
Leader-Follower Neural Networks with Local Error Signals Inspired by Complex Collectives
Yin, Chenzhong, Cheng, Mingxi, Xiao, Xiongye, Chen, Xinghe, Nazarian, Shahin, Irimia, Andrei, Bogdan, Paul
The collective behavior of a network with heterogeneous, resource-limited information processing units (e.g., group of fish, flock of birds, or network of neurons) demonstrates high self-organization and complexity. These emergent properties arise from simple interaction rules where certain individuals can exhibit leadership-like behavior and influence the collective activity of the group. Motivated by the intricacy of these collectives, we propose a neural network (NN) architecture inspired by the rules observed in nature's collective ensembles. This NN structure contains workers that encompass one or more information processing units (e.g., neurons, filters, layers, or blocks of layers). Workers are either leaders or followers, and we train a leader-follower neural network (LFNN) by leveraging local error signals and optionally incorporating backpropagation (BP) and global loss. We investigate worker behavior and evaluate LFNNs through extensive experimentation. Our LFNNs trained with local error signals achieve significantly lower error rates than previous BP-free algorithms on MNIST and CIFAR-10 and even surpass BP-enabled baselines. In the case of ImageNet, our LFNN-l demonstrates superior scalability and outperforms previous BP-free algorithms by a significant margin.
Fractional dynamics foster deep learning of COPD stage prediction
Yin, Chenzhong, Udrescu, Mihai, Gupta, Gaurav, Cheng, Mingxi, Lihu, Andrei, Udrescu, Lucretia, Bogdan, Paul, Mannino, David M, Mihaicuta, Stefan
Chronic obstructive pulmonary disease (COPD) is one of the leading causes of death worldwide, usually associated with smoking and environmental occupational exposures. Prior studies have shown that current COPD diagnosis (i.e., spirometry test) can be unreliable because the test can be difficult to perform and depends on adequate effort from the testee and supervision by the tester. Moreover, the extensive early detection and diagnosis of COPD is challenging. We address the COPD detection problem by constructing two novel COPD physiological signals datasets (4432 medical records from 54 patients in the WestRo COPD dataset and 13824 medical records from 534 patients in the WestRo Porti COPD dataset), demonstrating their complex coupled fractal dynamical characteristics, and performing a rigorous fractional-order dynamics deep learning analysis to diagnose COPD with high accuracy. We find that the fractional-order dynamical modeling can extract distinguishing signatures from the physiological signals across patients with all COPD stages, from stage 0 (healthy) to stage 4 (very severe). We exploit these fractional signatures to develop and train a deep neural network that predicts the suspected patients' COPD stages based on the input features (such as thorax breathing effort, respiratory rate, or oxygen saturation levels). We show that our COPD diagnostics method (fractional dynamic deep learning model) achieves a high prediction accuracy (98.66% ± 0.45%) on the WestRo COPD dataset and can serve as an excellent and robust alternative to traditional spirometry-based medical diagnosis. Our fractional dynamic deep learning model (FDDLM) for COPD diagnosis also presents high prediction accuracy when validated by a dataset with different physiological signals recorded (i.e., 94.01%
Trust-aware Control for Intelligent Transportation Systems
Cheng, Mingxi, Zhang, Junyao, Nazarian, Shahin, Deshmukh, Jyotirmoy, Bogdan, Paul
Many intelligent transportation systems are multi-agent systems, i.e., both the traffic participants and the subsystems within the transportation infrastructure can be modeled as interacting agents. The use of AI-based methods to achieve coordination among the different agents can provide greater safety than transportation systems containing only human-operated vehicles, and also improve system efficiency in terms of traffic throughput, sensing range, and enabling collaborative tasks. However, increased autonomy makes the transportation infrastructure vulnerable to compromised vehicular agents or infrastructure. This paper proposes a new framework that embeds a trust authority into the transportation infrastructure to systematically quantify the trustworthiness of agents using an epistemic logic known as subjective logic. In this paper, we make the following novel contributions: (i) We propose a framework for using the quantified trustworthiness of agents to enable trust-aware coordination and control. (ii) We demonstrate how to synthesize trust-aware controllers using an approach based on reinforcement learning. (iii) We comprehensively analyze an autonomous intersection management (AIM) case study and develop a trust-aware version called AIM-Trust that leads to lower accident rates in scenarios consisting of a mixture of trusted and untrusted agents.
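As a rough illustration of how subjective logic can quantify an agent's trustworthiness, the sketch below builds a binomial opinion (belief, disbelief, uncertainty, base rate) from counts of positive and negative observations and projects it to a single probability. The evidence-to-opinion mapping with a prior weight of 2 is the standard one from the subjective logic literature; the class name, function name, and example counts are assumptions, and the paper's own trust computation may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    """Binomial subjective-logic opinion; belief + disbelief + uncertainty = 1."""
    belief: float
    disbelief: float
    uncertainty: float
    base_rate: float = 0.5

    def projected_probability(self):
        # Uncertainty mass is split according to the prior base rate.
        return self.belief + self.base_rate * self.uncertainty


def opinion_from_evidence(positive, negative, base_rate=0.5, prior_weight=2.0):
    """Map observation counts to an opinion (hypothetical helper).

    With no observations the opinion is fully uncertain; as evidence
    accumulates, uncertainty shrinks toward zero.
    """
    total = positive + negative + prior_weight
    return Opinion(
        belief=positive / total,
        disbelief=negative / total,
        uncertainty=prior_weight / total,
        base_rate=base_rate,
    )


# Hypothetical vehicle observed behaving correctly 8 times and never misbehaving.
vehicle_opinion = opinion_from_evidence(positive=8, negative=0)
trust_score = vehicle_opinion.projected_probability()

# A vehicle with no history falls back to the base rate.
unknown_score = opinion_from_evidence(0, 0).projected_probability()
```

Keeping uncertainty as a separate quantity lets a controller distinguish an agent with a mixed record from one that has simply not been observed yet, even when both project to a similar probability.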