Collaborating Authors: Guo, Yi


Deep Learning-Based Direct Leaf Area Estimation using Two RGBD Datasets for Model Development

arXiv.org Artificial Intelligence

Estimation of a single leaf's area can serve as a measure of crop growth and as a phenotypic trait for breeding new varieties. It has also been used to compute leaf area index and total leaf area. Some studies have used hand-held cameras, image processing, 3D reconstruction, and unsupervised learning-based methods to estimate leaf area in plant images. Deep learning works well for object detection and segmentation tasks; however, direct area estimation of objects has not been explored. This work investigates deep learning-based leaf area estimation for RGBD images taken with a mobile camera setup in real-world scenarios. A dataset of attached leaves captured from a top-angle view and a dataset of detached single leaves were collected for model development and testing. First, image processing-based area estimation was tested on manually segmented leaves. Then a Mask R-CNN-based model was investigated and modified to accept RGBD images and to estimate the leaf area. The detached-leaf dataset was then mixed with the attached-leaf plant dataset to estimate single leaf area in plant images, and another network design with two backbones was proposed: one for segmentation and the other for area estimation. Instead of trying all possibilities or random values, an agile approach was used for hyperparameter tuning. The final model was cross-validated with 5 folds and tested on two unseen datasets: detached and attached leaves. On the unseen detached-leaf data, the F1 score at 90% IoA for segmentation was 1.0, while the R-squared of area estimation was 0.81. For unseen plant data, the F1 score at 90% IoA for segmentation was 0.59, while the R-squared score was 0.57. The research suggests using attached leaves with ground-truth area to improve the results.
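
As a rough illustration of the modification described above, the sketch below (PyTorch/torchvision) widens a standard Mask R-CNN to accept 4-channel RGBD input and defines a separate regression head for per-leaf area. The layer sizes and wiring are assumptions for illustration, not the authors' released code.

# Sketch: Mask R-CNN adapted for RGBD input plus an area-regression head.
import torch
import torch.nn as nn
import torchvision

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights=None, num_classes=2)

# Accept RGBD: widen the first backbone conv from 3 to 4 input channels.
old_conv = model.backbone.body.conv1
model.backbone.body.conv1 = nn.Conv2d(4, old_conv.out_channels,
                                      kernel_size=old_conv.kernel_size,
                                      stride=old_conv.stride,
                                      padding=old_conv.padding, bias=False)

# Normalization stats for the 4th (depth) channel are placeholder values.
model.transform.image_mean = [0.485, 0.456, 0.406, 0.5]
model.transform.image_std = [0.229, 0.224, 0.225, 0.25]

# Hypothetical area head: regress leaf area from the 256x7x7 pooled ROI
# features that the default box head also consumes.
area_head = nn.Sequential(nn.Flatten(), nn.Linear(256 * 7 * 7, 1024),
                          nn.ReLU(), nn.Linear(1024, 1))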


EEG-ReMinD: Enhancing Neurodegenerative EEG Decoding through Self-Supervised State Reconstruction-Primed Riemannian Dynamics

arXiv.org Artificial Intelligence

The development of EEG decoding algorithms confronts challenges such as data sparsity, subject variability, and the need for precise annotations, all of which are vital for advancing brain-computer interfaces and enhancing the diagnosis of diseases. To address these issues, we propose a novel two-stage approach named Self-Supervised State Reconstruction-Primed Riemannian Dynamics (EEG-ReMinD), which mitigates reliance on supervised learning and integrates inherent geometric features. This approach efficiently handles EEG data corruptions and reduces the dependency on labels. EEG-ReMinD utilizes self-supervised and geometric learning techniques, along with an attention mechanism, to analyze the temporal dynamics of EEG features within the framework of Riemannian geometry, referred to as Riemannian dynamics. Comparative analyses on both intact and corrupted datasets from two different neurodegenerative disorders underscore the enhanced performance of EEG-ReMinD.
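
For readers unfamiliar with the Riemannian ingredient, the sketch below shows one standard way to place EEG windows on the SPD manifold: compute a regularized channel covariance and map it to the tangent space with the matrix logarithm. This illustrates the geometry only, not the full two-stage EEG-ReMinD pipeline.

# Sketch: log-Euclidean tangent-space features from an EEG window.
import numpy as np
from scipy.linalg import logm

def spd_log_map(eeg_window):
    """eeg_window: (channels, samples) array -> vectorized tangent features."""
    cov = np.cov(eeg_window) + 1e-6 * np.eye(eeg_window.shape[0])  # keep SPD
    log_cov = logm(cov).real           # matrix logarithm onto the tangent space
    iu = np.triu_indices_from(log_cov)
    return log_cov[iu]                 # upper triangle as a feature vector

features = spd_log_map(np.random.randn(32, 512))  # 32-channel, 512-sample window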


Cycle-Consistent Bridge Diffusion Model for Accelerated MRI Reconstruction

arXiv.org Artificial Intelligence

Accelerated MRI reconstruction techniques aim to reduce examination time while maintaining high image fidelity, which is highly desirable in clinical settings for improving patient comfort and hospital efficiency. Existing deep learning methods typically reconstruct images from under-sampled data following traditional reconstruction approaches, but they still struggle to provide high-fidelity results. Diffusion models have shown great potential for improving the fidelity of generated images in recent years. However, their inference process, which starts from random Gaussian noise, introduces instability into the results and usually requires thousands of sampling steps, resulting in sub-optimal reconstruction quality and low efficiency. To address these challenges, we propose the Cycle-Consistent Bridge Diffusion Model (CBDM). CBDM employs two bridge diffusion models to construct a cycle-consistent diffusion process with a consistency loss, enhancing the fine-grained details of reconstructed images and reducing the number of diffusion steps. Moreover, CBDM incorporates a Contourlet Decomposition Embedding Module (CDEM), which captures multi-scale structural texture knowledge in images through frequency-domain decomposition pyramids and directional filter banks to improve structural fidelity. Extensive experiments demonstrate the superiority of our model through higher reconstruction quality and fewer training iterations, achieving a new state of the art for accelerated MRI reconstruction on both the fastMRI and IXI datasets.
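
A minimal sketch of the cycle-consistency idea, assuming two hypothetical bridge denoisers (forward_bridge and reverse_bridge) that map between the under-sampled and fully-sampled image domains; the actual CBDM objective and sampling schedule are more involved than this.

# Sketch: cycle-consistency loss between two bridge diffusion directions.
import torch
import torch.nn.functional as F

def cycle_consistency_loss(forward_bridge, reverse_bridge, undersampled, t):
    recon = forward_bridge(undersampled, t)   # under-sampled -> fully-sampled
    cycled = reverse_bridge(recon, t)         # fully-sampled -> under-sampled
    return F.l1_loss(cycled, undersampled)    # penalize round-trip drift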


Defensive Dual Masking for Robust Adversarial Defense

arXiv.org Artificial Intelligence

The field of textual adversarial defenses has gained considerable attention in recent years due to the increasing vulnerability of natural language processing (NLP) models to adversarial attacks, which exploit subtle perturbations in input text to deceive models. This paper introduces the Defensive Dual Masking (DDM) algorithm, a novel approach designed to enhance model robustness against such attacks. DDM utilizes a unique adversarial training strategy in which [MASK] tokens are strategically inserted into training samples to prepare the model to handle adversarial perturbations more effectively. During inference, potentially adversarial tokens are dynamically replaced with [MASK] tokens to neutralize threats while preserving the core semantics of the input. The theoretical foundation of our approach is explored, demonstrating how the selective masking mechanism strengthens the model's ability to identify and mitigate adversarial manipulations. Our empirical evaluation across a diverse set of benchmark datasets and attack mechanisms consistently shows that DDM outperforms state-of-the-art defense techniques, improving model accuracy and robustness. Moreover, when applied to Large Language Models (LLMs), DDM also enhances their resilience to adversarial attacks, providing a scalable defense mechanism for large-scale NLP applications.
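
The sketch below illustrates only the inference-time step described above: replacing tokens flagged as suspicious with [MASK] before classification. How suspicious tokens are identified is left abstract here; the paper's actual token-selection rule is not reproduced.

# Sketch: inference-time masking of flagged tokens (HuggingFace tokenizer).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def mask_suspicious(text, suspicious_indices):
    tokens = tokenizer.tokenize(text)
    for i in suspicious_indices:
        if 0 <= i < len(tokens):
            tokens[i] = tokenizer.mask_token   # "[MASK]"
    return tokenizer.convert_tokens_to_string(tokens)

# e.g. a detector has flagged the misspelled token as a likely perturbation
print(mask_suspicious("the film was astoundingly terrib1e", [3]))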


Cross Group Attention and Group-wise Rolling for Multimodal Medical Image Synthesis

arXiv.org Artificial Intelligence

Multimodal MR image synthesis aims to generate a missing modality image by fusing and mapping the available MRI modalities. Most existing approaches adopt an image-to-image translation scheme. However, these methods often suffer from sub-optimal performance because the different modalities are treated as stacked input channels despite being spatially misaligned. Therefore, in this paper, we propose an Adaptive Group-wise Interaction Network (AGI-Net) that explores both inter-modality and intra-modality relationships for multimodal MR image synthesis. Specifically, groups are first pre-defined along the channel dimension, and an adaptive rolling of the standard convolution kernel is then performed to capture inter-modality spatial correspondences. At the same time, a cross-group attention module is introduced to fuse information across different channel groups, leading to better feature representation. We evaluated the effectiveness of our model on the publicly available IXI and BraTS2023 datasets, where AGI-Net achieved state-of-the-art performance for multimodal MR image synthesis.
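
A minimal sketch of the group-wise rolling idea: each channel group is spatially shifted by its own offset before a shared convolution, approximating inter-modality alignment. The offsets here are fixed integers for illustration, whereas AGI-Net predicts them adaptively.

# Sketch: per-group spatial rolling followed by a shared convolution.
import torch
import torch.nn as nn

class GroupwiseRoll(nn.Module):
    def __init__(self, channels, groups, shifts):
        super().__init__()
        self.groups = groups
        self.shifts = shifts                       # one (dy, dx) per group
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x):
        chunks = torch.chunk(x, self.groups, dim=1)
        rolled = [torch.roll(c, shifts=s, dims=(2, 3))
                  for c, s in zip(chunks, self.shifts)]
        return self.conv(torch.cat(rolled, dim=1))

y = GroupwiseRoll(8, 2, [(0, 0), (1, -1)])(torch.randn(1, 8, 32, 32))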


Diff-CXR: Report-to-CXR generation through a disease-knowledge enhanced diffusion model

arXiv.org Artificial Intelligence

Text-To-Image (TTI) generation is significant for controlled and diverse image generation with broad potential applications. Although current medical TTI methods have made some progress in report-to-Chest-Xray (CXR) generation, their generation performance may be limited due to the intrinsic characteristics of medical data. In this paper, we propose a novel disease-knowledge enhanced Diffusion-based TTI learning framework, named Diff-CXR, for medical report-to-CXR generation. First, to minimize the negative impacts of noisy data on generation, we devise a Latent Noise Filtering Strategy that gradually learns the general patterns of anomalies and removes them in the latent space. Then, an Adaptive Vision-Aware Textual Learning Strategy is designed to learn concise and important report embeddings in a domain-specific Vision-Language Model, providing textual guidance for Chest-Xray generation. Finally, by incorporating the general disease knowledge into the pretrained TTI model via a delicate control adapter, a disease-knowledge enhanced diffusion model is introduced to achieve realistic and precise report-to-CXR generation. Experimentally, our Diff-CXR outperforms previous SOTA medical TTI methods by 33.4% / 8.0% and 23.8% / 56.4% in the FID and mAUC score on MIMIC-CXR and IU-Xray, with the lowest computational complexity at 29.641 GFLOPs. Downstream experiments on three thorax disease classification benchmarks and one CXR-report generation benchmark demonstrate that Diff-CXR is effective in improving classical CXR analysis methods. Notably, models trained on the combination of 1% real data and synthetic data can achieve a competitive mAUC score compared to models trained on all data, presenting promising clinical applications.
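
As a hedged sketch of the control-adapter pattern mentioned above: a small trainable module injects disease-knowledge embeddings into a frozen pretrained diffusion U-Net as a zero-initialized residual. The module names and shapes are illustrative assumptions; Diff-CXR's actual adapter design is described in the paper, not reproduced here.

# Sketch: zero-initialized residual adapter for disease-knowledge injection.
import torch
import torch.nn as nn

class DiseaseControlAdapter(nn.Module):
    def __init__(self, knowledge_dim, hidden_dim):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(knowledge_dim, hidden_dim),
                                  nn.SiLU(), nn.Linear(hidden_dim, hidden_dim))
        nn.init.zeros_(self.proj[-1].weight)   # zero-init: no effect at start
        nn.init.zeros_(self.proj[-1].bias)

    def forward(self, unet_hidden, disease_embedding):
        # Added residually, so the frozen U-Net is unchanged at initialization
        # and the adapter learns only the disease-conditioned correction.
        return unet_hidden + self.proj(disease_embedding).unsqueeze(1)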


When Graph Neural Networks Meet Dynamic Mode Decomposition

arXiv.org Artificial Intelligence

Graph Neural Networks (GNNs) have emerged as fundamental tools for a wide range of prediction tasks on graph-structured data. Recent studies have drawn analogies between GNN feature propagation and diffusion processes, which can be interpreted as dynamical systems. In this paper, we delve deeper into this perspective by connecting the dynamics in GNNs to modern Koopman theory and its numerical method, Dynamic Mode Decomposition (DMD). We illustrate how DMD can estimate a low-rank, finite-dimensional linear operator based on multiple states of the system, effectively approximating potential nonlinear interactions between nodes in the graph. This approach allows us to capture complex dynamics within the graph accurately and efficiently. We theoretically establish a connection between the DMD-estimated operator and the original dynamic operator between system states. Building upon this foundation, we introduce a family of DMD-GNN models that effectively leverage the low-rank eigenfunctions provided by the DMD algorithm. We further discuss the potential of enhancing our approach by incorporating domain-specific constraints such as symmetry into the DMD computation, allowing the corresponding GNN models to respect known physical properties of the underlying system. Our work paves the way for applying advanced dynamical system analysis tools via GNNs. We validate our approach through extensive experiments on various learning tasks, including directed graphs, large-scale graphs, long-range interactions, and spatial-temporal graphs. We also empirically verify that our proposed models can serve as powerful encoders for link prediction tasks. The results demonstrate that our DMD-enhanced GNNs achieve state-of-the-art performance, highlighting the effectiveness of integrating DMD into GNN frameworks.
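
The operator estimation referenced above is standard exact DMD, sketched below: given snapshot matrices X = [x_1 .. x_{m-1}] and Y = [x_2 .. x_m] of system states, a rank-r linear operator with Y ≈ A X is recovered via truncated SVD. This is textbook DMD only, not the full DMD-GNN model.

# Sketch: exact DMD estimation of a low-rank linear dynamics operator.
import numpy as np

def dmd_operator(X, Y, rank):
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank]
    A_tilde = U.conj().T @ Y @ Vh.conj().T @ np.diag(1.0 / s)  # r x r operator
    return A_tilde, U   # dynamics expressed in the reduced basis U

states = np.random.randn(64, 20)             # 64-dim node states, 20 time steps
A_tilde, U = dmd_operator(states[:, :-1], states[:, 1:], rank=8)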


EEG-MACS: Manifold Attention and Confidence Stratification for EEG-based Cross-Center Brain Disease Diagnosis under Unreliable Annotations

arXiv.org Artificial Intelligence

Cross-center data heterogeneity and annotation unreliability significantly challenge the intelligent diagnosis of diseases using brain signals. A notable example is the EEG-based diagnosis of neurodegenerative diseases, which features subtler abnormal neural dynamics typically observed in small-group settings. To advance this area, we introduce a transferable framework employing Manifold Attention and Confidence Stratification (MACS) to diagnose neurodegenerative disorders based on EEG signals sourced from four centers with unreliable annotations. The MACS framework's effectiveness stems from these features: 1) The Augmentor generates various EEG-represented brain variants to enrich the data space; 2) The Switcher enhances the feature space for trusted samples and reduces overfitting on incorrectly labeled samples; 3) The Encoder uses the Riemannian manifold and Euclidean metrics to capture spatiotemporal variations and dynamic synchronization in EEG; 4) The Projector, equipped with dual heads, monitors consistency across multiple brain variants and ensures diagnostic accuracy; 5) The Stratifier adaptively stratifies learned samples by confidence levels throughout the training process; 6) Forward and backpropagation in MACS are constrained by confidence stratification to stabilize the learning system amid unreliable annotations. Our subject-independent experiments, conducted on both neurocognitive and movement disorders using cross-center corpora, have demonstrated superior performance compared to existing related algorithms. This work not only improves EEG-based diagnostics for cross-center and small-setting brain diseases but also offers insights into extending MACS techniques to other data analyses, tackling data heterogeneity and annotation unreliability in multimedia and multimodal content understanding.
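
A hedged sketch of the confidence-stratification ingredient only: partition training samples by how consistent the model's loss is with their (possibly unreliable) labels, then treat the strata differently during training. The agreement statistic and thresholds below are illustrative assumptions, not the Stratifier's actual rule.

# Sketch: stratifying samples by label confidence from per-sample losses.
import torch

def stratify_by_confidence(per_sample_losses, low=0.3, high=0.7):
    # Normalize losses to [0, 1]; a low loss suggests a trustworthy label.
    norm = (per_sample_losses - per_sample_losses.min()) / \
           (per_sample_losses.max() - per_sample_losses.min() + 1e-8)
    confidence = 1.0 - norm
    trusted = confidence >= high                 # full-weight supervision
    uncertain = (confidence > low) & ~trusted    # down-weighted samples
    return trusted, uncertain                    # the rest are excluded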


decoupleQ: Towards 2-bit Post-Training Uniform Quantization via decoupling Parameters into Integer and Floating Points

arXiv.org Artificial Intelligence

Quantization has emerged as one of the most promising compression technologies for deploying efficient large models in various real-time applications in recent years. Considering that the storage and IO of weights account for the vast majority of the overhead inside a large model, weight-only quantization can lead to large gains. However, existing quantization schemes suffer from significant accuracy degradation at very low bit-widths, or require additional computational overhead when deployed, making them difficult to apply to large-scale applications in industry. In this paper, we propose decoupleQ, achieving a substantial increase in model accuracy, especially at very low bit-widths. decoupleQ abandons the traditional heuristic quantization paradigm and decouples the model parameters into integer and floating-point parts, thus transforming the quantization problem into a traditional constrained mathematical optimization problem, which is then solved alternately by off-the-shelf optimization methods. Quantization via decoupleQ is linear and uniform, making it more hardware-friendly than non-uniform counterparts, and enabling the idea to be migrated to high-bit quantization to enhance its robustness. Our method achieves online accuracy near fp16/bf16 on the 2-bit quantization of large speech models at ByteDance. The code is available at https://github.com/bytedance/decoupleQ
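
To make the decoupling concrete, the toy sketch below approximates a weight vector as w ≈ s * Q + z, with a 2-bit integer part Q (values in {-2, -1, 0, 1}) and floating-point scale s and offset z, alternating between rounding Q and solving s, z by least squares. This is a simplified per-channel illustration under those assumptions, not ByteDance's released solver.

# Sketch: alternating optimization of integer and floating-point parts.
import numpy as np

def quantize_channel(w, iters=5, qmin=-2, qmax=1):
    s, z = (w.max() - w.min()) / (qmax - qmin), w.mean()
    for _ in range(iters):
        Q = np.clip(np.round((w - z) / s), qmin, qmax)   # fix s, z; round Q
        A = np.stack([Q, np.ones_like(Q)], axis=1)       # fix Q; solve [s, z]
        (s, z), *_ = np.linalg.lstsq(A, w, rcond=None)
    return Q, s, z

Q, s, z = quantize_channel(np.random.randn(256))
print(np.abs(s * Q + z - np.random.randn(0)).size)  # Q, s, z reconstruct w approximately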


Narrative Feature or Structured Feature? A Study of Large Language Models to Identify Cancer Patients at Risk of Heart Failure

arXiv.org Artificial Intelligence

Cancer treatments are known to introduce cardiotoxicity, negatively impacting outcomes and survivorship. Identifying cancer patients at risk of heart failure (HF) is therefore critical to improving cancer treatment outcomes and safety. This study examined machine learning (ML) models for identifying cancer patients at risk of HF using electronic health records (EHRs), including traditional ML, Time-Aware Long Short-Term Memory (T-LSTM), and large language models (LLMs) using novel narrative features derived from structured medical codes. We identified a cohort of 12,806 patients from University of Florida Health diagnosed with lung, breast, or colorectal cancer, among whom 1,602 individuals developed HF after cancer. The LLM GatorTron-3.9B achieved the best F1 scores, outperforming traditional support vector machines by 39%, the T-LSTM deep learning model by 7%, and a widely used transformer model, BERT, by 5.6%. The analysis shows that the proposed narrative features remarkably increased feature density and improved performance.
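
An illustrative sketch of what "narrative features derived from structured medical codes" could look like: coded EHR events serialized into a natural-language note an LLM can consume. The code-to-text lookup and template below are hypothetical examples, not the study's actual feature pipeline.

# Sketch: serializing structured diagnosis codes into a narrative string.
ICD10_DESCRIPTIONS = {           # tiny illustrative lookup table
    "C34.90": "malignant neoplasm of lung",
    "I10": "essential hypertension",
    "E11.9": "type 2 diabetes mellitus",
}

def codes_to_narrative(patient_codes):
    phrases = [ICD10_DESCRIPTIONS.get(c, c) for c in patient_codes]
    return "Patient has a history of " + "; ".join(phrases) + "."

print(codes_to_narrative(["C34.90", "I10", "E11.9"]))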