Dai, Haixing
Exploring Multimodal Approaches for Alzheimer's Disease Detection Using Patient Speech Transcript and Audio Data
Cai, Hongmin, Huang, Xiaoke, Liu, Zhengliang, Liao, Wenxiong, Dai, Haixing, Wu, Zihao, Zhu, Dajiang, Ren, Hui, Li, Quanzheng, Liu, Tianming, Li, Xiang
Alzheimer's disease (AD) is a common form of dementia that severely impacts patient health. Because AD impairs the patient's language understanding and expression ability, the speech of AD patients can serve as an indicator of the disease. This study investigates methods for detecting AD using patients' speech and transcript data from the DementiaBank Pitt database. The proposed approach combines pre-trained language models with a Graph Neural Network (GNN): a graph is constructed from the speech transcript, and the GNN extracts features from it for AD detection. Data augmentation techniques, including synonym replacement and a GPT-based augmenter, were used to address the small dataset size. Audio data was also introduced, and the WavLM model was used to extract audio features, which were then fused with the text features using various methods. Finally, a contrastive learning approach was attempted by converting speech transcripts back to audio and contrasting the synthesized audio with the original recordings. We conducted extensive experiments and analysis of the above methods. Our findings shed light on the challenges and potential solutions in AD detection using speech and audio data.
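The fusion of text and audio features mentioned above can be illustrated with a minimal sketch. The feature values and dimensionalities below are hypothetical; in the paper, text features come from the GNN over the transcript and audio features from WavLM.

```python
# Two simple fusion strategies for combining a text feature vector with
# an audio feature vector. Values here are invented for illustration.

def fuse_concat(text_feat, audio_feat):
    """Early fusion: concatenate the two feature vectors."""
    return text_feat + audio_feat

def fuse_average(text_feat, audio_feat):
    """Element-wise averaging (requires equal dimensionality)."""
    assert len(text_feat) == len(audio_feat)
    return [(t + a) / 2 for t, a in zip(text_feat, audio_feat)]

text_feat = [0.2, 0.8, 0.5]   # e.g. pooled GNN node embeddings (hypothetical)
audio_feat = [0.1, 0.4, 0.9]  # e.g. pooled WavLM hidden states (hypothetical)

fused = fuse_concat(text_feat, audio_feat)   # 6-dimensional
avg = fuse_average(text_feat, audio_feat)    # 3-dimensional
```

Either fused vector would then feed a downstream classifier; the paper compares several such strategies rather than committing to one.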
Segment Anything Model (SAM) for Radiation Oncology
Zhang, Lian, Liu, Zhengliang, Zhang, Lu, Wu, Zihao, Yu, Xiaowei, Holmes, Jason, Feng, Hongying, Dai, Haixing, Li, Xiang, Li, Quanzheng, Zhu, Dajiang, Liu, Tianming, Liu, Wei
In this study, we evaluate the performance of the Segment Anything Model (SAM) in clinical radiotherapy. Our results indicate that SAM's 'segment anything' mode can achieve clinically acceptable segmentation results for most organs-at-risk (OARs), with Dice scores higher than 0.7. SAM's 'box prompt' mode further improves the Dice scores by 0.1 to 0.5. With respect to organ size and boundary clarity, SAM performs better for large organs with clear boundaries and worse for smaller organs with unclear boundaries. Given that SAM, a model pre-trained purely on natural images, can delineate OARs from medical images with clinically acceptable accuracy, these results highlight SAM's robust generalization capabilities, delivering consistent accuracy in automatic segmentation for radiotherapy. In other words, SAM can delineate different OARs at different sites using a single generic automatic segmentation model. SAM's generalization across different disease sites suggests that it is technically feasible to develop a generic model for automatic segmentation in radiotherapy.
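The Dice scores reported above compare a predicted mask against a reference contour; a minimal sketch of the metric, with masks shown as flat binary lists (in practice they would be 2D or 3D arrays):

```python
# Dice similarity coefficient: 2*|A ∩ B| / (|A| + |B|) for binary masks.

def dice_score(pred, ref):
    """Dice overlap between two binary masks of equal length."""
    intersection = sum(p and r for p, r in zip(pred, ref))
    total = sum(pred) + sum(ref)
    return 2 * intersection / total if total else 1.0

pred = [1, 1, 1, 0, 0, 1]  # predicted mask (hypothetical)
ref  = [1, 1, 0, 0, 1, 1]  # reference contour (hypothetical)
score = dice_score(pred, ref)  # 0.75: 3 overlapping voxels, 4 + 4 total
```

A score of 1.0 means perfect overlap; the study's 0.7 threshold corresponds to the commonly used bar for clinically acceptable OAR segmentation.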
Identification of Causal Relationship between Amyloid-beta Accumulation and Alzheimer's Disease Progression via Counterfactual Inference
Dai, Haixing, Hu, Mengxuan, Li, Qing, Zhang, Lu, Zhao, Lin, Zhu, Dajiang, Diez, Ibai, Sepulcre, Jorge, Zhang, Fan, Gao, Xingyu, Liu, Manhua, Li, Quanzheng, Li, Sheng, Liu, Tianming, Li, Xiang
Alzheimer's disease (AD) is a neurodegenerative disorder that begins with amyloidosis, followed by neuronal loss and deterioration in structure, function, and cognition. The accumulation of amyloid-beta in the brain, measured through 18F-florbetapir (AV45) positron emission tomography (PET) imaging, has been widely used for early diagnosis of AD. However, the relationship between amyloid-beta accumulation and AD pathophysiology remains unclear, and causal inference approaches are needed to uncover how amyloid-beta levels impact AD development. In this paper, we propose a graph varying coefficient neural network (GVCNet) for estimating the individual treatment effect with continuous treatment levels using a graph convolutional neural network. We highlight the potential of causal inference approaches, including GVCNet, for measuring the regional causal connections between amyloid-beta accumulation and AD pathophysiology, which may serve as a robust tool for early diagnosis and tailored care.
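The notion of an individual treatment effect with a continuous treatment can be illustrated with a toy example (this is not the paper's GVCNet): given an estimated dose-response function mu(x, t), the effect of raising amyloid-beta burden from t0 to t1 for a subject with covariates x is mu(x, t1) - mu(x, t0). The linear form of mu below is purely hypothetical.

```python
# Toy varying-coefficient outcome model: the treatment's slope depends
# on a subject covariate x, so the effect of the same dose change
# differs across individuals. All numbers are invented.

def mu(x, t):
    """Hypothetical outcome under treatment level t for covariate x."""
    slope = 0.5 + 0.1 * x  # coefficient varies with the covariate
    return slope * t

def individual_effect(x, t0, t1):
    """Counterfactual contrast: outcome at t1 minus outcome at t0."""
    return mu(x, t1) - mu(x, t0)

effect = individual_effect(x=2.0, t0=0.0, t1=1.0)  # slope 0.7 → effect 0.7
```

GVCNet replaces the hand-written slope with a graph convolutional network over brain regions, but the counterfactual contrast it estimates has this same shape.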
Review of Large Vision Models and Visual Prompt Engineering
Wang, Jiaqi, Liu, Zhengliang, Zhao, Lin, Wu, Zihao, Ma, Chong, Yu, Sigang, Dai, Haixing, Yang, Qiushi, Liu, Yiheng, Zhang, Songyao, Shi, Enze, Pan, Yi, Zhang, Tuo, Zhu, Dajiang, Li, Xiang, Jiang, Xi, Ge, Bao, Yuan, Yixuan, Shen, Dinggang, Liu, Tianming, Zhang, Shu
Visual prompt engineering is a fundamental technology in the field of visual Artificial General Intelligence, serving as a key component for achieving zero-shot capabilities. As large vision models develop, the importance of prompt engineering becomes increasingly evident, and designing suitable prompts for specific visual tasks has emerged as a meaningful research direction. This review summarizes the methods employed in the computer vision domain for large vision models and visual prompt engineering, covering the latest advancements in the area. We present influential large models in the visual domain and a range of prompt engineering methods employed on these models. It is our hope that this review provides a comprehensive and systematic description of prompt engineering methods based on large visual models, offering valuable insights for future researchers in their exploration of this field.
Exploring New Frontiers in Agricultural NLP: Investigating the Potential of Large Language Models for Food Applications
Rezayi, Saed, Liu, Zhengliang, Wu, Zihao, Dhakal, Chandra, Ge, Bao, Dai, Haixing, Mai, Gengchen, Liu, Ninghao, Zhen, Chen, Liu, Tianming, Li, Sheng
This paper explores new frontiers in agricultural natural language processing by investigating the effectiveness of using food-related text corpora for pretraining transformer-based language models. In particular, we focus on the task of semantic matching, which involves establishing mappings between food descriptions and nutrition data. To accomplish this, we fine-tune a pre-trained transformer-based language model, AgriBERT, on this task, utilizing an external source of knowledge such as the FoodOn ontology. To advance the field of agricultural NLP, we propose two new avenues of exploration: (1) utilizing GPT-based models as a baseline and (2) leveraging ChatGPT as an external source of knowledge. ChatGPT has been shown to be a strong baseline in many NLP tasks, and we believe it has the potential to improve our model on the task of semantic matching and to enhance our model's understanding of food-related concepts and relationships. Additionally, we experiment with other applications, such as cuisine prediction based on food ingredients, and expand the scope of our research to include NLP tasks beyond semantic matching. Overall, this paper provides promising avenues for future research in this field, with potential implications for improving the performance of agricultural NLP applications.
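The semantic-matching setup can be sketched as nearest-neighbor retrieval over embeddings: given a vector for a food description and vectors for candidate nutrition-database entries, pick the candidate with the highest cosine similarity. The vectors and entry names below are invented; in the paper, a fine-tuned AgriBERT would supply the embeddings.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def best_match(query, candidates):
    """Return the candidate key whose embedding is most similar to the query."""
    return max(candidates, key=lambda k: cosine(query, candidates[k]))

query = [0.9, 0.1, 0.3]  # embedding of a food description (hypothetical)
candidates = {
    "apple, raw": [0.8, 0.2, 0.3],
    "bread, whole wheat": [0.1, 0.9, 0.5],
}
match = best_match(query, candidates)  # "apple, raw"
```

External knowledge such as FoodOn can refine either side of this matching, e.g. by enriching the candidate descriptions before embedding them.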
AD-AutoGPT: An Autonomous GPT for Alzheimer's Disease Infodemiology
Dai, Haixing, Li, Yiwei, Liu, Zhengliang, Zhao, Lin, Wu, Zihao, Song, Suhang, Shen, Ye, Zhu, Dajiang, Li, Xiang, Li, Sheng, Yao, Xiaobai, Shi, Lu, Li, Quanzheng, Chen, Zhuo, Zhang, Donglan, Mai, Gengchen, Liu, Tianming
Alzheimer's disease (AD), characterized by cognitive impairments such as memory loss, predominantly affects aging populations, exerting an escalating burden on global healthcare systems as societies continue to age [3]. The significance of AD is further magnified by increasing life expectancy worldwide, with the disease now recognized as a leading cause of disability and dependency among older people [4]. Consequently, AD has substantial social, economic, and health-system implications, making its understanding and awareness of paramount importance [5, 6]. Despite the ubiquity and severity of AD, a gap persists in comprehensive, data-driven public understanding of this complex health narrative. Traditionally, public health professionals have had to rely on labor-intensive methods such as web scraping, API data collection, data postprocessing, and analysis/synthesis to gather insights from news media, health reports, and other textual sources [7, 8, 9].
Radiology-GPT: A Large Language Model for Radiology
Liu, Zhengliang, Zhong, Aoxiao, Li, Yiwei, Yang, Longtao, Ju, Chao, Wu, Zihao, Ma, Chong, Shu, Peng, Chen, Cheng, Kim, Sekeun, Dai, Haixing, Zhao, Lin, Zhu, Dajiang, Liu, Jun, Liu, Wei, Shen, Dinggang, Li, Xiang, Li, Quanzheng, Liu, Tianming
We introduce Radiology-GPT, a large language model for radiology. Using an instruction tuning approach on an extensive dataset of radiology domain knowledge, Radiology-GPT demonstrates superior performance compared to general language models such as StableLM, Dolly, and LLaMA. It exhibits significant versatility in radiological diagnosis, research, and communication. This work serves as a catalyst for future developments in clinical NLP. The successful implementation of Radiology-GPT demonstrates the potential of localized generative large language models tailored to distinct medical specialties, while ensuring adherence to privacy standards such as HIPAA. The prospect of developing individualized, large-scale language models that cater to the specific needs of various hospitals presents a promising direction. The fusion of conversational competence and domain-specific knowledge in these models is set to foster future development in healthcare AI. A demo of Radiology-GPT is available at https://huggingface.co/spaces/allen-eric/radiology-gpt.
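Instruction tuning of this kind is typically driven by records in an instruction/input/output layout (the common Alpaca-style format); the fields and example content below are assumptions for illustration, not taken from the Radiology-GPT training set.

```python
import json

# One hypothetical instruction-tuning record: the model is taught to map
# an instruction plus report findings to the corresponding impression.
record = {
    "instruction": "Derive the impression from the findings in this radiology report.",
    "input": "Findings: No focal consolidation. Heart size is normal.",
    "output": "Impression: No acute cardiopulmonary abnormality.",
}

# Records are usually serialized one per line (JSONL) for training.
serialized = json.dumps(record)
restored = json.loads(serialized)
```

Keeping instructions, inputs, and target outputs in separate fields lets the same base model be steered toward distinct specialty behaviors without architectural changes.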
SAM for Poultry Science
Yang, Xiao, Dai, Haixing, Wu, Zihao, Bist, Ramesh, Subedi, Sachin, Sun, Jin, Lu, Guoyu, Li, Changying, Liu, Tianming, Chai, Lilong
In recent years, the agricultural industry has witnessed significant advancements in artificial intelligence (AI), particularly with the development of large-scale foundation models. Among these, the Segment Anything Model (SAM), introduced by Meta AI Research, stands out as a groundbreaking solution for object segmentation tasks. While SAM has shown success in various agricultural applications, its potential in the poultry industry, specifically in the context of cage-free hens, remains relatively unexplored. This study aims to assess the zero-shot segmentation performance of SAM on representative chicken segmentation tasks, including part-based segmentation and the use of infrared thermal images, and to explore chicken-tracking tasks by using SAM as a segmentation tool. The results demonstrate SAM's superior performance compared to SegFormer and SETR in both whole and part-based chicken segmentation. SAM-based object tracking also provides valuable data on the behavior and movement patterns of broiler birds. The findings of this study contribute to a better understanding of SAM's potential in poultry science and lay the foundation for future advancements in chicken segmentation and tracking.
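Segmentation-based tracking of the kind described above can be sketched as frame-to-frame association by overlap: each detection in the current frame is assigned the previous-frame track with the greatest intersection over union (IoU). Bounding boxes stand in for SAM masks here, and the coordinates are invented.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def match_tracks(prev, curr):
    """Greedily assign each current detection the previous track of highest IoU."""
    return {i: max(range(len(prev)), key=lambda j: iou(c, prev[j]))
            for i, c in enumerate(curr)}

prev = [(0, 0, 10, 10), (20, 20, 30, 30)]  # frame t-1 (two birds)
curr = [(21, 21, 31, 31), (1, 1, 11, 11)]  # frame t (birds moved slightly)
assignment = match_tracks(prev, curr)       # {0: 1, 1: 0}
```

Chaining such assignments across frames yields the per-bird trajectories from which movement patterns can be summarized.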
Prompt Engineering for Healthcare: Methodologies and Applications
Wang, Jiaqi, Shi, Enze, Yu, Sigang, Wu, Zihao, Ma, Chong, Dai, Haixing, Yang, Qiushi, Kang, Yanqing, Wu, Jinru, Hu, Huawen, Yue, Chenxi, Zhang, Haiyang, Liu, Yiheng, Li, Xiang, Ge, Bao, Zhu, Dajiang, Yuan, Yixuan, Shen, Dinggang, Liu, Tianming, Zhang, Shu
This review introduces the latest advances in prompt engineering in the field of natural language processing (NLP) for the medical domain. First, we provide a brief overview of the development of prompt engineering and emphasize its significant contributions to healthcare NLP applications such as question-answering systems, text summarization, and machine translation. As general large language models continue to improve, the importance of prompt engineering in the healthcare domain is becoming increasingly prominent. The aim of this article is to provide useful resources and a bridge for healthcare NLP researchers to better explore the application of prompt engineering in this field. We hope that this review can provide new ideas and inspire ample possibilities for research and application in medical NLP.
Differentiate ChatGPT-generated and Human-written Medical Texts
Liao, Wenxiong, Liu, Zhengliang, Dai, Haixing, Xu, Shaochen, Wu, Zihao, Zhang, Yiyang, Huang, Xiaoke, Zhu, Dajiang, Cai, Hongmin, Liu, Tianming, Li, Xiang
Background: Large language models such as ChatGPT are capable of generating grammatically perfect and human-like text content, and a large number of ChatGPT-generated texts have appeared on the Internet. However, medical texts such as clinical notes and diagnoses require rigorous validation, and erroneous medical content generated by ChatGPT could lead to disinformation that poses significant harm to healthcare and the general public. Objective: This research is among the first studies on responsible and ethical AIGC (Artificial Intelligence Generated Content) in medicine. We focus on analyzing the differences between medical texts written by human experts and those generated by ChatGPT, and on designing machine learning workflows to effectively detect and differentiate medical texts generated by ChatGPT. Methods: We first construct a suite of datasets containing medical texts written by human experts and generated by ChatGPT. Next, we analyze the linguistic features of these two types of content and uncover differences in vocabulary, part-of-speech, dependency, sentiment, perplexity, and other properties. Finally, we design and implement machine learning methods to detect medical text generated by ChatGPT. Results: Medical texts written by humans are more concrete, more diverse, and typically contain more useful information, while medical texts generated by ChatGPT emphasize fluency and logic and usually express general terminology rather than information specific to the context of the problem. A BERT-based model can effectively detect medical texts generated by ChatGPT, achieving an F1 score above 95%.
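One of the vocabulary-diversity contrasts reported above can be illustrated with the type-token ratio, a simple lexical feature. The two sample sentences below are invented for illustration; the study's actual feature set is broader (part-of-speech, dependency, sentiment, perplexity, and more).

```python
def type_token_ratio(text):
    """Distinct words divided by total words (case-insensitive, whitespace-split)."""
    tokens = text.lower().split()
    return len(set(tokens)) / len(tokens) if tokens else 0.0

# Hypothetical examples: human clinical prose tends toward varied,
# specific wording, while generated text repeats generic phrasing.
human = "patient reports intermittent chest pain radiating to left arm"
generated = "the patient has pain and the pain is in the chest"

assert type_token_ratio(human) > type_token_ratio(generated)
```

Features like this, alongside deeper linguistic signals, form the inputs that let a BERT-based classifier separate the two text sources.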