Zhao, Lin
Identification of Causal Relationship between Amyloid-beta Accumulation and Alzheimer's Disease Progression via Counterfactual Inference
Dai, Haixing, Hu, Mengxuan, Li, Qing, Zhang, Lu, Zhao, Lin, Zhu, Dajiang, Diez, Ibai, Sepulcre, Jorge, Zhang, Fan, Gao, Xingyu, Liu, Manhua, Li, Quanzheng, Li, Sheng, Liu, Tianming, Li, Xiang
Alzheimer's disease (AD) is a neurodegenerative disorder that begins with amyloidosis, followed by neuronal loss and deterioration in structure, function, and cognition. The accumulation of amyloid-beta in the brain, measured through 18F-florbetapir (AV45) positron emission tomography (PET) imaging, has been widely used for early diagnosis of AD. However, the relationship between amyloid-beta accumulation and AD pathophysiology remains unclear, and causal inference approaches are needed to uncover how amyloid-beta levels impact AD development. In this paper, we propose a graph varying coefficient neural network (GVCNet), which uses a graph convolutional neural network to estimate individual treatment effects under continuous treatment levels. We highlight the potential of causal inference approaches, including GVCNet, for measuring regional causal connections between amyloid-beta accumulation and AD pathophysiology, which may serve as a robust tool for early diagnosis and tailored care.
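To make the varying-coefficient idea concrete, below is a minimal PyTorch sketch of a graph network whose outcome head has weights that vary smoothly with a continuous treatment level. It only illustrates the general technique the abstract names; the layer sizes, the polynomial treatment basis, and the region pooling are assumptions, not the authors' GVCNet implementation.

```python
# Minimal sketch of a varying-coefficient graph network for continuous
# treatments. NOT the authors' GVCNet: layer sizes, the polynomial
# treatment basis, and the toy adjacency are illustrative assumptions.
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # Propagate region features over the brain-region graph.
        return torch.relu(self.lin(adj @ x))

class VaryingCoefficientHead(nn.Module):
    """Outcome head whose weights are smooth functions of treatment t."""
    def __init__(self, feat_dim, degree=3):
        super().__init__()
        self.degree = degree
        # One weight vector per polynomial basis term of t.
        self.coef = nn.Linear(degree + 1, feat_dim, bias=False)

    def forward(self, h, t):
        # Polynomial basis [1, t, t^2, ...] as a stand-in for splines.
        basis = torch.stack([t ** k for k in range(self.degree + 1)], dim=-1)
        w = self.coef(basis)           # (batch, feat_dim)
        return (w * h).sum(dim=-1)     # predicted outcome per subject

class GVCNetSketch(nn.Module):
    def __init__(self, in_dim, hid_dim=32):
        super().__init__()
        self.gcn1 = GCNLayer(in_dim, hid_dim)
        self.gcn2 = GCNLayer(hid_dim, hid_dim)
        self.head = VaryingCoefficientHead(hid_dim)

    def forward(self, x, adj, t):
        h = self.gcn2(self.gcn1(x, adj), adj)  # (batch, regions, hid)
        h = h.mean(dim=1)                      # pool over brain regions
        return self.head(h, t)

# Counterfactual query: sweep treatment levels for fixed covariates.
model = GVCNetSketch(in_dim=8)
x, adj = torch.randn(4, 90, 8), torch.eye(90)  # 90 regions, toy graph
for t in (0.2, 0.5, 0.8):
    y = model(x, adj, torch.full((4,), t))
    print(t, y.detach().numpy())
```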
Review of Large Vision Models and Visual Prompt Engineering
Wang, Jiaqi, Liu, Zhengliang, Zhao, Lin, Wu, Zihao, Ma, Chong, Yu, Sigang, Dai, Haixing, Yang, Qiushi, Liu, Yiheng, Zhang, Songyao, Shi, Enze, Pan, Yi, Zhang, Tuo, Zhu, Dajiang, Li, Xiang, Jiang, Xi, Ge, Bao, Yuan, Yixuan, Shen, Dinggang, Liu, Tianming, Zhang, Shu
Visual prompt engineering is a fundamental technology in the field of visual Artificial General Intelligence, serving as a key component for achieving zero-shot capabilities. As large vision models develop, the importance of prompt engineering becomes increasingly evident. Designing suitable prompts for specific visual tasks has emerged as a meaningful research direction. This review summarizes the methods employed in the computer vision domain for large vision models and visual prompt engineering, exploring the latest advancements in the field. We present influential large models in the visual domain and a range of prompt engineering methods employed on these models. We hope that this review provides a comprehensive and systematic description of prompt engineering methods based on large vision models, offering valuable insights for future researchers exploring this field.
Artificial General Intelligence for Medical Imaging
Li, Xiang, Zhang, Lu, Wu, Zihao, Liu, Zhengliang, Zhao, Lin, Yuan, Yixuan, Liu, Jun, Li, Gang, Zhu, Dajiang, Yan, Pingkun, Li, Quanzheng, Liu, Wei, Liu, Tianming, Shen, Dinggang
In this review, we explore the potential applications of Artificial General Intelligence (AGI) models in healthcare, focusing on foundational Large Language Models (LLMs), Large Vision Models, and Large Multimodal Models. We emphasize the importance of integrating clinical expertise, domain knowledge, and multimodal capabilities into AGI models. In addition, we lay out key roadmaps that guide the development and deployment of healthcare AGI models. Throughout the review, we provide critical perspectives on the potential challenges and pitfalls associated with deploying large-scale AGI models in the medical field. This comprehensive review aims to offer insights into the future implications of AGI in medical imaging, healthcare and beyond.
AD-AutoGPT: An Autonomous GPT for Alzheimer's Disease Infodemiology
Dai, Haixing, Li, Yiwei, Liu, Zhengliang, Zhao, Lin, Wu, Zihao, Song, Suhang, Shen, Ye, Zhu, Dajiang, Li, Xiang, Li, Sheng, Yao, Xiaobai, Shi, Lu, Li, Quanzheng, Chen, Zhuo, Zhang, Donglan, Mai, Gengchen, Liu, Tianming
Alzheimer's disease (AD), characterized by cognitive impairments such as memory loss, predominantly affects aging populations, exerting an escalating burden on global healthcare systems as societies continue to age [3]. The significance of AD is further magnified by increasing life expectancy globally, with the disease now recognized as a leading cause of disability and dependency among older people [4]. Consequently, AD has substantial social, economic, and health-system implications, making its understanding and awareness of paramount importance [5, 6]. Despite the ubiquity and severity of AD, a gap persists in comprehensive, data-driven public understanding of this complex health narrative. Traditionally, public health professionals have had to rely on labor-intensive methods such as web scraping, API data collection, data postprocessing, and analysis/synthesis to gather insights from news media, health reports, and other textual sources [7, 8, 9].
Radiology-GPT: A Large Language Model for Radiology
Liu, Zhengliang, Zhong, Aoxiao, Li, Yiwei, Yang, Longtao, Ju, Chao, Wu, Zihao, Ma, Chong, Shu, Peng, Chen, Cheng, Kim, Sekeun, Dai, Haixing, Zhao, Lin, Zhu, Dajiang, Liu, Jun, Liu, Wei, Shen, Dinggang, Li, Xiang, Li, Quanzheng, Liu, Tianming
We introduce Radiology-GPT, a large language model for radiology. Using an instruction tuning approach on an extensive dataset of radiology domain knowledge, Radiology-GPT demonstrates superior performance compared to general language models such as StableLM, Dolly, and LLaMA. It exhibits significant versatility in radiological diagnosis, research, and communication. This work serves as a catalyst for future developments in clinical NLP. The successful implementation of Radiology-GPT demonstrates the potential of localized generative large language models tailored to distinct medical specialties, while ensuring adherence to privacy standards such as HIPAA. The prospect of developing individualized, large-scale language models that cater to the specific needs of various hospitals presents a promising direction. The fusion of conversational competence and domain-specific knowledge in these models is set to foster future development in healthcare AI. A demo of Radiology-GPT is available at https://huggingface.co/spaces/allen-eric/radiology-gpt.
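As an illustration of what instruction tuning data typically looks like, here is a hypothetical radiology record in the common (instruction, input, output) format; the abstract does not disclose Radiology-GPT's actual training schema, so every field below is an assumption.

```python
# Hypothetical instruction-tuning record; the real Radiology-GPT training
# data and schema are not described in this abstract.
record = {
    "instruction": "Derive the impression from the findings of this chest "
                   "radiograph report.",
    "input": "Findings: Heart size is normal. No focal consolidation, "
             "pleural effusion, or pneumothorax.",
    "output": "Impression: No acute cardiopulmonary abnormality.",
}
print(record["instruction"])
```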
Motion comfort and driver feel: An explorative study about their relation in remote driving
Papaioannou, Georgios, Zhao, Lin, Nybacka, Mikael, Jerrelind, Jenny, Happee, Riender, Drugge, Lars
Teleoperation is considered a viable option for controlling fully automated vehicles (AVs) of Levels 4 and 5 in special conditions. However, by bringing remote drivers into the loop, their driving experience should be realistic to ensure safe and comfortable remote control. Therefore, the remote control tower should be designed such that remote drivers receive high-quality cues regarding the vehicle state and the driving environment. In this direction, steering feedback could be manipulated to inform remote drivers about how the vehicle reacts to their commands. However, it has so far been unclear how remote drivers' steering feel could impact occupants' motion comfort. This paper explores how driver feel in remote driving (RD) and normal driving (ND) relates to motion comfort. More specifically, different types of steering feedback controllers are applied in (a) the steering system of a Research Concept Vehicle-model E (RCV-E) and (b) the steering system of a remote control tower. An experiment was performed to assess driver feel when the RCV-E is driven both normally and remotely. Subjective assessments and objective metrics are employed to evaluate drivers' feel and occupants' motion comfort in both remote and normal driving scenarios. The results illustrate that motion sickness and ride comfort are affected only by steering velocity in remote driving, while throttle input variations affect them in normal driving. The results also demonstrate that motion sickness and steering velocity both increase by around 25% from normal to remote driving.
Exploring the Trade-Offs: Unified Large Language Models vs Local Fine-Tuned Models for Highly-Specific Radiology NLI Task
Wu, Zihao, Zhang, Lu, Cao, Chao, Yu, Xiaowei, Dai, Haixing, Ma, Chong, Liu, Zhengliang, Zhao, Lin, Li, Gang, Liu, Wei, Li, Quanzheng, Shen, Dinggang, Li, Xiang, Zhu, Dajiang, Liu, Tianming
Recently, ChatGPT and GPT-4 have emerged and gained immense global attention due to their unparalleled performance in language processing. Despite demonstrating impressive capability in various open-domain tasks, their adequacy in highly specific fields like radiology remains untested. Radiology presents unique linguistic phenomena distinct from open-domain data due to its specificity and complexity. Assessing the performance of large language models (LLMs) in such specific domains is crucial not only for a thorough evaluation of their overall performance but also for providing valuable insights into future model design directions: whether model design should be generic or domain-specific. To this end, in this study we evaluate the performance of ChatGPT/GPT-4 on a radiology NLI task and compare it to other models fine-tuned specifically on task-related data samples. We also conduct a comprehensive investigation into ChatGPT/GPT-4's reasoning ability by introducing varying levels of inference difficulty. Our results show that 1) GPT-4 outperforms ChatGPT on the radiology NLI task; and 2) specifically fine-tuned models require significant amounts of task-related data to achieve performance comparable to ChatGPT/GPT-4. These findings demonstrate that constructing a generic model capable of solving various tasks across different domains is feasible.
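To make the evaluation setup concrete, the sketch below shows a minimal NLI scoring loop of the kind such a comparison requires. The premise/hypothesis pairs, the prompt wording, and the `query_model` stub are all illustrative assumptions; the paper's actual dataset and prompting protocol are not reproduced here.

```python
# Hedged sketch of a radiology NLI evaluation loop. Examples, labels,
# and the query_model stub are illustrative, not the paper's dataset.
from typing import Callable

EXAMPLES = [
    # (premise, hypothesis, gold label)
    ("No focal consolidation is seen.", "The lungs are clear.", "entailment"),
    ("There is a small right pleural effusion.",
     "No pleural effusion is present.", "contradiction"),
]

PROMPT = ("Premise: {p}\nHypothesis: {h}\n"
          "Answer with one word: entailment, contradiction, or neutral.")

def evaluate(query_model: Callable[[str], str]) -> float:
    """Score a model (LLM API call or local fine-tuned model) on accuracy."""
    correct = 0
    for p, h, gold in EXAMPLES:
        pred = query_model(PROMPT.format(p=p, h=h)).strip().lower()
        correct += (pred == gold)
    return correct / len(EXAMPLES)

# Toy stand-in model that always answers "entailment".
print(evaluate(lambda prompt: "entailment"))  # -> 0.5
```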
When Brain-inspired AI Meets AGI
Zhao, Lin, Zhang, Lu, Wu, Zihao, Chen, Yuzhong, Dai, Haixing, Yu, Xiaowei, Liu, Zhengliang, Zhang, Tuo, Hu, Xintao, Jiang, Xi, Li, Xiang, Zhu, Dajiang, Shen, Dinggang, Liu, Tianming
Artificial General Intelligence (AGI) has been a long-standing goal of humanity, with the aim of creating machines capable of performing any intellectual task that humans can do. To achieve this, AGI researchers draw inspiration from the human brain and seek to replicate its principles in intelligent machines. Brain-inspired artificial intelligence is a field that has emerged from this endeavor, combining insights from neuroscience, psychology, and computer science to develop more efficient and powerful AI systems. In this article, we provide a comprehensive overview of brain-inspired AI from the perspective of AGI. We begin with current progress in brain-inspired AI and its extensive connections with AGI. We then cover the key characteristics of both human intelligence and AGI (e.g., scaling, multimodality, and reasoning). We discuss important technologies toward achieving AGI in current AI systems, such as in-context learning and prompt tuning. We also investigate the evolution of AGI systems from both algorithmic and infrastructural perspectives. Finally, we explore the limitations and future of AGI.
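As a concrete illustration of in-context learning, one of the technologies mentioned above, the snippet below shows a minimal few-shot prompt in which the model must infer the task from examples alone, with no weight updates; the prompt content is illustrative.

```python
# Few-shot (in-context learning) prompt: the task is specified purely by
# examples in the context; no parameters are updated. Content illustrative.
prompt = """Translate English to French.
sea otter -> loutre de mer
cheese -> fromage
brain ->"""
print(prompt)  # a completion model would be expected to output "cerveau"
```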
Core-Periphery Principle Guided Redesign of Self-Attention in Transformers
Yu, Xiaowei, Zhang, Lu, Dai, Haixing, Lyu, Yanjun, Zhao, Lin, Wu, Zihao, Liu, David, Liu, Tianming, Zhu, Dajiang
Designing more efficient, reliable, and explainable neural network architectures is critical to studies based on artificial intelligence (AI) techniques. Previous studies have found, by post-hoc analysis, that the best-performing artificial neural networks (ANNs) surprisingly resemble biological neural networks (BNNs), which indicates that ANNs and BNNs may share common principles for achieving optimal performance in either machine learning or cognitive/behavioral tasks. Inspired by this phenomenon, we proactively instill organizational principles of BNNs to guide the redesign of ANNs. We leverage the Core-Periphery (CP) organization, which is widely found in human brain networks, to guide the information communication mechanism in the self-attention of the vision transformer (ViT), and we name this novel framework CP-ViT. In CP-ViT, the attention operation between nodes is defined by a sparse graph with a Core-Periphery structure (CP graph), where the core nodes are redesigned and reorganized to play an integrative role and serve as a center through which periphery nodes exchange information. We evaluated the proposed CP-ViT on multiple public datasets, including medical image datasets (INbreast) and natural image datasets. Interestingly, by incorporating the BNN-derived principle (CP structure) into the redesign of ViT, our CP-ViT outperforms other state-of-the-art ANNs. In general, our work advances the state of the art in three aspects: 1) this work provides novel insights for brain-inspired AI: we can utilize the principles found in BNNs to guide and improve ANN architecture design; 2) we show that there exist sweet spots of CP graphs that lead to CP-ViTs with significantly improved performance; and 3) the core nodes in CP-ViT correspond to task-related, meaningful, and important image patches, which can significantly enhance the interpretability of the trained deep model.
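To clarify the mechanism, here is a minimal sketch of self-attention masked by a Core-Periphery graph, the idea behind CP-ViT: core tokens attend everywhere, while periphery-periphery links are pruned. Which tokens are designated "core", the mask layout, and the sizes are illustrative assumptions, not the paper's exact design.

```python
# Minimal sketch of attention masked by a Core-Periphery (CP) graph.
# The core/periphery split and sizes are illustrative assumptions.
import torch

def cp_mask(n_tokens: int, n_core: int) -> torch.Tensor:
    """Core nodes attend to everything; periphery nodes only to the core."""
    mask = torch.zeros(n_tokens, n_tokens, dtype=torch.bool)
    mask[:n_core, :] = True    # core -> all
    mask[:, :n_core] = True    # all -> core
    mask.fill_diagonal_(True)  # every token keeps self-attention
    return mask

def cp_attention(q, k, v, n_core):
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    mask = cp_mask(q.shape[-2], n_core)
    # Prune periphery-periphery links by masking them out before softmax.
    scores = scores.masked_fill(~mask, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(1, 197, 64)     # e.g. 196 patches + CLS token
out = cp_attention(q, k, v, n_core=32)  # (1, 197, 64)
```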
Coupling Artificial Neurons in BERT and Biological Neurons in the Human Brain
Liu, Xu, Zhou, Mengyue, Shi, Gaosheng, Du, Yu, Zhao, Lin, Wu, Zihao, Liu, David, Liu, Tianming, Hu, Xintao
Linking computational natural language processing (NLP) models to neural responses to language in the human brain, on the one hand, facilitates efforts to disentangle the neural representations underpinning language perception and, on the other hand, provides neurolinguistic evidence to evaluate and improve NLP models. Mappings between an NLP model's representations of linguistic input and the brain activities it evokes are typically deployed to reveal this symbiosis. However, two critical problems limit its advancement: 1) the model's representations (artificial neurons, ANs) rely on layer-level embeddings and thus lack fine granularity; and 2) the brain activities (biological neurons, BNs) are limited to neural recordings of isolated cortical units (i.e., voxels/regions) and thus fail to capture the integration of and interaction among brain functions. To address these problems, in this study we 1) define fine-grained ANs in transformer-based NLP models (BERT in this study) and measure their temporal activations to input text sequences; 2) define BNs as functional brain networks (FBNs) extracted from functional magnetic resonance imaging (fMRI) data to capture functional interactions in the brain; and 3) couple ANs and BNs by maximizing the synchronization of their temporal activations. Our experimental results demonstrate that 1) the activations of ANs and BNs are significantly synchronized; 2) the ANs carry meaningful linguistic/semantic information and anchor to their BN signatures; and 3) the anchored BNs are interpretable in a neurolinguistic context. Overall, our study introduces a novel, general, and effective framework to link transformer-based NLP models and neural activities in response to language, and it may provide novel insights for future studies such as brain-inspired evaluation and development of NLP models.
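As a concrete sketch of the coupling step, the code below scores every AN/BN pair by the Pearson correlation of their temporal activations and anchors each AN to its best-matching BN. The random signals and the greedy argmax matching are illustrative assumptions; the paper's actual synchronization objective may differ.

```python
# Hedged sketch of "coupling by synchronization": score each
# artificial-neuron (AN) / brain-network (BN) pair by Pearson correlation
# of their temporal activations. Signal shapes and the argmax matching
# are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
T, n_ans, n_bns = 200, 50, 10              # time points, ANs, BNs (toy sizes)
an_acts = rng.standard_normal((n_ans, T))  # AN activations over a text sequence
bn_acts = rng.standard_normal((n_bns, T))  # FBN time series from fMRI

def zscore(x):
    return (x - x.mean(axis=-1, keepdims=True)) / x.std(axis=-1, keepdims=True)

# Pearson correlation of z-scored signals reduces to a scaled dot product.
sync = zscore(an_acts) @ zscore(bn_acts).T / T  # (n_ans, n_bns)

anchors = sync.argmax(axis=1)  # BN signature for each AN
print(anchors[:5], sync.max(axis=1)[:5])
```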