Liang, Gongbo
Benchmarking Robustness of Contrastive Learning Models for Medical Image-Report Retrieval
Deanda, Demetrio, Masupalli, Yuktha Priya, Yang, Jeong, Lee, Young, Cao, Zechun, Liang, Gongbo
Medical images and reports offer invaluable insights into patient health. The heterogeneity and complexity of these data hinder effective analysis. To bridge this gap, we investigate contrastive learning models for cross-domain retrieval, which associates medical images with their corresponding clinical reports. This study benchmarks the robustness of four state-of-the-art contrastive learning models: CLIP, CXR-RePaiR, MedCLIP, and CXR-CLIP. We introduce an occlusion retrieval task to evaluate model performance under varying levels of image corruption. Our findings reveal that all evaluated models are highly sensitive to out-of-distribution data, as evidenced by the proportional decrease in performance with increasing occlusion levels. While MedCLIP exhibits slightly more robustness, its overall performance remains significantly behind CXR-CLIP and CXR-RePaiR. CLIP, trained on a general-purpose dataset, struggles with medical image-report retrieval, highlighting the importance of domain-specific training data. The evaluation of this work suggests that more effort needs to be spent on improving the robustness of these models. By addressing these limitations, we can develop more reliable cross-domain retrieval models for medical applications.
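As a rough illustration of the occlusion retrieval task described in this abstract, the sketch below masks part of an image and scores image-to-report retrieval with recall@k. It is a minimal sketch under stated assumptions, not the authors' benchmark code: the square-patch masking scheme, the function names `occlude` and `recall_at_k`, and the convention that row i of each embedding matrix is a matched image-report pair are all hypothetical.

```python
import numpy as np

def occlude(image, fraction, rng):
    """Zero out one random square-ish patch covering roughly `fraction` of the image.

    Assumption: a single rectangular mask; the benchmark may use another scheme.
    """
    out = image.copy()
    h, w = image.shape[:2]
    ph = int(round(h * np.sqrt(fraction)))  # patch height
    pw = int(round(w * np.sqrt(fraction)))  # patch width
    if ph == 0 or pw == 0:
        return out
    top = rng.integers(0, h - ph + 1)
    left = rng.integers(0, w - pw + 1)
    out[top:top + ph, left:left + pw] = 0
    return out

def recall_at_k(image_emb, report_emb, k):
    """Fraction of images whose paired report (same row index) ranks in the top k."""
    a = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    b = report_emb / np.linalg.norm(report_emb, axis=1, keepdims=True)
    sims = a @ b.T                              # cosine similarity matrix
    topk = np.argsort(-sims, axis=1)[:, :k]     # indices of the k most similar reports
    hits = (topk == np.arange(len(a))[:, None]).any(axis=1)
    return float(hits.mean())
```

In a benchmark of this kind, one would embed the occluded images with each model at several occlusion fractions and compare the resulting recall@k values against the unoccluded baseline.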
Exploring the Potential of Large Language Models in Public Transportation: San Antonio Case Study
Jonnala, Ramya, Liang, Gongbo, Yang, Jeong, Alsmadi, Izzat
The integration of large language models (LLMs) into public transit systems presents a transformative opportunity to enhance urban mobility. This study explores the potential of LLMs to revolutionize public transportation management within the context of San Antonio's transit system. Leveraging the capabilities of LLMs in natural language processing and data analysis, we investigate their capabilities to optimize route planning, reduce wait times, and provide personalized travel assistance. By utilizing the General Transit Feed Specification (GTFS) and other relevant data, this research aims to demonstrate how LLMs can potentially improve resource allocation, elevate passenger satisfaction, and inform data-driven decision-making in transit operations. A comparative analysis of different ChatGPT models was conducted to assess their ability to understand transportation information, retrieve relevant data, and provide comprehensive responses. Findings from this study suggest that while LLMs hold immense promise for public transit, careful engineering and fine-tuning are essential to realizing their full potential. San Antonio serves as a case study to inform the development of LLM-powered transit systems in other urban environments.
Using Large Language Models in Public Transit Systems, San Antonio as a case study
Jonnala, Ramya, Liang, Gongbo, Yang, Jeong, Alsmadi, Izzat
The integration of large language models (LLMs) into public transit systems represents a significant advancement in urban transportation management and passenger experience. This study examines the impact of LLMs within San Antonio's public transit system, leveraging their capabilities in natural language processing, data analysis, and real-time communication. By utilizing GTFS and other public transportation information, the research highlights the transformative potential of LLMs in enhancing route planning, reducing wait times, and providing personalized travel assistance. The city of San Antonio serves as our case study, as part of a project that aims to demonstrate how LLMs can optimize resource allocation, improve passenger satisfaction, and support decision-making processes in transit management. We evaluated LLM responses to questions that test both information retrieval and understanding. Ultimately, we believe that the adoption of LLMs in public transit systems can lead to more efficient, responsive, and user-friendly transportation networks, providing a model for other cities to follow.
Mutation-Based Adversarial Attacks on Neural Text Detectors
Liang, Gongbo, Guerrero, Jesus, Alsmadi, Izzat
Neural text detectors aim to identify the characteristics that distinguish neural (machine-generated) from human texts. To challenge such detectors, adversarial attacks can alter the statistical characteristics of the generated text, making the detection task increasingly difficult. Inspired by the advances of mutation analysis in software development and testing, in this paper, we propose character- and word-based mutation operators for generating adversarial samples to attack state-of-the-art neural text detectors. This falls under white-box adversarial attacks. In such attacks, attackers have access to the original text and create mutation instances based on this original text. The ultimate goal is to confuse machine learning models and classifiers and decrease their prediction accuracy.
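To make the idea of character- and word-based mutation operators concrete, here is a minimal sketch. The specific operators (homoglyph substitution and dictionary-based word replacement), the lookup tables `HOMOGLYPHS` and `SYNONYMS`, and the function names are illustrative assumptions, not the operators defined in the paper.

```python
import random

# Hypothetical lookup tables, chosen only for illustration.
HOMOGLYPHS = {"a": "\u0430", "e": "\u0435", "o": "\u043e"}  # Latin -> Cyrillic lookalikes
SYNONYMS = {"big": "large", "quick": "fast"}

def mutate_chars(text, rate, rng):
    """Character-level operator: swap eligible characters for lookalikes with probability `rate`."""
    out = []
    for ch in text:
        if ch in HOMOGLYPHS and rng.random() < rate:
            out.append(HOMOGLYPHS[ch])
        else:
            out.append(ch)
    return "".join(out)

def mutate_words(text, rate, rng):
    """Word-level operator: replace eligible words with a listed synonym with probability `rate`."""
    words = text.split(" ")
    mutated = [
        SYNONYMS[w] if w in SYNONYMS and rng.random() < rate else w
        for w in words
    ]
    return " ".join(mutated)
```

Each mutated string is a candidate adversarial sample: it stays close to the original text for a human reader while shifting the surface statistics a detector may rely on.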
A Mutation-based Text Generation for Adversarial Machine Learning Applications
Guerrero, Jesus, Liang, Gongbo, Alsmadi, Izzat
Currently, text generation is widely used in Machine Learning (ML)-based or Artificial Intelligence (AI)-based natural language applications such as language-to-language translation, document summarization, and headline or abstract generation. Those applications can be classified into different categories. In one classification, they can be divided into short versus long text generation applications. Short text generation applications include examples such as predicting the next word or statement, image caption generation, short language translation, and document summarization. Long text generation applications include long text story completion, review generation, language translation, poetry generation, and question answering.
Dynamic Image for 3D MRI Image Alzheimer's Disease Classification
Xing, Xin, Liang, Gongbo, Blanton, Hunter, Rafique, Muhammad Usman, Wang, Chris, Lin, Ai-Ling, Jacobs, Nathan
We propose to apply a 2D CNN architecture to 3D MRI image Alzheimer's disease classification. Training a 3D convolutional neural network (CNN) is time-consuming and computationally expensive. We make use of approximate rank pooling to transform the 3D MRI image volume into a 2D image to use as input to a 2D CNN. We show our proposed CNN model achieves $9.5\%$ better Alzheimer's disease classification accuracy than the baseline 3D models. We also show that our method allows for efficient training, requiring only $20\%$ of the training time of 3D CNN models. The code is available online: https://github.com/UkyVision/alzheimer-project.
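The rank pooling step mentioned in this abstract can be sketched as a weighted sum over slices. This is a minimal sketch, not the code from the linked repository: it assumes the simplified coefficients alpha_t = 2t - T - 1 from the dynamic image literature, a volume laid out as (slices, height, width), and the hypothetical function name `approximate_rank_pooling`.

```python
import numpy as np

def approximate_rank_pooling(volume):
    """Collapse a (T, H, W) volume into one (H, W) image.

    Assumption: slice t (1-indexed) gets weight alpha_t = 2t - T - 1, so early
    slices are weighted negatively and late slices positively.
    """
    volume = np.asarray(volume, dtype=np.float64)
    T = volume.shape[0]
    alpha = 2.0 * np.arange(1, T + 1) - T - 1
    # Weighted sum over the slice axis.
    return np.tensordot(alpha, volume, axes=(0, 0))
```

Because the weights sum to zero, a volume that does not change across slices maps to an all-zero image; the output encodes how intensity varies along the slice axis. The resulting 2D image would typically be rescaled before being passed to a 2D CNN.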