AITopics | Ge, Yao

Collaborating Authors

Ge, Yao

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

HILGEN: Hierarchically-Informed Data Generation for Biomedical NER Using Knowledgebases and Large Language Models

Ge, Yao, Guo, Yuting, Das, Sudeshna, Rajwal, Swati, Bozkurt, Selen, Sarker, Abeed

arXiv.org Artificial IntelligenceMar-6-2025

We present HILGEN, a Hierarchically-Informed Data Generation approach that combines domain knowledge from the Unified Medical Language System (UMLS) with synthetic data generated by large language models (LLMs), specifically GPT-3.5. Our approach leverages UMLS's hierarchical structure to expand training data with related concepts, while incorporating contextual information from LLMs through targeted prompts aimed at automatically generating synthetic examples for sparsely occurring named entities. The performance of the HILGEN approach was evaluated across four biomedical NER datasets (MIMIC III, BC5CDR, NCBI-Disease, and Med-Mentions) using BERT-Large and DANN (Data Augmentation with Nearest Neighbor Classifier) models, applying various data generation strategies, including UMLS, GPT-3.5, and their best ensemble. For the BERT-Large model, incorporating UMLS led to an average F1 score improvement of 40.36%, while using GPT-3.5 resulted in a comparable average increase of 40.52%. The Best-Ensemble approach using BERT-Large achieved the highest improvement, with an average increase of 42.29%. DANN model's F1 score improved by 22.74% on average using the UMLS-only approach. The GPT-3.5-based method resulted in a 21.53% increase, and the Best-Ensemble DANN model showed a more notable improvement, with an average increase of 25.03%. Our proposed HILGEN approach improves NER performance in few-shot settings without requiring additional manually annotated data. Our experiments demonstrate that an effective strategy for optimizing biomedical NER is to combine biomedical knowledge curated in the past, such as the UMLS, and generative LLMs to create synthetic training instances. Our future research will focus on exploring additional innovative synthetic data generation strategies for further improving NER performance.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2503.0493

Country:

North America > United States (0.28)
Asia > Middle East > UAE (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Two-layer retrieval augmented generation framework for low-resource medical question-answering: proof of concept using Reddit data

Das, Sudeshna, Ge, Yao, Guo, Yuting, Rajwal, Swati, Hairston, JaMor, Powell, Jeanne, Walker, Drew, Peddireddy, Snigdha, Lakamana, Sahithi, Bozkurt, Selen, Reyna, Matthew, Sameni, Reza, Xiao, Yunyu, Kim, Sangmi, Chandler, Rasheeta, Hernandez, Natalie, Mowery, Danielle, Wightman, Rachel, Love, Jennifer, Spadaro, Anthony, Perrone, Jeanmarie, Sarker, Abeed

arXiv.org Artificial IntelligenceMay-29-2024

Retrieval augmented generation (RAG) provides the capability to constrain generative model outputs, and mitigate the possibility of hallucination, by providing relevant in-context text. The number of tokens a generative large language model (LLM) can incorporate as context is finite, thus limiting the volume of knowledge from which to generate an answer. We propose a two-layer RAG framework for query-focused answer generation and evaluate a proof-of-concept for this framework in the context of query-focused summary generation from social media forums, focusing on emerging drug-related information. The evaluations demonstrate the effectiveness of the two-layer framework in resource constrained settings to enable researchers in obtaining near real-time data from users.

large language model, machine learning, xylazine, (21 more...)

arXiv.org Artificial Intelligence

2405.19519

Country: North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)

Genre:

Research Report > New Finding (0.69)
Research Report > Experimental Study (0.48)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Consumer Health (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Add feedback

Reddit-Impacts: A Named Entity Recognition Dataset for Analyzing Clinical and Social Effects of Substance Use Derived from Social Media

Ge, Yao, Das, Sudeshna, O'Connor, Karen, Al-Garadi, Mohammed Ali, Gonzalez-Hernandez, Graciela, Sarker, Abeed

arXiv.org Artificial IntelligenceMay-9-2024

Substance use disorders (SUDs) are a growing concern globally, necessitating enhanced understanding of the problem and its trends through data-driven research. Social media are unique and important sources of information about SUDs, particularly since the data in such sources are often generated by people with lived experiences. In this paper, we introduce Reddit-Impacts, a challenging Named Entity Recognition (NER) dataset curated from subreddits dedicated to discussions on prescription and illicit opioids, as well as medications for opioid use disorder. The dataset specifically concentrates on the lesser-studied, yet critically important, aspects of substance use--its clinical and social impacts. We collected data from chosen subreddits using the publicly available Application Programming Interface for Reddit. We manually annotated text spans representing clinical and social impacts reported by people who also reported personal nonmedical use of substances including but not limited to opioids, stimulants and benzodiazepines. Our objective is to create a resource that can enable the development of systems that can automatically detect clinical and social impacts of substance use from text-based social media data. The successful development of such systems may enable us to better understand how nonmedical use of substances affects individual health and societal dynamics, aiding the development of effective public health strategies. In addition to creating the annotated data set, we applied several machine learning models to establish baseline performances. Specifically, we experimented with transformer models like BERT, and RoBERTa, one few-shot learning model DANN by leveraging the full training dataset, and GPT-3.5 by using one-shot learning, for automatic NER of clinical and social impacts. The dataset has been made available through the 2024 SMM4H shared tasks.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2405.06145

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
North America > United States > California > Los Angeles County (0.14)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Addiction Disorder (0.89)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

Scalable Attribution of Adversarial Attacks via Multi-Task Learning

Guo, Zhongyi, Han, Keji, Ge, Yao, Ji, Wei, Li, Yun

arXiv.org Artificial IntelligenceFeb-25-2023

Deep neural networks (DNNs) can be easily fooled by adversarial attacks during inference phase when attackers add imperceptible perturbations to original examples, i.e., adversarial examples. Many works focus on adversarial detection and adversarial training to defend against adversarial attacks. However, few works explore the tool-chains behind adversarial examples, which can help defenders to seize the clues about the originator of the attack, their goals, and provide insight into the most effective defense algorithm against corresponding attacks. With such a gap, it is necessary to develop techniques that can recognize tool-chains that are leveraged to generate the adversarial examples, which is called Adversarial Attribution Problem (AAP). In this paper, AAP is defined as the recognition of three signatures, i.e., {\em attack algorithm}, {\em victim model} and {\em hyperparameter}. Current works transfer AAP into single label classification task and ignore the relationship between these signatures. The former will meet combination explosion problem as the number of signatures is increasing. The latter dictates that we cannot treat AAP simply as a single task problem. We first conduct some experiments to validate the attributability of adversarial examples. Furthermore, we propose a multi-task learning framework named Multi-Task Adversarial Attribution (MTAA) to recognize the three signatures simultaneously. MTAA contains perturbation extraction module, adversarial-only extraction module and classification and regression module. It takes the relationship between attack algorithm and corresponding hyperparameter into account and uses the uncertainty weighted loss to adjust the weights of three recognition tasks. The experimental results on MNIST and ImageNet show the feasibility and scalability of the proposed framework as well as its effectiveness in dealing with false alarms.

artificial intelligence, attack algorithm, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2302.14059

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback