AITopics | He, Yongqun

Collaborating Authors

He, Yongqun

Cancer Vaccine Adjuvant Name Recognition from Biomedical Literature using Large Language Models

Rehana, Hasin, Zheng, Jie, Yeh, Leo, Bansal, Benu, Çam, Nur Bengisu, Jemiyo, Christianah, McGregor, Brett, Özgür, Arzucan, He, Yongqun, Hur, Junguk

arXiv.org Artificial Intelligence, Feb-12-2025

Motivation: An adjuvant is a chemical incorporated into vaccines that enhances their efficacy by improving the immune response. Identifying adjuvant names from cancer vaccine studies is essential for furthering research and enhancing immunotherapies. However, manual curation from the constantly expanding biomedical literature poses significant challenges. This study explores the automated recognition of vaccine adjuvant names using Large Language Models (LLMs), specifically Generative Pretrained Transformers (GPT) and Large Language Model Meta AI (Llama). Methods: We utilized two datasets: 97 clinical trial records from AdjuvareDB and 290 abstracts annotated with the Vaccine Adjuvant Compendium (VAC). GPT-4o and Llama 3.2 were employed in zero-shot and few-shot learning paradigms with up to four examples per prompt. Prompts explicitly targeted adjuvant names, testing the impact of contextual information such as substances or interventions. Outputs underwent automated and manual validation for accuracy and consistency. Results: GPT-4o attained 100% Precision across all settings while exhibiting notable improvements in Recall and F1-scores, particularly when interventions were incorporated. On the VAC dataset, GPT-4o achieved a maximum F1-score of 77.32% with interventions, surpassing Llama-3.2-3B by approximately 2%. On the AdjuvareDB dataset, GPT-4o reached an F1-score of 81.67% for three-shot prompting with interventions, surpassing Llama-3.2-3B's maximum F1-score of 65.62%. Conclusion: Our findings demonstrate that LLMs excel at identifying adjuvant names, including rare variations of naming representation. This study emphasizes the capability of LLMs to enhance cancer vaccine development by efficiently extracting insights. Future work aims to broaden the framework to encompass various biomedical literature and enhance model generalizability across various vaccines and adjuvants.

adjuvant, large language model, machine learning, (20 more...)

arXiv:2502.09659

Country: North America > United States > North Dakota > Grand Forks County > Grand Forks (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Vaccines (1.00)
Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Fourfold Pathogen Reference Ontology Suite

Babcock, Shane, Benson, Carter, De Colle, Giacomo, Cohen, Sydney, Diehl, Alexander D., Challa, Ram A. N. R., Huffman, Anthony, He, Yongqun, Beverley, John

arXiv.org Artificial Intelligence, Dec-30-2024

Infectious diseases remain a critical global health challenge, and the integration of standardized ontologies plays a vital role in managing related data. The Infectious Disease Ontology (IDO) and its extensions, such as the Coronavirus Infectious Disease Ontology (CIDO), are essential for organizing and disseminating information related to infectious diseases. The COVID-19 pandemic highlighted the need for updating IDO and its virus-specific extensions. There is an additional need to update IDO extensions specific to bacteria, fungus, and parasite infectious diseases. We adopt the "hub and spoke" methodology to generate pathogen-specific extensions of IDO: Virus Infectious Disease Ontology (VIDO), Bacteria Infectious Disease Ontology (BIDO), Mycosis Infectious Disease Ontology (MIDO), and Parasite Infectious Disease Ontology (PIDO). The creation of pathogen-specific reference ontologies advances modularization and reusability of infectious disease data within the IDO ecosystem. Future work will focus on further refining these ontologies, creating new extensions, and developing application ontologies based on them, in line with ongoing efforts to standardize biological and biomedical terminologies for improved data sharing and analysis.

artificial intelligence, infectious disease ontology, ontology, (10 more...)

arXiv:2501.01454

Country: North America > United States (1.00)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Infections and Infectious Diseases > Fungal Infection (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)

Credentials in the Occupation Ontology

Beverley, John, McGill, Robin, Smith, Sam, Zheng, Jie, De Colle, Giacomo, Wilson, Finn, Diller, Matthew, Duncan, William D., Hogan, William R., He, Yongqun

arXiv.org Artificial Intelligence, Apr-30-2024

The term credential encompasses educational certificates, degrees, certifications, and government-issued licenses. An occupational credential is a verification of an individual's qualification or competence issued by a third party with relevant authority. Job seekers often leverage such credentials as evidence that desired qualifications are satisfied by their holders. Many U.S. education and workforce development organizations have recognized the importance of credentials for employment and the challenges of understanding the value of credentials. In this study, we identified and ontologically defined credential and credential-related terms at the textual and semantic levels based on the Occupation Ontology (OccO), a BFO-based ontology. Different credential types and their authorization logic are modeled. We additionally defined a high-level hierarchy of credential-related terms and relations among many terms, which were initiated in concert with the Alabama Talent Triad (ATT) program, which aims to connect learners, earners, employers and education/training providers through credentials and skills. To our knowledge, our research provides for the first time systematic ontological modeling of the important domain of credentials and related contents, supporting enhanced credential data and knowledge integration in the future.

artificial intelligence, credential, holder, (14 more...)

arXiv:2405.00186

Country:

North America > United States > Alabama (0.29)
North America > United States > Florida > Alachua County > Gainesville (0.14)

Genre: Research Report (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (0.95)
Education > Educational Setting (0.69)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)

Grounding Realizable Entities

Rabenberg, Michael, Benson, Carter, Donato, Federico, He, Yongqun, Huffman, Anthony, Babcock, Shane, Beverley, John

arXiv.org Artificial Intelligence, Apr-30-2024

Ontological representations of qualities, dispositions, and roles have been refined over the past decade, clarifying subtle distinctions in life science research. After articulating a widely used characterization of these entities within the context of Basic Formal Ontology (BFO), we identify gaps in this treatment and motivate the need for supplementing the BFO characterization. By way of supplement, we propose definitions for grounding relations holding between qualities and dispositions, and between dispositions and roles, illustrating our proposal by representing subtle aspects of host-pathogen interactions.

artificial intelligence, disposition, realizable entity, (12 more...)

arXiv:2405.00197

Country: North America > United States (0.69)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.94)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.93)
Government (0.68)
Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)

Evaluation of GPT and BERT-based models on identifying protein-protein interactions in biomedical text

Rehana, Hasin, Çam, Nur Bengisu, Basmaci, Mert, Zheng, Jie, Jemiyo, Christianah, He, Yongqun, Özgür, Arzucan, Hur, Junguk

arXiv.org Artificial Intelligence, Dec-12-2023

Detecting protein-protein interactions (PPIs) is crucial for understanding genetic mechanisms, disease pathogenesis, and drug design. However, with the fast-paced growth of biomedical literature, there is a growing need for automated and accurate extraction of PPIs to facilitate scientific knowledge discovery. Pre-trained language models, such as generative pre-trained transformers (GPT) and bidirectional encoder representations from transformers (BERT), have shown promising results in natural language processing (NLP) tasks. We evaluated the performance of PPI identification of multiple GPT and BERT models using three manually curated gold-standard corpora: Learning Language in Logic (LLL) with 164 PPIs in 77 sentences, Human Protein Reference Database with 163 PPIs in 145 sentences, and Interaction Extraction Performance Assessment with 335 PPIs in 486 sentences. BERT-based models achieved the best overall performance, with BioBERT achieving the highest recall (91.95%) and F1-score (86.84%) and PubMedBERT achieving the highest precision (85.25%). Interestingly, despite not being explicitly trained for biomedical texts, GPT-4 achieved commendable performance, comparable to the top-performing BERT models. It achieved a precision of 88.37%, a recall of 85.14%, and an F1-score of 86.49% on the LLL dataset. These results suggest that GPT models can effectively detect PPIs from text data, offering promising avenues for application in biomedical literature mining. Further research could explore how these models might be fine-tuned for even more specialized tasks within the biomedical domain.

large language model, machine learning, natural language, (18 more...)

arXiv:2303.17728

Country:

North America > United States > Michigan (0.28)
North America > United States > North Dakota > Grand Forks County > Grand Forks (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

OntoPlot: A Novel Visualisation for Non-hierarchical Associations in Large Ontologies

Yang, Ying, Wybrow, Michael, Li, Yuan-Fang, Czauderna, Tobias, He, Yongqun

arXiv.org Artificial Intelligence, Aug-1-2019

Ontologies are formal representations of concepts and complex relationships among them. They have been widely used to capture comprehensive domain knowledge in areas such as biology and medicine, where large and complex ontologies can contain hundreds of thousands of concepts. Especially due to the large size of ontologies, visualisation is useful for authoring, exploring and understanding their underlying data. Existing ontology visualisation tools generally focus on the hierarchical structure, giving much less emphasis to non-hierarchical associations. In this paper we present OntoPlot, a novel visualisation specifically designed to facilitate the exploration of all concept associations whilst still showing an ontology's large hierarchical structure. This hybrid visualisation combines icicle plots, visual compression techniques and interactivity, improving space-efficiency and reducing visual structural complexity. We conducted a user study with domain experts to evaluate the usability of OntoPlot, comparing it with the de facto ontology editor Protégé. The results confirm that OntoPlot attains our design goals for association-related tasks and is strongly favoured by domain experts.

cardiology, ontology, vascular disease, (21 more...)

arXiv:1908.00688

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)
Overview (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
