AITopics | Wu, Yonghui

Wu, Yonghui

Identifying Symptoms of Delirium from Clinical Narratives Using Natural Language Processing

Chen, Aokun, Paredes, Daniel, Yu, Zehao, Lou, Xiwei, Brunson, Roberta, Thomas, Jamie N., Martinez, Kimberly A., Lucero, Robert J., Magoc, Tanja, Solberg, Laurence M., Snigurska, Urszula A., Ser, Sarah E., Prosperi, Mattia, Bian, Jiang, Bjarnadottir, Ragnhildur I., Wu, Yonghui

arXiv.org Artificial Intelligence, Mar-31-2023

Delirium is an acute decline or fluctuation in attention, awareness, or other cognitive function that can lead to serious adverse outcomes. Despite the severe outcomes, delirium is frequently unrecognized and uncoded in patients' electronic health records (EHRs) due to its transient and diverse nature. Natural language processing (NLP), a key technology that extracts medical concepts from clinical narratives, has shown great potential in studies of delirium outcomes and symptoms. To assist in the diagnosis and phenotyping of delirium, we formed an expert panel to categorize diverse delirium symptoms, composed annotation guidelines, created a delirium corpus with diverse delirium symptoms, and developed NLP methods to extract delirium symptoms from clinical notes. We compared 5 state-of-the-art transformer models including 2 models (BERT and RoBERTa) from the general domain and 3 models (BERT_MIMIC, RoBERTa_MIMIC, and GatorTron) from the clinical domain. GatorTron achieved the best strict and lenient F1 scores of 0.8055 and 0.8759, respectively. We conducted an error analysis to identify challenges in annotating delirium symptoms and developing NLP systems. To the best of our knowledge, this is the first large language model-based delirium symptom extraction system. Our study lays the foundation for the future development of computable phenotypes and diagnosis methods for delirium.
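The abstract reports strict and lenient F1 scores without defining them. A minimal sketch follows, assuming the convention common in clinical concept extraction evaluation: a strict match requires identical span boundaries and label, while a lenient match only requires overlapping spans with the same label. The `Span` layout, the `span_f1` function, and the `SYMPTOM` label are hypothetical names for illustration and are not taken from the paper.

```python
from typing import List, Tuple

# Hypothetical span layout: (start offset, end offset exclusive, label)
Span = Tuple[int, int, str]


def _overlaps(a: Span, b: Span) -> bool:
    """Same label and at least one shared character position."""
    return a[2] == b[2] and a[0] < b[1] and b[0] < a[1]


def span_f1(gold: List[Span], pred: List[Span], lenient: bool = False) -> float:
    """F1 over spans; strict = exact boundaries, lenient = any overlap."""
    if lenient:
        match = _overlaps
    else:
        match = lambda a, b: a == b
    # Predictions that hit some gold span (for precision)
    tp_pred = sum(any(match(p, g) for g in gold) for p in pred)
    # Gold spans that are hit by some prediction (for recall)
    tp_gold = sum(any(match(g, p) for p in pred) for g in gold)
    precision = tp_pred / len(pred) if pred else 0.0
    recall = tp_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data: the second prediction starts two characters late.
gold = [(0, 9, "SYMPTOM"), (20, 31, "SYMPTOM")]
pred = [(0, 9, "SYMPTOM"), (22, 31, "SYMPTOM")]

strict_score = span_f1(gold, pred)                 # only the exact span counts
lenient_score = span_f1(gold, pred, lenient=True)  # the shifted span also counts
```

Under this assumed definition the lenient score can never be lower than the strict score, which is consistent with the gap between the two figures reported above.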

artificial intelligence, machine learning, natural language, (16 more...)

2304.00111

Country:

North America > United States > California (0.28)
North America > United States > Florida > Alachua County > Gainesville (0.15)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)

Extracting Thyroid Nodules Characteristics from Ultrasound Reports Using Transformer-based Natural Language Processing Methods

Pathak, Aman, Yu, Zehao, Paredes, Daniel, Monsour, Elio Paul, Rocha, Andrea Ortiz, Brito, Juan P., Ospina, Naykky Singh, Wu, Yonghui

arXiv.org Artificial Intelligence, Mar-31-2023

The characteristics of thyroid nodules are often documented in clinical narratives such as ultrasound reports. Previous studies have examined natural language processing (NLP) methods for extracting a limited number of characteristics (<9) using rule-based NLP systems. In this study, a multidisciplinary team of NLP experts and thyroid specialists identified thyroid nodule characteristics that are important for clinical care, composed annotation guidelines, developed a corpus, and compared 5 state-of-the-art transformer-based NLP methods, including BERT, RoBERTa, LongFormer, DeBERTa, and GatorTron, for extraction of thyroid nodule characteristics from ultrasound reports. Our GatorTron model, a transformer-based large language model trained using over 90 billion words of text, achieved the best strict and lenient F1-scores of 0.8851 and 0.9495, respectively, for the extraction of 16 thyroid nodule characteristics, and 0.9321 for linking characteristics to nodules, outperforming other clinical transformer models. To the best of our knowledge, this is the first study to systematically categorize and apply transformer-based NLP models to extract a large number of clinically relevant thyroid nodule characteristics from ultrasound reports. This study lays the groundwork for assessing the documentation quality of thyroid ultrasound reports and examining outcomes of patients with thyroid nodules using electronic health records.

Introduction: Thyroid cancer overdiagnosis is common, harmful, and costly. More than 44,000 new cases of thyroid cancer are expected in the US in 2022.

machine learning, natural language, thyroid nodule, (17 more...)

2304.00115

Country: North America > United States (1.00)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Endocrinology (1.00)
Health & Medicine > Health Care Technology > Medical Record (1.00)
Health & Medicine > Therapeutic Area > Oncology > Thyroid Cancer (0.72)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

VideoCoCa: Video-Text Modeling with Zero-Shot Transfer from Contrastive Captioners

Yan, Shen, Zhu, Tao, Wang, Zirui, Cao, Yuan, Zhang, Mi, Ghosh, Soham, Wu, Yonghui, Yu, Jiahui

arXiv.org Artificial Intelligence, Mar-15-2023

Given a well-pretrained image-text foundation model, it is natural to question whether any heavy video-specific adaptor or many video-specific data is needed when transferring to video-text modelling. In this paper, we explore an efficient approach to establish a foundational video-text model for tasks including open-vocabulary video classification, text-to-video retrieval, video captioning and video question-answering. We present VideoCoCa, a minimalist approach that extends the image-text contrastive captioners (CoCa) [68] to video-text tasks. The design principle of VideoCoCa is to maximally reuse a pretrained image-text contrastive captioner (CoCa) model and adapt it to video-text tasks with minimal extra training. While previous works adapt image-text models with various cross-frame fusion modules, we find that the generative attentional pooling and contrastive attentional pooling layers in CoCa are instantly adaptable to flattened frame embeddings, yielding state-of-the-art results on zero-shot video classification and zero-shot text-to-video retrieval. Furthermore, we explore lightweight finetuning on top of VideoCoCa, and achieve strong results on video question-answering and video captioning.

artificial intelligence, machine learning, natural language, (19 more...)

2212.04979

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.86)
Information Technology > Artificial Intelligence > Vision > Video Understanding (0.56)

Contextualized Medication Information Extraction Using Transformer-based Deep Learning Architectures

Chen, Aokun, Yu, Zehao, Yang, Xi, Guo, Yi, Bian, Jiang, Wu, Yonghui

arXiv.org Artificial Intelligence, Mar-14-2023

Objective: To develop a natural language processing (NLP) system to extract medications and contextual information that help understand drug changes. This project is part of the 2022 n2c2 challenge. Materials and methods: We developed NLP systems for medication mention extraction, event classification (indicating whether medication changes are discussed), and context classification, which classifies the context of medication changes into 5 orthogonal dimensions related to drug changes. We explored 6 state-of-the-art pretrained transformer models for the three subtasks, including GatorTron, a large language model pretrained using >90 billion words of text (including >80 billion words from >290 million clinical notes identified at the University of Florida Health). We evaluated our NLP systems using annotated data and evaluation scripts provided by the 2022 n2c2 organizers. Results: Our GatorTron models achieved the best F1-scores of 0.9828 for medication extraction (ranked 3rd), 0.9379 for event classification (ranked 2nd), and the best micro-average accuracy of 0.9126 for context classification. GatorTron outperformed existing transformer models pretrained using smaller general English text and clinical text corpora, indicating the advantage of large language models. Conclusion: This study demonstrated the advantage of using large transformer models for contextual medication information extraction from clinical narratives.

artificial intelligence, machine learning, natural language, (18 more...)

doi: 10.1016/j.jbi.2023.104370

2303.08259

Country:

North America > United States > Florida > Alachua County > Gainesville (0.28)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.68)
Health & Medicine > Health Care Technology > Medical Record (0.51)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Clinical Concept and Relation Extraction Using Prompt-based Machine Reading Comprehension

Peng, Cheng, Yang, Xi, Yu, Zehao, Bian, Jiang, Hogan, William R., Wu, Yonghui

arXiv.org Artificial Intelligence, Mar-14-2023

Objective: To develop a natural language processing system that solves both clinical concept extraction and relation extraction in a unified prompt-based machine reading comprehension (MRC) architecture with good generalizability for cross-institution applications. Methods: We formulate both clinical concept extraction and relation extraction using a unified prompt-based MRC architecture and explore state-of-the-art transformer models. We compare our MRC models with existing deep learning models for concept extraction and end-to-end relation extraction using two benchmark datasets developed by the 2018 National NLP Clinical Challenges (n2c2) challenge (medications and adverse drug events) and the 2022 n2c2 challenge (relations of social determinants of health [SDoH]). We also evaluate the transfer learning ability of the proposed MRC models in a cross-institution setting. We perform error analyses and examine how different prompting strategies affect the performance of MRC models. Results and Conclusion: The proposed MRC models achieve state-of-the-art performance for clinical concept and relation extraction on the two benchmark datasets, outperforming previous non-MRC transformer models. GatorTron-MRC achieves the best strict and lenient F1-scores for concept extraction, outperforming previous deep learning models on the two datasets by 1% to 3% and 0.7% to 1.3%, respectively. For end-to-end relation extraction, GatorTron-MRC and BERT-MIMIC-MRC achieve the best F1-scores, outperforming previous deep learning models by 0.9% to 2.4% and 10% to 11%, respectively. For cross-institution evaluation, GatorTron-MRC outperforms traditional GatorTron by 6.4% and 16% for the two datasets, respectively. The proposed method is better at handling nested/overlapped concepts and extracting relations, and has good portability for cross-institution applications.

extraction, machine learning, natural language, (20 more...)

doi: 10.1093/jamia/ocad107

2303.08262

Country: North America > United States > Florida > Alachua County > Gainesville (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Education > Assessment & Standards > Student Performance (0.61)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

AnyTOD: A Programmable Task-Oriented Dialog System

Zhao, Jeffrey, Cao, Yuan, Gupta, Raghav, Lee, Harrison, Rastogi, Abhinav, Wang, Mingqiu, Soltau, Hagen, Shafran, Izhak, Wu, Yonghui

arXiv.org Artificial Intelligence, Feb-13-2023

We propose AnyTOD, an end-to-end, zero-shot task-oriented dialog (TOD) system capable of handling unseen tasks without task-specific training. We view TOD as a program executed by a language model (LM), where program logic and ontology are provided by a designer as a schema. To enable generalization to unseen schemas and programs without prior training, AnyTOD adopts a neuro-symbolic approach. A neural LM keeps track of events occurring during a conversation, and a symbolic program implementing the dialog policy is executed to recommend the next actions AnyTOD should take. This approach drastically reduces data annotation and model training requirements, addressing the enduring challenge of rapidly adapting a TOD system to unseen tasks and domains. We demonstrate state-of-the-art results on STAR, ABCD and SGD benchmarks. We also demonstrate strong zero-shot transfer ability in low-resource settings, such as zero-shot on MultiWOZ. In addition, we release STARv2, an updated version of the STAR dataset with richer annotations, for benchmarking zero-shot end-to-end TOD models.

artificial intelligence, machine learning, natural language, (20 more...)

2212.09939

Country: Asia (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.64)

GatorTron: A Large Clinical Language Model to Unlock Patient Information from Unstructured Electronic Health Records

Yang, Xi, Chen, Aokun, PourNejatian, Nima, Shin, Hoo Chang, Smith, Kaleb E, Parisien, Christopher, Compas, Colin, Martin, Cheryl, Flores, Mona G, Zhang, Ying, Magoc, Tanja, Harle, Christopher A, Lipori, Gloria, Mitchell, Duane A, Hogan, William R, Shenkman, Elizabeth A, Bian, Jiang, Wu, Yonghui

arXiv.org Artificial Intelligence, Dec-16-2022

There is an increasing interest in developing artificial intelligence (AI) systems to process and interpret electronic health records (EHRs). Natural language processing (NLP) powered by pretrained language models is the key technology for medical AI systems utilizing clinical narratives. However, there are few clinical language models, and the largest one trained in the clinical domain is comparatively small at 110 million parameters (compared with billions of parameters in the general domain). It is not clear how large clinical language models with billions of parameters can help medical AI systems utilize unstructured EHRs. In this study, we develop from scratch a large clinical language model, GatorTron, using >90 billion words of text (including >82 billion words of de-identified clinical text) and systematically evaluate it on 5 clinical NLP tasks including clinical concept extraction, medical relation extraction, semantic textual similarity, natural language inference (NLI), and medical question answering (MQA). We examine how (1) scaling up the number of parameters and (2) scaling up the size of the training data could benefit these NLP tasks. GatorTron models scale up the clinical language model from 110 million to 8.9 billion parameters and improve 5 clinical NLP tasks (e.g., 9.6% and 9.5% improvement in accuracy for NLI and MQA), which can be applied to medical AI systems to improve healthcare delivery. The GatorTron models are publicly available at: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/clara/models/gatortron_og.

artificial intelligence, machine learning, natural language, (19 more...)

2203.0354

Country: North America > United States > Florida > Alachua County > Gainesville (0.28)

Genre: Research Report > New Finding (0.88)

Industry: Health & Medicine > Health Care Technology > Medical Record (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)

BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition

Zhang, Yu, Park, Daniel S., Han, Wei, Qin, James, Gulati, Anmol, Shor, Joel, Jansen, Aren, Xu, Yuanzhong, Huang, Yanping, Wang, Shibo, Zhou, Zongwei, Li, Bo, Ma, Min, Chan, William, Yu, Jiahui, Wang, Yongqiang, Cao, Liangliang, Sim, Khe Chai, Ramabhadran, Bhuvana, Sainath, Tara N., Beaufays, Françoise, Chen, Zhifeng, Le, Quoc V., Chiu, Chung-Cheng, Pang, Ruoming, Wu, Yonghui

arXiv.org Artificial Intelligence, Jul-21-2022

We summarize the results of a host of efforts using giant automatic speech recognition (ASR) models pre-trained using large, diverse unlabeled datasets containing approximately a million hours of audio. We find that the combination of pre-training, self-training and scaling up model size greatly increases data efficiency, even for extremely large tasks with tens of thousands of hours of labeled data. In particular, on an ASR task with 34k hours of labeled data, by fine-tuning an 8 billion parameter pre-trained Conformer model, we can match state-of-the-art (SoTA) performance with only 3% of the training data and significantly improve SoTA with the full training set. We also report on the universal benefits gained from using big pre-trained and self-trained models for a large set of downstream tasks that cover a wide range of speech domains and span multiple orders of magnitude of dataset sizes, including obtaining SoTA performance on many public benchmarks. In addition, we utilize the learned representation of pre-trained networks to achieve SoTA results on non-ASR tasks.

artificial intelligence, deep learning, machine learning, (16 more...)

doi: 10.1109/JSTSP.2022.3182537

2109.13226

Country: Europe (0.28)

Genre: Research Report > New Finding (0.93)

Industry:

Information Technology (0.67)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Description-Driven Task-Oriented Dialog Modeling

Zhao, Jeffrey, Gupta, Raghav, Cao, Yuan, Yu, Dian, Wang, Mingqiu, Lee, Harrison, Rastogi, Abhinav, Shafran, Izhak, Wu, Yonghui

arXiv.org Artificial Intelligence, Jan-21-2022

Task-oriented dialogue (TOD) systems are required to identify key information from conversations for the completion of given tasks. Such information is conventionally specified in terms of intents and slots contained in task-specific ontologies or schemata. Since these schemata are designed by system developers, the naming convention for slots and intents is not uniform across tasks, and may not convey their semantics effectively. This can lead to models memorizing arbitrary patterns in data, resulting in suboptimal performance and generalization. In this paper, we propose that schemata should be modified by replacing names or notations entirely with natural language descriptions. We show that a language description-driven system exhibits better understanding of task specifications, higher performance on state tracking, improved data efficiency, and effective zero-shot transfer to unseen tasks. Following this paradigm, we present a simple yet effective Description-Driven Dialog State Tracking (D3ST) model, which relies purely on schema descriptions and an "index-picking" mechanism. We demonstrate the superiority in quality, data efficiency and robustness of our approach as measured on the MultiWOZ (Budzianowski et al., 2018), SGD (Rastogi et al., 2020), and the recent SGD-X (Lee et al., 2021) benchmarks.

computational linguistic, machine learning, natural language, (18 more...)

2201.08904

Country: Europe > Italy (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.51)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.50)

Effective Sequence-to-Sequence Dialogue State Tracking

Zhao, Jeffrey, Mahdieh, Mahdis, Zhang, Ye, Cao, Yuan, Wu, Yonghui

arXiv.org Artificial Intelligence, Sep-8-2021

Sequence-to-sequence models have been applied to a wide variety of NLP tasks, but how to properly use them for dialogue state tracking has not been systematically investigated. In this paper, we study this problem from the perspectives of pre-training objectives as well as the formats of context representations. We demonstrate that the choice of pre-training objective makes a significant difference to the state tracking quality. In particular, we find that masked span prediction is more effective than auto-regressive language modeling. We also explore using Pegasus, a span prediction-based pre-training objective for text summarization, for the state tracking model. We find that pre-training for the seemingly distant summarization task works surprisingly well for dialogue state tracking. In addition, we find that while recurrent state context representation also works reasonably well, the model may have a hard time recovering from earlier mistakes. We conducted experiments on the MultiWOZ 2.1-2.4, WOZ 2.0, and DSTC2 datasets with consistent observations.

artificial intelligence, computational linguistics, natural language, (19 more...)

2108.1399

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
