AITopics | Wang, Xindi


Wang, Xindi


Graph-tree Fusion Model with Bidirectional Information Propagation for Long Document Classification

Roy, Sudipta Singha, Wang, Xindi, Mercer, Robert E., Rudzicz, Frank

arXiv.org Artificial Intelligence, Oct-3-2024

Long document classification presents challenges in capturing both local and global dependencies, because long documents have extensive content and complex structure. Existing methods often struggle with token limits and fail to adequately model hierarchical relationships within documents. To address these constraints, we propose a novel model leveraging a graph-tree structure. Our approach integrates syntax trees for sentence encodings and document graphs for document encodings, which capture fine-grained syntactic relationships and broader document contexts, respectively. We use Tree Transformers to generate sentence encodings, while a graph attention network models inter- and intra-sentence dependencies. During training, we implement bidirectional information propagation from word-to-sentence-to-document and vice versa, which enriches the contextual representation. Our proposed method enables a comprehensive understanding of content at all hierarchical levels and effectively handles arbitrarily long contexts without token limit constraints. Experimental results demonstrate the effectiveness of our approach in all types of long document classification tasks.

classification, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2410.0293

Country:

Europe (1.00)
North America > United States > California (0.46)
North America > United States > Minnesota (0.28)
North America > Canada > Ontario (0.28)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)


Enhancing disease detection in radiology reports through fine-tuning lightweight LLM on weak labels

Wei, Yishu, Wang, Xindi, Ong, Hanley, Zhou, Yiliang, Flanders, Adam, Shih, George, Peng, Yifan

arXiv.org Artificial Intelligence, Sep-24-2024

Despite significant progress in applying large language models (LLMs) to the medical domain, several limitations still prevent them from practical applications. Among these are the constraints on model size and the lack of cohort-specific labeled datasets. In this work, we investigated the potential of improving a lightweight LLM, such as Llama 3.1-8B, through fine-tuning with datasets using synthetic labels. Two tasks are jointly trained by combining their respective instruction datasets. When the quality of the task-specific synthetic labels is relatively high (e.g., generated by GPT-4o), Llama 3.1-8B achieves satisfactory performance on the open-ended disease detection task, with a micro F1 score of 0.91. Conversely, when the quality of the task-relevant synthetic labels is relatively low (e.g., from the MIMIC-CXR dataset), fine-tuned Llama 3.1-8B is able to surpass its noisy teacher labels (micro F1 score of 0.67 vs. 0.63) when calibrated against curated labels, indicating the strong inherent underlying capability of the model. These findings demonstrate the potential of fine-tuning LLMs with synthetic labels, offering a promising direction for future research on LLM specialization in the medical domain.
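The abstract above reports micro F1 scores. As a minimal sketch of how that metric is commonly computed for multi-label outputs, the following pools true positives, false positives, and false negatives across all reports before computing precision and recall. The function name `micro_f1` and the toy disease labels are illustrative assumptions, not taken from the paper.

```python
def micro_f1(gold, pred):
    """Micro-averaged F1 over multi-label predictions.

    gold, pred: lists of sets of labels, one set per report.
    Counts are pooled over all reports before computing the score.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # labels predicted and correct
        fp += len(p - g)   # labels predicted but not in gold
        fn += len(g - p)   # gold labels that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): three reports with gold and predicted findings.
gold = [{"pneumonia", "edema"}, {"atelectasis"}, set()]
pred = [{"pneumonia"}, {"atelectasis", "edema"}, set()]
score = micro_f1(gold, pred)
```

Because counts are pooled, frequent labels dominate the score; a macro average would weight every label equally instead.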

large language model, llama 3, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2409.16563

Country: North America > United States (0.47)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area (0.98)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Multi-stage Retrieve and Re-rank Model for Automatic Medical Coding Recommendation

Wang, Xindi, Mercer, Robert E., Rudzicz, Frank

arXiv.org Artificial Intelligence, May-29-2024

The International Classification of Diseases (ICD) serves as a definitive medical classification system encompassing a wide range of diseases and conditions. The primary objective of ICD indexing is to allocate a subset of ICD codes to a medical record, which facilitates standardized documentation and management of various health conditions. Most existing approaches struggle to select the proper label subsets from an extremely large ICD collection with a heavily long-tailed label distribution. In this paper, we propose a multi-stage "retrieve and re-rank" framework as a novel solution to ICD indexing: a hybrid discrete retrieval method collects candidates, which are then re-ranked with contrastive learning so that the model makes more accurate predictions from a simplified label space. The retrieval model is a hybrid of auxiliary knowledge of the electronic health records (EHR) and a discrete retrieval method (BM25), which efficiently collects high-quality candidates. In the last stage, we propose a label co-occurrence guided contrastive re-ranking model, which re-ranks the candidate labels by pulling together the clinical notes with positive ICD codes. Experimental results show the proposed method achieves state-of-the-art performance on a number of measures on the MIMIC-III benchmark.
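The retrieve-then-re-rank idea described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: BM25 is written out in a common textbook form, the label codes `CODE_A` to `CODE_C` are placeholders, and the re-ranker is a toy lookup table standing in for the paper's contrastive model.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(docs_tokens)
    avgdl = sum(len(d) for d in docs_tokens) / n_docs
    df = Counter()
    for d in docs_tokens:
        df.update(set(d))  # document frequency of each term
    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        s = 0.0
        for t in query_tokens:
            if t not in tf:
                continue
            # Non-negative idf variant (adds 1 inside the logarithm).
            idf = math.log((n_docs - df[t] + 0.5) / (df[t] + 0.5) + 1.0)
            norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            s += idf * tf[t] * (k1 + 1) / norm
        scores.append(s)
    return scores


def retrieve_and_rerank(note, label_descriptions, rerank_score, k=2):
    """Stage 1: keep the top-k labels by BM25. Stage 2: re-order them."""
    codes = list(label_descriptions)
    docs = [label_descriptions[c].lower().split() for c in codes]
    scores = bm25_scores(note.lower().split(), docs)
    ranked = sorted(zip(codes, scores), key=lambda cs: cs[1], reverse=True)
    candidates = [c for c, s in ranked[:k] if s > 0]
    return sorted(candidates, key=lambda c: rerank_score(note, c), reverse=True)


# Hypothetical label space and a toy re-ranker (NOT the contrastive model).
labels = {
    "CODE_A": "essential hypertension",
    "CODE_B": "type 2 diabetes mellitus",
    "CODE_C": "pneumonia unspecified organism",
}
toy_scores = {"CODE_A": 0.9, "CODE_B": 0.4, "CODE_C": 0.1}
note = "patient with hypertension and type 2 diabetes"
result = retrieve_and_rerank(note, labels, lambda n, c: toy_scores[c])
```

The point of the two stages is that the second, more expensive scorer only ever sees the short candidate list, not the full label collection.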

icd code, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2405.19093

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Health Care Technology > Medical Record (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)


Auxiliary Knowledge-Induced Learning for Automatic Multi-Label Medical Document Classification

Wang, Xindi, Mercer, Robert E., Rudzicz, Frank

arXiv.org Artificial Intelligence, May-29-2024

The International Classification of Diseases (ICD) is an authoritative medical classification system of different diseases and conditions for clinical and management purposes. ICD indexing assigns a subset of ICD codes to a medical record. Since human coding is labour-intensive and error-prone, many studies employ machine learning to automate the coding process. ICD coding is a challenging task, as it needs to assign multiple codes to each medical document from an extremely large hierarchically organized collection. In this paper, we propose a novel approach for ICD indexing that adopts three ideas: (1) we use a multi-level deep dilated residual convolution encoder to aggregate the information from the clinical notes and learn document representations across different lengths of the texts; (2) we formalize the task of ICD classification with auxiliary knowledge of the medical records, which incorporates not only the clinical texts but also different clinical code terminologies and drug prescriptions for better inferring the ICD codes; and (3) we introduce a graph convolutional network to leverage the co-occurrence patterns among ICD codes, aiming to enhance the quality of label representations. Experimental results show the proposed method achieves state-of-the-art performance on a number of measures.
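Idea (3) above relies on co-occurrence patterns among ICD codes. A minimal sketch of one way such patterns could be turned into a graph is shown below: it counts how often two codes appear on the same record and normalizes each count by the frequency of the first code. The conditional-probability weighting and the names `cooccurrence_graph`, `A`, `B`, `C` are assumptions for illustration; the abstract does not specify how the graph is built.

```python
from itertools import combinations


def cooccurrence_graph(label_sets):
    """Build directed edge weights P(b | a) from per-record label sets.

    label_sets: iterable of sets of codes, one set per medical record.
    Returns a dict mapping (a, b) to the fraction of records containing
    a that also contain b.
    """
    freq = {}    # number of records containing each code
    pairs = {}   # number of records containing both codes of a pair
    for labels in label_sets:
        for code in labels:
            freq[code] = freq.get(code, 0) + 1
        for a, b in combinations(sorted(labels), 2):
            pairs[(a, b)] = pairs.get((a, b), 0) + 1
            pairs[(b, a)] = pairs.get((b, a), 0) + 1
    return {(a, b): n / freq[a] for (a, b), n in pairs.items()}


# Toy records with placeholder codes.
records = [{"A", "B"}, {"A", "B", "C"}, {"A"}]
graph = cooccurrence_graph(records)
```

A graph convolutional network would then propagate label embeddings along these weighted edges; that step is omitted here.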

icd code, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2405.19084

Country: North America > Canada > Ontario > Toronto (0.14)

Genre:

Research Report > New Finding (0.66)
Instructional Material > Online (0.60)
Instructional Material > Course Syllabus & Notes (0.60)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Health Care Technology > Medical Record (1.00)
Health & Medicine > Health Care Providers & Services (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)


Beyond the Limits: A Survey of Techniques to Extend the Context Length in Large Language Models

Wang, Xindi, Salmani, Mahsa, Omidi, Parsa, Ren, Xiangyu, Rezagholizadeh, Mehdi, Eshaghi, Armaghan

arXiv.org Artificial Intelligence, Feb-3-2024

Recently, large language models (LLMs) have shown remarkable capabilities including understanding context, engaging in logical reasoning, and generating responses. However, this is achieved at the expense of stringent computational and memory requirements, hindering their ability to effectively support long input sequences. This survey provides an inclusive review of the recent techniques and methods devised to extend the sequence length in LLMs, thereby enhancing their capacity for long-context understanding. In particular, we review and categorize a wide range of techniques including architectural modifications, such as modified positional encoding and altered attention mechanisms, which are designed to enhance the processing of longer sequences while avoiding a proportional increase in computational requirements. The diverse methodologies investigated in this study can be leveraged across different phases of LLMs, i.e., training, fine-tuning and inference. This enables LLMs to efficiently process extended sequences. The limitations of the current methodologies are discussed in the last section along with suggestions for future research directions, underscoring the importance of sequence length in the continued advancement of LLMs.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2402.02244

Country: North America > Canada (0.14)

Genre:

Overview (1.00)
Research Report > New Finding (0.34)

Industry: Education > Educational Setting (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Investigating the Learning Behaviour of In-context Learning: A Comparison with Supervised Learning

Wang, Xindi, Wang, Yufei, Xu, Can, Geng, Xiubo, Zhang, Bowen, Tao, Chongyang, Rudzicz, Frank, Mercer, Robert E., Jiang, Daxin

arXiv.org Artificial Intelligence, Aug-1-2023

Large language models (LLMs) have shown remarkable capacity for in-context learning (ICL), where learning a new task from just a few training examples is done without being explicitly pre-trained. However, despite the success of LLMs, there has been little understanding of how ICL learns the knowledge from the given prompts. In this paper, to make progress toward understanding the learning behaviour of ICL, we train the same LLMs with the same demonstration examples via ICL and supervised learning (SL), respectively, and investigate their performance under label perturbations (i.e., noisy labels and label imbalance) on a range of classification tasks. First, via extensive experiments, we find that gold labels have significant impacts on the downstream in-context performance, especially for large language models; however, imbalanced labels matter little to ICL across all model sizes.

However, despite the advantages of ICL, it is still unclear how ICL learns knowledge from the given prompts without updating its model parameters. Preliminary research [1, 11] compared ICL with simple machine learning models, such as logistic regression and shallow neural networks. In this paper, we take a further step and investigate the learning behaviour differences between ICL and supervised learning (SL). Specifically, we train three LLMs with the same training data via in-context learning and supervised learning separately and analyze their generated outputs. While SL is a well-established approach that uses labelled data to train models to make accurate predictions, ICL takes a different approach by leveraging the context of the text to learn from unlabeled data in order to improve the accuracy of the predictions. By comparing the performance of ICL and SL, we gain …

artificial intelligence, inductive learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2307.15411

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)


Exploring Text Specific and Blackbox Fairness Algorithms in Multimodal Clinical NLP

Chen, John, Berlot-Atwell, Ian, Hossain, Safwan, Wang, Xindi, Rudzicz, Frank

arXiv.org Artificial Intelligence, Nov-18-2020

Clinical machine learning is increasingly multimodal, with data collected in both structured tabular formats and unstructured forms such as free text. We propose a novel task of exploring fairness on a multimodal clinical dataset, adopting equalized odds for the downstream medical prediction tasks. To this end, we investigate a modality-agnostic fairness algorithm (equalized odds post-processing) and compare it to a text-specific fairness algorithm (debiased clinical word embeddings). Although debiased word embeddings do not explicitly address equalized odds of protected groups, we show that a text-specific approach to fairness may simultaneously achieve a good balance of performance and classical notions of fairness. We hope that our paper inspires future contributions at the critical intersection of clinical NLP and fairness. The full source code is available here: https://github.com/johntiger1/multimodal_fairness
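Equalized odds, the fairness criterion adopted above, asks that true positive rates and false positive rates match across protected groups. The sketch below measures how far a set of binary predictions is from that condition; it only measures the gaps and does not implement the post-processing algorithm the abstract mentions. The function names and the toy data are illustrative assumptions.

```python
def tpr_fpr(y_true, y_pred):
    """Return (true positive rate, false positive rate) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return tpr, fpr


def equalized_odds_gaps(y_true, y_pred, groups):
    """Largest between-group difference in TPR and in FPR.

    Both gaps are 0.0 when equalized odds holds exactly.
    """
    by_group = {}
    for t, p, g in zip(y_true, y_pred, groups):
        truths, preds = by_group.setdefault(g, ([], []))
        truths.append(t)
        preds.append(p)
    rates = [tpr_fpr(t, p) for t, p in by_group.values()]
    tprs = [r[0] for r in rates]
    fprs = [r[1] for r in rates]
    return max(tprs) - min(tprs), max(fprs) - min(fprs)


# Toy data (hypothetical): the classifier is perfect on group "a" only.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
tpr_gap, fpr_gap = equalized_odds_gaps(y_true, y_pred, groups)
```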

deep learning, fairness, vascular disease, (22 more...)

arXiv.org Artificial Intelligence

2011.09625

Country:

North America > United States (1.00)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)


L2P: An Algorithm for Estimating Heavy-tailed Outcomes

Wang, Xindi, Varol, Onur, Eliassi-Rad, Tina

arXiv.org Machine Learning, Aug-13-2019

Many real-world prediction tasks have outcome (a.k.a. target or response) variables that have characteristic heavy-tail distributions. Examples include copies of books sold, auction prices of art pieces, etc. By learning heavy-tailed distributions, "big and rare" instances (e.g., the best-sellers) will have accurate predictions. Most existing approaches are not dedicated to learning heavy-tailed distribution; thus, they heavily under-predict such instances. To tackle this problem, we introduce Learning to Place (L2P), which exploits the pairwise relationships between instances to learn from a proportionally higher number of rare instances. L2P consists of two stages. In Stage 1, L2P learns a pairwise preference classifier: is instance A > instance B? In Stage 2, L2P learns to place a new instance into an ordinal ranking of known instances. Based on its placement, the new instance is then assigned a value for its outcome variable. Experiments on real data show that L2P outperforms competing approaches in terms of accuracy and capability to reproduce heavy-tailed outcome distribution. In addition, L2P can provide an interpretable model with explainable outcomes by placing each predicted instance in context with its comparable neighbors.
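The two-stage structure described above can be illustrated with a small sketch. It assumes a pairwise comparator is already available (here a hand-written stand-in, not a learned classifier), places a new instance by counting how many known instances it is preferred over, and assigns the midpoint of its two neighbors' outcomes. The vote-counting rule, the midpoint rule, and the names `place` and `prefers` are assumptions for illustration; the abstract does not give these details.

```python
def place(new_x, known, prefers):
    """Place a new instance among known ones and assign it an outcome.

    known: list of (features, outcome) pairs.
    prefers(a, b): True if instance a is predicted to have a larger
        outcome than instance b (stand-in for the Stage 1 classifier).
    """
    ordered = sorted(known, key=lambda item: item[1])
    # Stage 2: the number of pairwise wins gives the rank of the new instance.
    wins = sum(1 for feats, _ in ordered if prefers(new_x, feats))
    if wins == 0:
        return ordered[0][1]       # below every known instance
    if wins == len(ordered):
        return ordered[-1][1]      # above every known instance
    # Otherwise take the midpoint of the two neighboring outcomes.
    return (ordered[wins - 1][1] + ordered[wins][1]) / 2


# Toy data (hypothetical): one feature, heavy-tailed outcomes.
known = [((1.0,), 10), ((2.0,), 100), ((3.0,), 10000)]
prefers = lambda a, b: a[0] > b[0]   # stand-in comparator, not learned
estimate = place((2.5,), known, prefers)
```

Because the estimate comes from the outcomes of neighboring known instances, a new instance placed near the top of the ranking inherits a large value rather than being pulled toward the mean.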

dataset, law enforcement, neural network, (23 more...)

arXiv.org Machine Learning

1908.04628

Country: North America > United States > California (0.14)

Genre: Research Report (1.00)

Industry: Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.46)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)
