Goto

Collaborating Authors

 Au, Rhoda


A Natural Language Processing Approach to Support Biomedical Data Harmonization: Leveraging Large Language Models

arXiv.org Artificial Intelligence

Biomedical research requires large, diverse samples to produce unbiased results. Automated methods for matching variables across datasets can accelerate this process. Research in this area has been limited, focusing primarily on lexical matching and ontology-based semantic matching. We aimed to develop new methods, leveraging large language models (LLMs) and ensemble learning, to automate variable matching. Methods: We used data from two GERAS cohort studies (Europe and Japan) to develop variable-matching methods. We first manually created a dataset by matching 352 EU variables with 1322 candidate JP variables, where matched variable pairs were positive instances and unmatched pairs were negative instances. Using this dataset, we developed and evaluated two types of natural language processing (NLP) methods, which matched variables based on variable labels and definitions from data dictionaries: (1) LLM-based and (2) fuzzy matching. We then developed an ensemble-learning method, using a Random Forest (RF) model, to integrate the individual NLP methods. The RF was trained and evaluated over 50 trials. Each trial used a random 4:1 split into training and test sets, with the model's hyperparameters optimized through cross-validation on the training set. For each EU variable, the 1322 candidate JP variables were ranked by NLP-derived similarity scores or the RF's probability scores, denoting their likelihood of matching the EU variable. Ranking performance was measured by the top-n hit ratio (HR-n) and mean reciprocal rank (MRR). Results: E5 performed best among the individual methods, achieving an HR-30 of 0.90 and an MRR of 0.70. RF outperformed E5 on all metrics over the 50 trials (P < 0.001), achieving an average HR-30 of 0.98 and MRR of 0.73. LLM-derived features contributed most to the RF's performance. One major cause of errors in automatic variable matching was ambiguous variable definitions within data dictionaries.
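As a rough illustration of the ranking setup described in this abstract (not the authors' code), the sketch below ranks candidate JP variables for each EU variable by a simple fuzzy label-similarity score from rapidfuzz, in place of which an embedding model such as E5 could supply the LLM-based scores, and then computes the top-n hit ratio and MRR; all labels and helper names are hypothetical.

    # Minimal sketch (assumed setup, not the authors' implementation): rank candidate
    # JP variables for each EU variable by a similarity score, then evaluate the
    # ranking with top-n hit ratio (HR-n) and mean reciprocal rank (MRR).
    from rapidfuzz import fuzz

    def rank_candidates(eu_label: str, jp_labels: list[str]) -> list[int]:
        """Return candidate indices sorted by descending similarity to eu_label."""
        scores = [fuzz.token_sort_ratio(eu_label, jp) for jp in jp_labels]
        return sorted(range(len(jp_labels)), key=lambda i: scores[i], reverse=True)

    def hit_ratio_and_mrr(rankings: list[list[int]], gold: list[int], n: int = 30):
        """rankings[k] is the ranked candidate list for EU variable k;
        gold[k] is the index of its true JP match."""
        hits, rr = 0, 0.0
        for ranked, true_idx in zip(rankings, gold):
            pos = ranked.index(true_idx) + 1   # 1-based rank of the true match
            hits += pos <= n                   # counts toward HR-n
            rr += 1.0 / pos                    # reciprocal rank
        return hits / len(gold), rr / len(gold)

    # Toy usage with hypothetical variable labels
    eu_vars = ["age at baseline", "systolic blood pressure"]
    jp_vars = ["blood pressure systolic", "baseline age", "height cm"]
    rankings = [rank_candidates(v, jp_vars) for v in eu_vars]
    hr30, mrr = hit_ratio_and_mrr(rankings, gold=[1, 0], n=30)
    print(f"HR-30={hr30:.2f}  MRR={mrr:.2f}")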


Deep ensemble learning for Alzheimer's disease classification

arXiv.org Machine Learning

Ensemble learning uses multiple algorithms to obtain better predictive performance than any single one of its constituent algorithms could achieve. With the growing popularity of deep learning, researchers have started to ensemble deep models for various purposes. Few, if any, however, have used the deep learning approach as a means to ensemble algorithms. This paper presents a deep ensemble learning framework that aims to harness deep learning algorithms to integrate multisource data and tap the wisdom of experts. At the voting layer, a sparse autoencoder is trained for feature learning to reduce the correlation of attributes and ultimately diversify the base classifiers. At the stacking layer, a nonlinear feature-weighted method based on deep belief networks is proposed to rank the base classifiers, which may violate the conditional independence assumption. A neural network is used as the meta-classifier. At the optimizing layer, under-sampling and threshold-moving are used to cope with the cost-sensitivity problem. Optimized predictions are obtained from an ensemble of probabilistic predictions by similarity calculation. The proposed deep ensemble learning framework is applied to Alzheimer's disease classification. Experiments with the clinical dataset from the National Alzheimer's Coordinating Center demonstrate that the classification accuracy of the proposed framework is 4% better than that of six well-known ensemble approaches as well as the standard stacking algorithm. Adequate coverage of more accurate diagnostic services can be provided by utilizing the aggregated wisdom of physicians. This paper points out a new way to boost the primary care of Alzheimer's disease from the view of machine learning.
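The sketch below is only a generic illustration of the stacking idea this framework builds on: heterogeneous base classifiers whose probabilistic outputs feed a neural-network meta-classifier, trained here on synthetic imbalanced data. It omits the paper's sparse-autoencoder voting layer, deep-belief-network feature weighting, under-sampling, and threshold-moving; all model choices here are assumptions for illustration.

    # Generic stacking sketch (not the paper's framework): base classifiers whose
    # predicted probabilities are combined by a small neural-network meta-classifier.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier, StackingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.neural_network import MLPClassifier
    from sklearn.svm import SVC
    from sklearn.model_selection import train_test_split

    # Synthetic imbalanced data standing in for a clinical dataset
    X, y = make_classification(n_samples=600, n_features=30, weights=[0.8, 0.2],
                               random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

    stack = StackingClassifier(
        estimators=[("rf", RandomForestClassifier(random_state=0)),
                    ("svm", SVC(probability=True, random_state=0)),
                    ("lr", LogisticRegression(max_iter=1000))],
        final_estimator=MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                                      random_state=0),  # neural-net meta-classifier
        stack_method="predict_proba",
        cv=5)
    stack.fit(X_tr, y_tr)
    print("test accuracy:", stack.score(X_te, y_te))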


Interpretable Machine Learning Models for the Digital Clock Drawing Test

arXiv.org Machine Learning

The Clock Drawing Test (CDT) is a rapid, inexpensive, and popular neuropsychological screening tool for cognitive conditions. The Digital Clock Drawing Test (dCDT) uses novel software to analyze data from a digitizing ballpoint pen that reports its position with considerable spatial and temporal precision, making possible the analysis of both the drawing process and the final product. We developed methodology to analyze pen-stroke data from these drawings and computed a large collection of features, which were then analyzed with a variety of machine learning techniques. The resulting scoring systems were designed to be more accurate than the systems currently used by clinicians, yet just as interpretable and easy to use. The systems also allow us to quantify the tradeoff between accuracy and interpretability. We created automated versions of the CDT scoring systems currently used by clinicians, allowing us to benchmark our models; this comparison showed that our machine learning models substantially outperformed the existing scoring systems.
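As a loose stand-in for the interpretable scoring systems described above (the paper builds custom scoring models, not a plain decision tree), the sketch below fits a shallow decision tree on synthetic, hypothetical dCDT-style stroke features and prints its human-readable rules; every feature name, threshold, and label here is illustrative only, not drawn from the study's data.

    # Hedged sketch: a shallow, readable decision tree on hypothetical dCDT-style
    # features, standing in for an interpretable scoring system.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier, export_text

    rng = np.random.default_rng(0)
    n = 400
    X = np.column_stack([
        rng.normal(60, 20, n),    # total_drawing_time_sec (hypothetical)
        rng.integers(8, 14, n),   # n_digits_drawn (hypothetical)
        rng.normal(0.8, 0.2, n),  # clock_face_circularity (hypothetical)
    ])
    feature_names = ["total_drawing_time_sec", "n_digits_drawn", "clock_face_circularity"]
    # Synthetic label purely for demonstration
    y = ((X[:, 0] > 75) & (X[:, 1] < 12)).astype(int)

    tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
    print(export_text(tree, feature_names=feature_names))  # human-readable rules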