AITopics | Pellegrino, Nicholas

Pellegrino, Nicholas

Development and Validation of the Provider Documentation Summarization Quality Instrument for Large Language Models

Croxford, Emma, Gao, Yanjun, Pellegrino, Nicholas, Wong, Karen K., Wills, Graham, First, Elliot, Schnier, Miranda, Burton, Kyle, Ebby, Cris G., Gorskic, Jillian, Kalscheur, Matthew, Khalil, Samy, Pisani, Marie, Rubeor, Tyler, Stetson, Peter, Liao, Frank, Goswami, Cherodeep, Patterson, Brian, Afshar, Majid

arXiv.org Artificial Intelligence, Jan-15-2025

As Large Language Models (LLMs) are integrated into electronic health record (EHR) workflows, validated instruments are essential to evaluate their performance before implementation. Existing instruments for provider documentation quality are often unsuitable for the complexities of LLM-generated text and lack validation on real-world data. The Provider Documentation Summarization Quality Instrument (PDSQI-9) was developed to evaluate LLM-generated clinical summaries. Multi-document summaries were generated from real-world EHR data across multiple specialties using several LLMs (GPT-4o, Mixtral 8x7b, and Llama 3-8b). Validation included Pearson correlation for substantive validity, factor analysis and Cronbach's alpha for structural validity, inter-rater reliability (ICC and Krippendorff's alpha) for generalizability, a semi-Delphi process for content validity, and comparisons of high- versus low-quality summaries for discriminant validity. Seven physician raters evaluated 779 summaries and answered 8,329 questions, achieving over 80% power for inter-rater reliability. The PDSQI-9 demonstrated strong internal consistency (Cronbach's alpha = 0.879; 95% CI: 0.867-0.891) and high inter-rater reliability (ICC = 0.867; 95% CI: 0.867-0.868), supporting structural validity and generalizability. Factor analysis identified a 4-factor model explaining 58% of the variance, representing organization, clarity, accuracy, and utility. Substantive validity was supported by correlations between note length and scores for Succinct (rho = -0.200, p = 0.029) and Organized (rho = -0.190, p = 0.037). Discriminant validity distinguished high- from low-quality summaries (p < 0.001). The PDSQI-9 demonstrates robust construct validity, supporting its use in clinical practice to evaluate LLM-generated summaries and facilitate safer integration of LLMs into healthcare workflows.
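The abstract above reports Cronbach's alpha as its internal-consistency statistic. The following is a minimal sketch of how that statistic is commonly computed from a matrix of item scores. It uses the textbook formula and is not the authors' analysis code; the function name `cronbach_alpha` and the toy score matrices are assumptions made for illustration.

```python
import numpy as np


def cronbach_alpha(scores) -> float:
    """Textbook Cronbach's alpha for a (n_observations, n_items) score matrix.

    Hypothetical helper for illustration; not taken from the PDSQI-9 paper.
    """
    scores = np.asarray(scores, dtype=float)
    n_items = scores.shape[1]
    # Sample variance of each item (column) across observations.
    item_variances = scores.var(axis=0, ddof=1)
    # Sample variance of the per-observation total score.
    total_variance = scores.sum(axis=1).var(ddof=1)
    return (n_items / (n_items - 1)) * (1.0 - item_variances.sum() / total_variance)


# Toy data (assumed): three items that move together perfectly.
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
# Toy data (assumed): two items that do not track each other.
inconsistent = [[1, 3], [2, 1], [3, 2]]

alpha_consistent = cronbach_alpha(consistent)
alpha_inconsistent = cronbach_alpha(inconsistent)
```

Items that rise and fall together push alpha toward 1, while items that do not covary pull it down, which is why the statistic is read as a measure of internal consistency.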

large language model, machine learning, natural language, (17 more...)

arXiv:2501.08977

Country: North America > United States > Wisconsin > Dane County > Madison (0.14)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
Health & Medicine > Health Care Technology > Medical Record (0.69)
Health & Medicine > Diagnostic Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Evaluation of Large Language Models for Summarization Tasks in the Medical Domain: A Narrative Review

Croxford, Emma, Gao, Yanjun, Pellegrino, Nicholas, Wong, Karen K., Wills, Graham, First, Elliot, Liao, Frank J., Goswami, Cherodeep, Patterson, Brian, Afshar, Majid

arXiv.org Artificial Intelligence, Sep-26-2024

Large Language Models have advanced clinical Natural Language Generation, creating opportunities to manage the volume of medical text. However, the high-stakes nature of medicine requires reliable evaluation, which remains a challenge. In this narrative review, we assess the current evaluation state for clinical summarization tasks and propose future directions to address the resource constraints of expert human evaluation.

computational linguistic, large language model, machine learning, (16 more...)

arXiv:2409.1817

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Massachusetts (0.28)
(3 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Diagnostic Medicine (0.46)
Health & Medicine > Health Care Providers & Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

BIOSCAN-5M: A Multimodal Dataset for Insect Biodiversity

Gharaee, Zahra, Lowe, Scott C., Gong, ZeMing, Arias, Pablo Millan, Pellegrino, Nicholas, Wang, Austin T., Haurum, Joakim Bruslund, Zarubiieva, Iuliia, Kari, Lila, Steinke, Dirk, Taylor, Graham W., Fieguth, Paul, Chang, Angel X.

arXiv.org Artificial Intelligence, Jun-24-2024

As part of an ongoing worldwide effort to comprehend and monitor insect biodiversity, this paper presents the BIOSCAN-5M Insect dataset to the machine learning community and establishes several benchmark tasks. BIOSCAN-5M is a comprehensive dataset containing multi-modal information for over 5 million insect specimens, and it significantly expands existing image-based biological datasets by including taxonomic labels, raw nucleotide barcode sequences, assigned barcode index numbers, and geographical information. We propose three benchmark experiments to demonstrate the impact of the multi-modal data types on classification and clustering accuracy. First, we pretrain a masked language model on the DNA barcode sequences of the BIOSCAN-5M dataset, and demonstrate the impact of using this large reference library on species- and genus-level classification performance. Second, we propose a zero-shot transfer learning task applied to images and DNA barcodes to cluster feature embeddings obtained from self-supervised learning, to investigate whether meaningful clusters can be derived from these representation embeddings. Third, we benchmark multi-modality by performing contrastive learning on DNA barcodes, image data, and taxonomic information. This yields a general shared embedding space enabling taxonomic classification using multiple types of information and modalities. The code repository of the BIOSCAN-5M Insect dataset is available at https://github.com/zahrag/BIOSCAN-5M.
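The third benchmark in the abstract above relies on contrastive learning to place several modalities in a shared embedding space. The abstract does not state which loss is used, so the sketch below shows one common choice, a symmetric InfoNCE (CLIP-style) loss between two modalities, purely as an illustration. The function name `symmetric_info_nce`, the temperature of 0.07, and the toy embeddings are all assumptions and are not taken from the BIOSCAN-5M code repository.

```python
import numpy as np


def symmetric_info_nce(image_emb, dna_emb, temperature=0.07) -> float:
    """Symmetric InfoNCE loss for a batch of paired embeddings.

    Row i of image_emb and row i of dna_emb are assumed to describe the same
    specimen. Hypothetical helper; not the authors' implementation.
    """
    image_emb = np.asarray(image_emb, dtype=float)
    dna_emb = np.asarray(dna_emb, dtype=float)
    # L2-normalise so the dot product is a cosine similarity.
    a = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    b = dna_emb / np.linalg.norm(dna_emb, axis=1, keepdims=True)
    logits = a @ b.T / temperature

    def diagonal_cross_entropy(matrix):
        # Row-wise log-softmax, with the matching pair on the diagonal.
        shifted = matrix - matrix.max(axis=1, keepdims=True)
        log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # Average the image-to-DNA and DNA-to-image directions.
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits.T))


# Toy embeddings (assumed): four specimens with one-hot features.
images = np.eye(4)
aligned_dna = np.eye(4)                    # each barcode matches its image
shuffled_dna = np.eye(4)[[1, 2, 3, 0]]     # each barcode matches the wrong image

loss_aligned = symmetric_info_nce(images, aligned_dna)
loss_shuffled = symmetric_info_nce(images, shuffled_dna)
```

The loss is small when each image embedding is closest to its own barcode embedding and large when the pairing is scrambled, which is the signal that pulls matching modalities together during training.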

large language model, machine learning, natural language, (20 more...)

arXiv:2406.12723

Country:

South America (1.00)
Europe (1.00)
Asia (1.00)
(2 more...)

Genre: Research Report > New Finding (0.92)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)

A Step Towards Worldwide Biodiversity Assessment: The BIOSCAN-1M Insect Dataset

Gharaee, Zahra, Gong, ZeMing, Pellegrino, Nicholas, Zarubiieva, Iuliia, Haurum, Joakim Bruslund, Lowe, Scott C., McKeown, Jaclyn T. A., Ho, Chris C. Y., McLeod, Joschka, Wei, Yi-Yun C, Agda, Jireh, Ratnasingham, Sujeevan, Steinke, Dirk, Chang, Angel X., Taylor, Graham W., Fieguth, Paul

arXiv.org Artificial Intelligence, Nov-13-2023

In an effort to catalog insect biodiversity, we propose a new large dataset of hand-labelled insect images, the BIOSCAN-1M Insect Dataset. Each record is taxonomically classified by an expert, and also has associated genetic information including raw nucleotide barcode sequences and assigned barcode index numbers, which are genetically based proxies for species classification. This paper presents a curated million-image dataset, primarily to train computer-vision models capable of providing image-based taxonomic assessment; however, the dataset also presents compelling characteristics, the study of which would be of interest to the broader machine learning community. Driven by the biological nature inherent to the dataset, a characteristic long-tailed class-imbalance distribution is exhibited. Furthermore, taxonomic labelling is a hierarchical classification scheme, presenting a highly fine-grained classification problem at lower levels. Beyond spurring interest in biodiversity research within the machine learning community, progress on creating an image-based taxonomic classifier will also further the ultimate goal of all BIOSCAN research: to lay the foundation for a comprehensive survey of global biodiversity. This paper introduces the dataset and explores the classification task through the implementation and analysis of a baseline classifier.
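The abstract above highlights a long-tailed class imbalance. One widely used way to compensate for such imbalance when training a classifier is to weight each class inversely to its frequency. The sketch below illustrates that general heuristic only; the abstract does not say whether the authors' baseline uses it. The function name `inverse_frequency_weights` and the toy label counts are assumptions for illustration.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Per-class weights n_samples / (n_classes * class_count).

    Hypothetical helper showing the common "balanced" weighting heuristic;
    not taken from the BIOSCAN-1M paper.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Toy labels (assumed): one frequent order and one rare order.
toy_labels = ["Diptera"] * 8 + ["Coleoptera"] * 2
weights = inverse_frequency_weights(toy_labels)
```

Rare classes receive weights above 1 and frequent classes receive weights below 1, so errors on the tail of the distribution contribute more to a weighted loss.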

artificial intelligence, machine learning, survey article, (16 more...)

arXiv:2307.10455

Country: North America > Canada (0.68)

Genre:

Research Report (1.00)
Overview (0.87)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.68)
