Cruickshank, Iain
LLM Chain Ensembles for Scalable and Accurate Data Annotation
Farr, David, Manzonelli, Nico, Cruickshank, Iain, Starbird, Kate, West, Jevin
The ability of large language models (LLMs) to perform zero-shot classification makes them viable solutions for data annotation in rapidly evolving domains where quality labeled data is often scarce and costly to obtain. However, the large-scale deployment of LLMs can be prohibitively expensive. This paper introduces an LLM chain ensemble methodology that aligns multiple LLMs in a sequence, routing data subsets to subsequent models based on classification uncertainty. This approach leverages the strengths of individual LLMs within a broader system, allowing each model to handle data points where it exhibits the highest confidence, while forwarding more complex cases to potentially more robust models. Our results show that the chain ensemble method often exceeds the performance of the best individual model in the chain and achieves substantial cost savings, making LLM chain ensembles a practical and efficient solution for large-scale data annotation challenges.
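A minimal sketch of the routing idea described in the abstract, assuming each model exposes a (label, confidence) interface; the chain_ensemble helper, the Item dataclass, and the 0.8 threshold are illustrative assumptions, not the paper's implementation:

    from dataclasses import dataclass
    from typing import Callable, List, Optional, Tuple

    @dataclass
    class Item:
        text: str
        label: Optional[str] = None

    def chain_ensemble(
        items: List[Item],
        models: List[Callable[[str], Tuple[str, float]]],  # each returns (label, confidence)
        threshold: float = 0.8,  # illustrative confidence cutoff
    ) -> List[Item]:
        pending = items
        for position, model in enumerate(models):
            is_last = position == len(models) - 1
            forwarded = []
            for item in pending:
                label, confidence = model(item.text)
                if confidence >= threshold or is_last:
                    item.label = label  # confident enough (or final model): accept
                else:
                    forwarded.append(item)  # route to the next model in the chain
            pending = forwarded
        return items

Because cheaper models can sit earlier in the chain, most items never reach the more expensive models, which is where the cost savings come from.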
LLM Confidence Evaluation Measures in Zero-Shot CSS Classification
Farr, David, Cruickshank, Iain, Manzonelli, Nico, Clark, Nicholas, Starbird, Kate, West, Jevin
Assessing classification confidence is critical for leveraging large language models (LLMs) in automated labeling tasks, especially in the sensitive domains presented by Computational Social Science (CSS) tasks. In this paper, we make three key contributions: (1) we propose an uncertainty quantification (UQ) performance measure tailored for data annotation tasks, (2) we compare, for the first time, five different UQ strategies across three distinct LLMs and CSS data annotation tasks, and (3) we introduce a novel UQ aggregation strategy that effectively identifies low-confidence LLM annotations and disproportionately uncovers data incorrectly labeled by the LLMs. Our results demonstrate that our proposed UQ aggregation strategy improves upon existing methods and can be used to significantly improve human-in-the-loop data annotation processes.
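A toy sketch of the aggregation idea, assuming per-item uncertainty signals (e.g., predictive entropy) are already computed; the rank-averaging combination rule used here is an illustrative assumption, not the paper's exact strategy:

    import math
    from typing import Dict, List

    def entropy(probs: List[float]) -> float:
        # Predictive entropy: one common per-item uncertainty signal.
        return -sum(p * math.log(p) for p in probs if p > 0)

    def aggregate_uncertainty(signals: Dict[str, List[float]]) -> List[float]:
        # Average normalized ranks so signals on different scales weigh equally.
        n = len(next(iter(signals.values())))
        agg = [0.0] * n
        for values in signals.values():
            order = sorted(range(n), key=lambda i: values[i])
            for rank, idx in enumerate(order):
                agg[idx] += rank / max(n - 1, 1)
        return [total / len(signals) for total in agg]

    # Example: combine entropy with (1 - top-class probability) for three items.
    dists = [[0.9, 0.1], [0.5, 0.5], [0.7, 0.3]]
    scores = aggregate_uncertainty({
        "entropy": [entropy(d) for d in dists],
        "one_minus_conf": [1.0 - max(d) for d in dists],
    })
    # The highest-scoring items are routed to human annotators first.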
DocNet: Semantic Structure in Inductive Bias Detection Models
Zhu, Jessica, Cruickshank, Iain, Cukier, Michel
News will have biases so long as people have opinions. However, as social media becomes the primary entry point for news and partisan gaps widen, it is increasingly important for informed citizens to be able to identify bias. People can take action to avoid polarizing echo chambers if they know how the news they consume is biased. In this paper, we explore an often overlooked aspect of bias detection in documents: the semantic structure of news articles. We present DocNet, a novel, inductive, and low-resource document embedding and bias detection model that outperforms large language models. We also demonstrate that the semantic structures of news articles from opposing partisan sides, as represented in document-level graph embeddings, have significant similarities. These results can be used to advance bias detection in low-resource environments. Our code and data are made available at https://github.com/nlpresearchanon.
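A toy illustration of the "document as semantic structure" idea: represent a document by a word co-occurrence graph and summarize it with a few structural features. DocNet's actual inductive embedding model is more sophisticated; the window parameter and feature set here are assumptions made for illustration:

    import networkx as nx

    def cooccurrence_graph(tokens, window=2):
        # Link words that appear within `window` positions of each other.
        g = nx.Graph()
        g.add_nodes_from(tokens)
        for i, word in enumerate(tokens):
            for j in range(i + 1, min(i + 1 + window, len(tokens))):
                if tokens[j] != word:
                    g.add_edge(word, tokens[j])
        return g

    def structure_features(g):
        # A few coarse structural descriptors of the document graph;
        # comparing these across articles probes structural similarity.
        return [
            g.number_of_nodes(),
            g.number_of_edges(),
            nx.density(g),
            nx.average_clustering(g),
        ]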
MLTEing Models: Negotiating, Evaluating, and Documenting Model and System Qualities
Maffey, Katherine R., Dotterrer, Kyle, Niemann, Jennifer, Cruickshank, Iain, Lewis, Grace A., Kästner, Christian
Many organizations seek to ensure that machine learning (ML) and artificial intelligence (AI) systems work as intended in production but currently do not have a cohesive methodology in place to do so. To fill this gap, we propose MLTE (Machine Learning Test and Evaluation, colloquially referred to as "melt"), a framework and implementation to evaluate ML models and systems. The framework compiles state-of-the-art evaluation techniques into an organizational process for interdisciplinary teams, including model developers, software engineers, system owners, and other stakeholders. MLTE tooling supports this process by providing a domain-specific language that teams can use to express model requirements, an infrastructure to define, generate, and collect ML evaluation metrics, and the means to communicate results.
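A hypothetical sketch of what expressing negotiated model requirements as code can look like; this mimics the spirit of a requirements DSL but is NOT MLTE's actual API, and the Requirement and validate names are invented for illustration:

    from dataclasses import dataclass
    from typing import Callable, Dict, List

    @dataclass
    class Requirement:
        name: str
        check: Callable[[Dict[str, float]], bool]  # validates collected metrics

    # Invented examples of negotiated model/system qualities.
    requirements: List[Requirement] = [
        Requirement("accuracy", lambda m: m["accuracy"] >= 0.90),
        Requirement("p95_latency", lambda m: m["p95_latency_ms"] <= 200.0),
    ]

    def validate(metrics: Dict[str, float]) -> Dict[str, bool]:
        # Report pass/fail per requirement so results can be shared
        # with model developers, system owners, and other stakeholders.
        return {r.name: r.check(metrics) for r in requirements}

    print(validate({"accuracy": 0.93, "p95_latency_ms": 250.0}))
    # {'accuracy': True, 'p95_latency': False}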
Coordinating Narratives and the Capitol Riots on Parler
Ng, Lynnette Hui Xian, Cruickshank, Iain, Carley, Kathleen M.
Coordinated disinformation campaigns are used to influence social media users, potentially leading to offline violence. In this study, we introduce a general methodology to uncover coordinated messaging through analysis of user parleys on Parler. The proposed method constructs a user-to-user coordination network graph induced by a user-to-text graph and a text-to-text similarity graph. The text-to-text graph is constructed based on the textual similarity of Parler posts. We study three influential groups of users in the 6 January 2021 Capitol riots and detect networks of coordinated user clusters that all post similar textual content in support of different disinformation narratives related to the 2020 U.S. elections.
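A rough sketch of inducing a user-to-user coordination graph from post similarity, assuming TF-IDF cosine similarity as the text-to-text measure; that choice, the 0.9 near-duplicate cutoff, and the coordination_graph helper are illustrative assumptions rather than the paper's exact construction:

    import networkx as nx
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def coordination_graph(posts, threshold=0.9):
        # posts: list of (user_id, text) pairs. Two users are linked whenever
        # they authored near-duplicate posts (similarity above the threshold).
        users, texts = zip(*posts)
        sims = cosine_similarity(TfidfVectorizer().fit_transform(texts))
        g = nx.Graph()
        g.add_nodes_from(set(users))
        for i in range(len(texts)):
            for j in range(i + 1, len(texts)):
                if users[i] != users[j] and sims[i, j] >= threshold:
                    g.add_edge(users[i], users[j])
        return g

    # Connected components of the graph are candidate coordinated clusters:
    # list(nx.connected_components(coordination_graph(posts)))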