Carton, Samuel
Retrieving Versus Understanding Extractive Evidence in Few-Shot Learning
Elbakian, Karl, Carton, Samuel
A key aspect of alignment is the proper use of within-document evidence to construct document-level decisions. We analyze the relationship between the retrieval and interpretation of within-document evidence by large language models in a few-shot setting. Specifically, we measure the extent to which model prediction errors are associated with evidence retrieval errors, with respect to gold-standard human-annotated extractive evidence, across five datasets and two popular proprietary models. We perform two ablation studies to investigate when both label prediction and evidence retrieval errors can be attributed to qualities of the relevant evidence. We find a strong empirical relationship between model prediction error and evidence retrieval error, but that evidence retrieval error is mostly not associated with evidence interpretation error, a hopeful sign for downstream applications built on this mechanism.
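A minimal sketch of the kind of error-association measurement described above, assuming predictions and gold annotations are available as token sets; the record format and function names are illustrative, not the paper's code:

```python
# Sketch: does label-prediction error co-occur with evidence-retrieval error?
# The record fields ('pred_label', 'gold_label', 'pred_evidence', 'gold_evidence')
# are assumptions about how such data might be stored.
from statistics import mean

def token_f1(predicted_tokens, gold_tokens):
    """Token-overlap F1 between retrieved and gold evidence."""
    pred, gold = set(predicted_tokens), set(gold_tokens)
    if not pred or not gold:
        return 0.0
    precision = len(pred & gold) / len(pred)
    recall = len(pred & gold) / len(gold)
    return 0.0 if precision + recall == 0 else 2 * precision * recall / (precision + recall)

def error_association(records):
    """Compare retrieval quality for correctly vs. incorrectly labeled examples."""
    correct, incorrect = [], []
    for r in records:
        f1 = token_f1(r["pred_evidence"], r["gold_evidence"])
        (correct if r["pred_label"] == r["gold_label"] else incorrect).append(f1)
    # A large gap suggests retrieval quality tracks label correctness.
    return {"mean_f1_correct": mean(correct), "mean_f1_incorrect": mean(incorrect)}
```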
Toward Reliable Ad-hoc Scientific Information Extraction: A Case Study on Two Materials Datasets
Ghosh, Satanu, Brodnik, Neal R., Frey, Carolina, Holgate, Collin, Pollock, Tresa M., Daly, Samantha, Carton, Samuel
We explore the ability of GPT-4 to perform ad-hoc, schema-based information extraction from the scientific literature. Specifically, we assess whether it can, with a basic prompting approach, replicate two existing materials science datasets, given the manuscripts from which they were originally manually extracted. We employ materials scientists to perform a detailed manual error analysis of where the model struggles to faithfully extract the desired information, and draw on their insights to suggest research directions for addressing this broadly important task.
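For illustration, a basic schema-guided prompting approach might look like the sketch below; the schema fields, model string, and JSON post-processing are assumptions rather than the paper's exact protocol:

```python
# Hedged sketch of schema-guided extraction with the OpenAI Python client.
# The schema below is invented for illustration only.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SCHEMA = {
    "alloy_composition": "string",
    "processing_method": "string",
    "yield_strength_MPa": "number",
}

def extract_records(manuscript_text: str) -> list[dict]:
    prompt = (
        "Extract every record matching this JSON schema from the manuscript.\n"
        f"Schema: {json.dumps(SCHEMA)}\n"
        "Return a JSON list of records and nothing else.\n\n"
        f"Manuscript:\n{manuscript_text}"
    )
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    # Assumes the model complies with the "JSON only" instruction.
    return json.loads(response.choices[0].message.content)
```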
Learning to Ignore Adversarial Attacks
Zhang, Yiming, Zhou, Yangqiaoyu, Carton, Samuel, Tan, Chenhao
Despite their strong performance, current NLP models can be brittle against adversarial attacks. To enable effective learning against adversarial inputs, we introduce the use of rationale models that can explicitly learn to ignore attack tokens. We find that the rationale models can successfully ignore over 90% of attack tokens. This approach leads to consistent, sizable improvements (~10%) over baseline models in robustness on three datasets for both BERT and RoBERTa, and also reliably outperforms data augmentation with adversarial examples alone. In many cases, we find that our method is able to close the gap between model performance on a clean test set and on an attacked test set, and hence reduce the effect of adversarial attacks.
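The "fraction of attack tokens ignored" measurement could be computed roughly as in this sketch, assuming rationales are represented as binary token masks (an assumption about the setup, not the paper's code):

```python
# Sketch: what share of inserted attack tokens fall outside the rationale?
def attack_tokens_ignored(rationale_mask, attack_positions):
    """rationale_mask: 0/1 per token (1 = kept by the rationale model);
    attack_positions: indices of inserted adversarial tokens."""
    if not attack_positions:
        return 1.0
    ignored = sum(1 for i in attack_positions if rationale_mask[i] == 0)
    return ignored / len(attack_positions)

# Example: 4 of 5 attack tokens fall outside the rationale -> 0.8
print(attack_tokens_ignored([1, 0, 0, 1, 0, 0, 1, 0], [0, 2, 4, 5, 7]))
```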
Evaluating and Characterizing Human Rationales
Carton, Samuel, Rathore, Anirudh, Tan, Chenhao
Two main approaches for evaluating the quality of machine-generated rationales are: 1) using human rationales as a gold standard, and 2) using automated metrics based on how rationales affect model behavior. An open question, however, is how human rationales themselves fare on these automated metrics. Analyzing a variety of datasets and models, we find that human rationales do not necessarily perform well on them. To unpack this finding, we propose improved metrics that account for model-dependent baseline performance. We then propose two methods to further characterize rationale quality, one based on model retraining and one based on "fidelity curves" that reveal properties such as irrelevance and redundancy. Our work leads to actionable suggestions for evaluating and characterizing rationales.
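As an illustration of a model-behavior metric corrected by a model-dependent baseline, here is a hedged sketch of a normalized sufficiency-style score; the specific normalization is an assumption, not necessarily the metric proposed in the paper:

```python
# Sketch: sufficiency of a rationale, normalized against a null-input baseline.
def normalized_sufficiency(p_full, p_rationale_only, p_null):
    """All arguments are the model's probability for the predicted label:
    p_full           - on the full input,
    p_rationale_only - on the rationale tokens alone,
    p_null           - on an empty / fully-masked input (baseline)."""
    denom = max(p_full - p_null, 1e-8)
    return (p_rationale_only - p_null) / denom  # 1.0 = rationale fully sufficient

print(normalized_sufficiency(p_full=0.95, p_rationale_only=0.90, p_null=0.55))
```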
An Empirical Study on Explainable Prediction of Text Complexity: Preliminaries for Text Simplification
Garbacea, Cristina, Guo, Mengtian, Carton, Samuel, Mei, Qiaozhu
Text simplification is concerned with reducing the language complexity and improving the readability of professional content so that the text is accessible to readers of different ages and educational levels. Although it is a promising practice for improving the fairness and transparency of text information systems, the notion of text simplification has been used inconsistently in the existing literature, ranging from assessing the complexity of single words to automatically generating simplified documents. We show that the general problem of text simplification can be formally decomposed into a compact pipeline of tasks to ensure the transparency and explainability of the process. In this paper, we present a systematic analysis of the first two steps in this pipeline: 1) predicting the complexity of a given piece of text, and 2) identifying complex components within text deemed to be complex. We show that these two tasks can be solved separately, using either lexical approaches or state-of-the-art deep learning methods, or jointly through an end-to-end, explainable machine learning predictor. We propose formal evaluation metrics for both tasks, through which we compare the performance of the candidate approaches on multiple datasets from a diverse range of domains.
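A toy sketch of the two-stage pipeline (complexity prediction, then complex-component identification) using purely lexical signals; the word list and threshold are illustrative assumptions, not the paper's models:

```python
# Toy lexical sketch of the two-step pipeline: score complexity, then flag components.
COMMON_WORDS = {"the", "a", "is", "are", "and", "of", "to", "in", "it", "that"}

def sentence_complexity(sentence: str) -> float:
    """Step 1: score complexity as the share of long, uncommon words."""
    words = sentence.lower().split()
    complex_words = [w for w in words if len(w) > 7 and w not in COMMON_WORDS]
    return len(complex_words) / max(len(words), 1)

def identify_complex_components(sentence: str) -> list[str]:
    """Step 2: only run on sentences flagged as complex by step 1."""
    if sentence_complexity(sentence) < 0.2:  # assumed threshold
        return []
    return [w for w in sentence.split() if len(w) > 7 and w.lower() not in COMMON_WORDS]

print(identify_complex_components(
    "The pharmacokinetic interactions necessitate individualized titration."))
```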
Judge the Judges: A Large-Scale Evaluation Study of Neural Language Models for Online Review Generation
Garbacea, Cristina, Carton, Samuel, Yan, Shiyan, Mei, Qiaozhu
Recent advances in deep learning have resulted in a resurgence in the popularity of natural language generation (NLG). Many deep learning-based models, including recurrent neural networks and generative adversarial networks, have been proposed and applied to generating various types of text. Despite the rapid development of these methods, how to best evaluate the quality of such natural language generators remains a significant challenge. We conduct an in-depth empirical study of existing evaluation methods for natural language generation. We compare human evaluators with a variety of automated evaluation procedures, including discriminative evaluators that measure how well generated text can be distinguished from human-written text, as well as text overlap metrics that measure how similar the generated text is to human-written references. We measure the extent to which these different evaluators agree on the ranking of a dozen state-of-the-art generators for online product reviews. We find that human evaluators do not correlate well with discriminative evaluators, raising the larger question of whether adversarial accuracy is the right objective for natural language generation. In general, distinguishing machine-generated text is a challenging task even for human evaluators, and their decisions tend to correlate better with text overlap metrics. We also find that diversity is an intriguing metric that is indicative of the assessments made by the different evaluators.
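Agreement between evaluators of this kind can be quantified by rank correlation over the same set of generators, roughly as sketched below; the scores are invented for illustration, and Spearman correlation is a standard choice rather than necessarily the paper's exact statistic:

```python
# Sketch: how well do two evaluators agree on the ranking of the same generators?
from scipy.stats import spearmanr

generators = ["gen_a", "gen_b", "gen_c", "gen_d"]
human_scores         = [4.1, 3.2, 2.8, 3.9]      # e.g., mean human quality ratings
discriminator_scores = [0.35, 0.60, 0.72, 0.41]  # e.g., rate at which a classifier is fooled

rho, p_value = spearmanr(human_scores, discriminator_scores)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")
```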
Extractive Adversarial Networks: High-Recall Explanations for Identifying Personal Attacks in Social Media Posts
Carton, Samuel, Mei, Qiaozhu, Resnick, Paul
We introduce an adversarial method for producing high-recall explanations of neural text classifier decisions. Building on an existing architecture for extractive explanations via hard attention, we add an adversarial layer which scans the residual of the attention for remaining predictive signal. Motivated by the important domain of detecting personal attacks in social media comments, we additionally demonstrate the importance of manually setting a semantically appropriate 'default' behavior for the model by explicitly manipulating its bias term. We develop a validation set of human-annotated personal attacks to evaluate the impact of these changes.
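A highly simplified sketch of the residual-scanning idea, with a soft mask standing in for hard attention; this illustrates the mechanism and is not the paper's implementation:

```python
# Sketch: a predictor reads the extracted rationale while an adversary reads
# the residual (unmasked) text; training would penalize signal in the residual.
import torch
import torch.nn as nn

class ExtractiveAdversarialSketch(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=64, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.extractor = nn.Linear(emb_dim, 1)            # scores each token for the mask
        self.predictor = nn.Linear(emb_dim, num_classes)  # reads the extracted rationale
        self.adversary = nn.Linear(emb_dim, num_classes)  # reads the residual

    def forward(self, token_ids):
        x = self.embed(token_ids)                 # (batch, seq, emb)
        mask = torch.sigmoid(self.extractor(x))   # soft stand-in for hard attention
        kept = (x * mask).mean(dim=1)
        residual = (x * (1 - mask)).mean(dim=1)
        return self.predictor(kept), self.adversary(residual)

# Training would minimize the predictor's loss on `kept` while driving the
# adversary's accuracy on `residual` toward chance, pushing predictive
# signal into the rationale and raising recall of the explanation.
```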