
Collaborating Authors

 Mueller, David


Uncertainty Distillation: Teaching Language Models to Express Semantic Confidence

arXiv.org Artificial Intelligence

As large language models (LLMs) are increasingly used for factual question-answering, it becomes more important for LLMs to have the capability to communicate the likelihood that their answer is correct. For these verbalized expressions of uncertainty to be meaningful, they should reflect the error rates at the expressed level of confidence. However, when prompted to express confidence, the error rates of current LLMs are inconsistent with their communicated confidences, highlighting the need for uncertainty quantification methods. Many prior methods calculate lexical uncertainty, estimating a model's confidence in the specific string it generated. In some cases, however, it may be more useful to estimate semantic uncertainty, or the model's confidence in the answer regardless of how it is verbalized. We propose a simple procedure, uncertainty distillation, to teach an LLM to verbalize calibrated semantic confidences. Using held-out data to map initial uncertainty estimates to meaningful probabilities, we create examples annotated with verbalized probabilities for supervised fine-tuning. We demonstrate that our method yields verbalized confidences that correlate with observed error rates, both for a small fine-tuned language model and for larger instruction-tuned models, and we find that our semantic uncertainty correlates well with lexical uncertainty on short answers.
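As a hedged illustration of the calibration-mapping step described above, the sketch below bins held-out confidence scores by quantile, maps each bin to its empirical accuracy, and emits supervised fine-tuning targets with a verbalized probability. The binning scheme, the verbalization template, and the toy data are illustrative assumptions, not the authors' exact recipe.

# Minimal sketch of the calibration step, assuming we already have held-out
# (confidence, is_correct) pairs from some initial uncertainty estimator.
import numpy as np

def fit_confidence_bins(raw_conf, is_correct, n_bins=10):
    """Map raw confidence scores to empirical accuracy per quantile bin."""
    edges = np.quantile(raw_conf, np.linspace(0, 1, n_bins + 1))
    bin_ids = np.clip(np.searchsorted(edges, raw_conf, side="right") - 1, 0, n_bins - 1)
    bin_acc = np.array([
        is_correct[bin_ids == b].mean() if (bin_ids == b).any() else np.nan
        for b in range(n_bins)
    ])
    return edges, bin_acc

def verbalize(question, answer, raw_conf, edges, bin_acc):
    """Build a fine-tuning example whose target verbalizes a calibrated probability."""
    b = int(np.clip(np.searchsorted(edges, raw_conf, side="right") - 1, 0, len(bin_acc) - 1))
    p = bin_acc[b]
    return {"prompt": question, "target": f"{answer} (confidence: {round(p * 100)}%)"}

# Toy held-out data: accuracy tracks confidence, standing in for real estimates.
rng = np.random.default_rng(0)
conf = rng.uniform(size=1000)
correct = rng.uniform(size=1000) < conf
edges, bin_acc = fit_confidence_bins(conf, correct)
print(verbalize("Capital of France?", "Paris", 0.92, edges, bin_acc))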


Where does In-context Translation Happen in Large Language Models

arXiv.org Artificial Intelligence

Self-supervised large language models have demonstrated the ability to perform Machine Translation (MT) via in-context learning, but little is known about where the model performs the task with respect to prompt instructions and demonstration examples. In this work, we attempt to characterize the region where large language models transition from in-context learners to translation models. Prior work on in-context MT has focused on prompt-engineering, treating GPT models as black boxes by focusing on which examples to provide in-context (Moslem et al., 2023). Agrawal et al. (2022) apply similarity-based retrieval to select in-context examples, while Sia & Duh (2023) suggest a coherence-based approach. However, these works apply surface-level interventions, leaving the internal mechanism of MT in GPT models largely not understood.
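The question of where in the network the translation emerges invites a simple layer-wise probe. Below is a minimal sketch, assuming GPT-2 via Hugging Face transformers as a stand-in model, that decodes each layer's hidden state through the final LayerNorm and unembedding (a "logit lens") to see at which depth the target-language continuation first dominates. This is an illustrative probe under those assumptions, not necessarily the intervention used in the paper.

# Logit-lens probe: decode every intermediate layer's last hidden state.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

prompt = "English: cheese -> French: fromage\nEnglish: water -> French:"
ids = tok(prompt, return_tensors="pt")

with torch.no_grad():
    out = model(**ids, output_hidden_states=True)

# out.hidden_states holds (n_layers + 1) tensors of shape [1, seq_len, d_model].
for layer, h in enumerate(out.hidden_states):
    logits = model.lm_head(model.transformer.ln_f(h[:, -1]))  # decode last position
    top = tok.decode(logits.argmax(-1))
    print(f"layer {layer:2d}: top next token = {top!r}")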


Do Text-to-Text Multi-Task Learners Suffer from Task Conflict?

arXiv.org Artificial Intelligence

Traditional multi-task learning architectures train a single model across multiple tasks through a shared encoder followed by task-specific decoders. Learning these models often requires specialized training algorithms that address task conflict in the shared parameter updates, which otherwise can lead to negative transfer. A new type of multi-task learning within NLP homogenizes multi-task architectures as a shared encoder and language model decoder, which does surprisingly well across a range of diverse tasks. Does this new architecture suffer from task conflicts that require specialized training algorithms? We study how certain factors in the shift towards text-to-text models affect multi-task conflict and negative transfer, finding that both directional conflict and transfer are surprisingly constant across architectures.
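To make "directional conflict" concrete, here is a minimal sketch that measures conflict as the cosine similarity between two tasks' gradients on the shared parameters; a negative value indicates conflicting updates. The toy shared encoder, task heads, and data are illustrative stand-ins, and the paper's exact architectures and metric may differ.

# Measure gradient conflict between two tasks on shared parameters.
import torch
import torch.nn as nn

shared = nn.Linear(16, 16)                    # stand-in for a shared encoder
heads = [nn.Linear(16, 4), nn.Linear(16, 4)]  # task-specific decoders

def task_grad(task_id, x, y):
    """Flattened gradient of one task's loss w.r.t. the shared parameters."""
    shared.zero_grad()
    heads[task_id].zero_grad()
    loss = nn.functional.cross_entropy(heads[task_id](shared(x)), y)
    loss.backward()
    return torch.cat([p.grad.flatten() for p in shared.parameters()])

x = torch.randn(8, 16)
y0, y1 = torch.randint(0, 4, (8,)), torch.randint(0, 4, (8,))
g0, g1 = task_grad(0, x, y0), task_grad(1, x, y1)

cos = nn.functional.cosine_similarity(g0, g1, dim=0)
print(f"gradient cosine similarity: {cos.item():.3f}  (negative => conflict)")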


Ensemble Distillation for Structured Prediction: Calibrated, Accurate, Fast—Choose Three

arXiv.org Machine Learning

Modern neural networks do not always produce well-calibrated predictions, even when trained with a proper scoring function such as cross-entropy. In classification settings, simple methods such as isotonic regression or temperature scaling may be used in conjunction with a held-out dataset to calibrate model outputs. However, extending these methods to structured prediction is not always straightforward or effective; furthermore, a held-out calibration set may not always be available. In this paper, we study ensemble distillation as a general framework for producing well-calibrated structured prediction models while avoiding the prohibitive inference-time cost of ensembles. We validate this framework on two tasks: named-entity recognition and machine translation. We find that, across both tasks, ensemble distillation produces models which retain much of, and occasionally improve upon, the performance and calibration benefits of ensembles, while requiring only a single model at test time.
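As a hedged sketch of the core objective, the code below distills an ensemble's averaged predictive distribution into a single student by minimizing the KL divergence to the soft ensemble targets. Toy linear classifiers stand in for trained structured-prediction models, and sequence-level details (NER tagging, MT decoding) are omitted.

# Distill an ensemble's averaged distribution into a single student model.
import torch
import torch.nn as nn
import torch.nn.functional as F

ensemble = [nn.Linear(32, 10) for _ in range(5)]  # stand-ins for trained members
student = nn.Linear(32, 10)
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

for step in range(100):
    x = torch.randn(64, 32)
    with torch.no_grad():
        # Teacher: average the members' probabilities (not their logits).
        teacher_probs = torch.stack([m(x).softmax(-1) for m in ensemble]).mean(0)
    # Student: KL divergence to the soft ensemble distribution,
    # i.e. cross-entropy against soft targets up to a constant.
    loss = F.kl_div(student(x).log_softmax(-1), teacher_probs, reduction="batchmean")
    opt.zero_grad()
    loss.backward()
    opt.step()

print(f"final distillation loss: {loss.item():.4f}")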