
Collaborating Authors

 Tomanek, Katrin


Have LLMs Made Active Learning Obsolete? Surveying the NLP Community

arXiv.org Artificial Intelligence

Supervised learning relies on annotated data, which is expensive to obtain. A longstanding strategy to reduce annotation costs is active learning, an iterative process in which a human annotates only data instances deemed informative by a model. Large language models (LLMs) have pushed the effectiveness of active learning, but have also improved methods such as few- or zero-shot learning and text synthesis, thereby introducing potential alternatives. This raises the question: has active learning become obsolete? To answer this fully, we must look beyond the literature to practical experiences. We conduct an online survey in the NLP community to collect previously intangible insights on the perceived relevance of data annotation, with a particular focus on active learning, including best practices, obstacles, and expected future developments. Our findings show that annotated data remains a key factor and that active learning continues to be relevant. While the majority of active learning users find it effective, a comparison with a community survey from over a decade ago reveals persistent challenges: setup complexity, estimating cost reduction, and tooling. We publish an anonymized version of the collected dataset.
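For readers unfamiliar with the loop the abstract refers to, the sketch below shows minimal pool-based active learning with uncertainty sampling. The synthetic dataset, seed-set size, and round budget are illustrative stand-ins, not the survey's setup; the held-out labels play the role of the human annotator.

```python
# A minimal pool-based active learning loop with uncertainty sampling.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
rng = np.random.default_rng(0)

labeled = [int(i) for i in rng.choice(len(X), size=20, replace=False)]  # seed set
pool = [i for i in range(len(X)) if i not in labeled]

model = LogisticRegression(max_iter=1000)
for _ in range(10):                                 # annotation rounds
    model.fit(X[labeled], y[labeled])               # retrain on labeled data
    probs = model.predict_proba(X[pool])
    margin = np.abs(probs[:, 1] - probs[:, 0])      # small margin = uncertain
    query = pool[int(np.argmin(margin))]            # most informative instance
    labeled.append(query)                           # "annotator" reveals its label
    pool.remove(query)
```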


Speech Recognition With LLMs Adapted to Disordered Speech Using Reinforcement Learning

arXiv.org Artificial Intelligence

We introduce a large language model (LLM) capable of processing speech inputs and show that tuning it further with reinforcement learning on human preferences (RLHF) enables it to adapt better to disordered speech than traditional fine-tuning. Our method replaces low-frequency text tokens in an LLM's vocabulary with audio tokens and enables the model to recognize speech by fine-tuning it on speech with transcripts. We then use RL with rewards based on syntactic and semantic accuracy measures to further generalize the LLM to disordered speech. While the resulting LLM does not outperform existing systems for speech recognition, we find that tuning with reinforcement learning using custom rewards leads to substantially better performance than supervised fine-tuning of the language model, specifically when adapting to speech in a different setting. This presents a compelling alternative tuning strategy for speech recognition using large language models.
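As an illustration of the kind of custom reward described above, here is a minimal sketch that mixes a syntactic term (1 - WER, via word-level edit distance) with a semantic term. The `semantic_sim` callable (e.g., cosine similarity of sentence embeddings) and the mixing weights are assumptions, since the paper's exact measures are not spelled out in the abstract.

```python
# Sketch of a custom RL reward for ASR: syntactic + semantic accuracy.
def word_edit_distance(ref: str, hyp: str) -> int:
    r, h = ref.split(), hyp.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i                                   # deletions
    for j in range(len(h) + 1):
        d[0][j] = j                                   # insertions
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            d[i][j] = min(d[i - 1][j] + 1,            # deletion
                          d[i][j - 1] + 1,            # insertion
                          d[i - 1][j - 1] + (r[i - 1] != h[j - 1]))  # substitution
    return d[len(r)][len(h)]

def reward(ref: str, hyp: str, semantic_sim, w_syn=0.5, w_sem=0.5) -> float:
    # semantic_sim is an assumed callable returning a score in [0, 1].
    syn = 1.0 - word_edit_distance(ref, hyp) / max(len(ref.split()), 1)
    return w_syn * syn + w_sem * semantic_sim(ref, hyp)
```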


Detecting Hallucination and Coverage Errors in Retrieval Augmented Generation for Controversial Topics

arXiv.org Artificial Intelligence

We explore a strategy to handle controversial topics in LLM-based chatbots based on Wikipedia's Neutral Point of View (NPOV) principle: acknowledge the absence of a single true answer and surface multiple perspectives. We frame this as retrieval augmented generation, where perspectives are retrieved from a knowledge base and the LLM is tasked with generating a fluent and faithful response from the given perspectives. As a starting point, we use a deterministic retrieval system and then focus on common LLM failure modes that arise during this approach to text generation, namely hallucination and coverage errors. We propose and evaluate three methods to detect such errors based on (1) word-overlap, (2) salience, and (3) LLM-based classifiers. Our results demonstrate that LLM-based classifiers, even when trained only on synthetic errors, achieve high error detection performance, with ROC AUC scores of 95.3% for hallucination and 90.5% for coverage error detection on unambiguous error cases. We show that when no training data is available, our other methods still yield good results on hallucination (84.0%) and coverage error (85.2%) detection.
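A rough sketch of the word-overlap heuristic (method 1) may help make the two error types concrete: a response containing many content words unsupported by the retrieved perspectives suggests hallucination, while a perspective whose content words are largely absent from the response suggests a coverage error. The stopword list and thresholds below are illustrative assumptions, not the paper's.

```python
# Word-overlap heuristic for hallucination and coverage error detection.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "is", "are", "that", "in"}

def content_words(text: str) -> set:
    return {w for w in text.lower().split() if w not in STOPWORDS}

def detect_errors(response: str, perspectives: list, tau_h=0.4, tau_c=0.5) -> dict:
    resp = content_words(response)
    support = set().union(*(content_words(p) for p in perspectives))
    halluc_rate = len(resp - support) / max(len(resp), 1)   # unsupported words
    coverage = [len(content_words(p) & resp) / max(len(content_words(p)), 1)
                for p in perspectives]                       # per-perspective recall
    return {"hallucination": halluc_rate > tau_h,
            "coverage_error": min(coverage, default=1.0) < tau_c}
```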


Parameter Efficient Tuning Allows Scalable Personalization of LLMs for Text Entry: A Case Study on Abbreviation Expansion

arXiv.org Artificial Intelligence

Abbreviation expansion is a strategy used to speed up communication by limiting the amount of typing and using a language model to suggest expansions. Here we look at personalizing a large language model's (LLM) suggestions based on prior conversations to enhance the relevance of predictions, particularly when the user data is small (~1000 samples). Specifically, we compare fine-tuning, prompt-tuning, and retrieval augmented generation of expanded text suggestions for abbreviated inputs. Our case study, with a deployed 8B-parameter LLM serving a real user living with ALS, together with experiments on movie character personalization, indicates that (1) customization may be necessary in some scenarios, and prompt-tuning generalizes well to them; (2) fine-tuning on in-domain data (with as few as 600 samples) still shows some gains; however, (3) retrieval augmented few-shot selection also outperforms fine-tuning; and (4) parameter-efficient tuning allows for efficient and scalable personalization. For prompt-tuning, we also find that initializing the learned "soft-prompts" to user-relevant concept tokens leads to higher accuracy than random initialization.
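The concept-token initialization finding can be illustrated with a short sketch: copy the frozen model's embedding rows for user-relevant tokens into the trainable soft prompt instead of starting from noise. This is a generic PyTorch illustration under stated assumptions (`embed` is the model's input embedding table, `concept_ids` are token ids of user-relevant words), not the paper's implementation.

```python
# Initialize a trainable soft prompt from embeddings of "concept tokens".
import torch

def init_soft_prompt(embed: torch.nn.Embedding, concept_ids, n_virtual=8):
    ids = torch.tensor(concept_ids[:n_virtual])
    init = embed.weight[ids].detach().clone()       # copy concept embeddings
    if len(ids) < n_virtual:                        # pad with small random rows
        extra = torch.randn(n_virtual - len(ids), embed.embedding_dim) * 0.02
        init = torch.cat([init, extra], dim=0)
    return torch.nn.Parameter(init)                 # the only trained weights

# In the forward pass the soft prompt is prepended to the input embeddings
# while all LLM weights stay frozen, e.g.:
# inputs = torch.cat([soft_prompt.expand(bsz, -1, -1), embed(input_ids)], dim=1)
```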


Using Large Language Models to Accelerate Communication for Users with Severe Motor Impairments

arXiv.org Artificial Intelligence

Finding ways to accelerate text input for individuals with profound motor impairments has been a long-standing area of research. Closing the speed gap for augmentative and alternative communication (AAC) devices such as eye-tracking keyboards is important for improving the quality of life of such individuals. Recent advances in neural networks for natural language open new opportunities for rethinking strategies and user interfaces for enhanced text entry for AAC users. In this paper, we present SpeakFaster, consisting of large language models (LLMs) and a co-designed user interface for text entry in a highly abbreviated form, which saves 57% more motor actions than traditional predictive keyboards in offline simulation. A pilot study with 19 non-AAC participants typing on a mobile device by hand demonstrated gains in motor savings in line with the offline simulation, while introducing relatively small effects on overall typing speed. Lab and field testing with two eye-gaze typing users with amyotrophic lateral sclerosis (ALS) demonstrated text-entry rates 29-60% faster than traditional baselines, due to significant savings in expensive keystrokes achieved through phrase and word predictions from context-aware LLMs. These findings provide a strong foundation for further exploration of substantially accelerated text communication for motor-impaired users and demonstrate a direction for applying LLMs to text-based user interfaces.
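To make the abbreviated text-entry idea concrete, the sketch below builds a few-shot prompt that expands word-initial-letter abbreviations given conversation context, similar in spirit to the context-aware predictions described above. The examples are invented and `generate` stands in for any LLM completion call; this is not SpeakFaster's actual prompt.

```python
# Few-shot prompting for word-initial-letter abbreviation expansion.
FEW_SHOT = """Context: How was your weekend?
Abbreviation: iwg -> it was great

Context: Are you hungry?
Abbreviation: iwlp -> i would like pizza

"""

def expand(context: str, abbreviation: str, generate) -> str:
    prompt = (FEW_SHOT
              + f"Context: {context}\n"
              + f"Abbreviation: {abbreviation} ->")
    return generate(prompt).strip()

# e.g. expand("Do you want coffee?", "yp", my_llm) might return "yes please"
```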


Towards Agile Text Classifiers for Everyone

arXiv.org Artificial Intelligence

Text-based safety classifiers are widely used for content moderation and, increasingly, to tune generative language model behavior - a topic of growing concern for the safety of digital assistants and chatbots. However, different policies require different classifiers, and safety policies themselves improve through iteration and adaptation. This paper introduces and evaluates methods for agile text classification, whereby classifiers are trained on small, targeted datasets that can be quickly developed for a particular policy. Experiments with seven datasets from three safety-related domains, comprising 15 annotation schemes, led to our key finding: prompt-tuning large language models, like PaLM 62B, with a labeled dataset of as few as 80 examples can achieve state-of-the-art performance. We argue that this enables a paradigm shift for text classification, especially for models supporting safer online discourse. Instead of collecting millions of examples in an attempt to create universal safety classifiers over months or years, classifiers could be tuned on small datasets, created by individuals or small organizations, tailored for specific use cases, and iterated on and adapted within the time-span of a day.
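For a sense of what such agile, small-data tuning can look like in practice, here is a hedged sketch using the Hugging Face peft library. The bert-base-uncased backbone, task type, and hyperparameters are placeholders; the paper itself prompt-tuned PaLM 62B, which is not tunable this way.

```python
# Sketch: prompt-tuning a classifier on a small labeled policy dataset.
from peft import PromptTuningConfig, TaskType, get_peft_model
from transformers import AutoModelForSequenceClassification

base = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)          # stand-in backbone
config = PromptTuningConfig(
    task_type=TaskType.SEQ_CLS,                 # sequence classification
    num_virtual_tokens=20)                      # size of the soft prompt
model = get_peft_model(base, config)
model.print_trainable_parameters()              # only the prompt is trained
# Train `model` on the ~80-example dataset with any standard training loop.
```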


An analysis of degenerating speech due to progressive dysarthria on ASR performance

arXiv.org Artificial Intelligence

Although personalized automatic speech recognition (ASR) models have recently been designed to recognize even severely impaired speech, model performance may degrade over time for persons with degenerating speech. The aims of this study were to (1) analyze how ASR performance changes over time in individuals with degrading speech, and (2) explore mitigation strategies to optimize recognition throughout disease progression. Speech was recorded by four individuals with degrading speech due to amyotrophic lateral sclerosis (ALS). Word error rates (WER) across recording sessions were computed for three ASR models: Unadapted Speaker Independent (U-SI), Adapted Speaker Independent (A-SI), and Adapted Speaker Dependent (A-SD, or personalized). The performance of all three models degraded significantly over time as speech became more impaired, but the performance of the A-SD model improved markedly when it was updated with recordings from the severe stages of speech progression. Recording additional utterances early in the disease, before speech had degraded significantly, did not improve the performance of A-SD models. Overall, our findings emphasize the importance of continuous recording (and model retraining) when providing personalized models for individuals with progressive speech impairments.
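Per-session WER tracking of the kind the study describes is straightforward to sketch with the jiwer package. The session transcripts below are invented placeholders; in the study, each session would be scored once per model (U-SI, A-SI, A-SD).

```python
# Compute WER per recording session from (reference, hypothesis) transcripts.
import jiwer

sessions = {
    "session_01": (["turn on the lights"], ["turn on the lights"]),
    "session_09": (["turn on the lights"], ["turn the light"]),
}

for name, (refs, hyps) in sorted(sessions.items()):
    print(f"{name}: WER = {jiwer.wer(refs, hyps):.2f}")
```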


Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

arXiv.org Machine Learning

Lingvo is a TensorFlow framework offering a complete solution for collaborative deep learning research, with a particular focus on sequence-to-sequence models. Lingvo models are composed of modular building blocks that are flexible and easily extensible, and experiment configurations are centralized and highly customizable. Distributed training and quantized inference are supported directly within the framework, and it includes implementations of a large number of utilities, helper functions, and recent research ideas. Lingvo has been used in collaboration by dozens of researchers in more than 20 papers over the last two years. This document outlines the underlying design of Lingvo and serves as an introduction to the various pieces of the framework, while also offering examples of advanced features that showcase its capabilities.
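The centralized, hierarchical configuration style the abstract highlights can be illustrated schematically. The dataclass sketch below mimics the pattern (nested parameter objects that an experiment overrides in one place) but is not Lingvo's actual API.

```python
# Schematic illustration of centralized, hierarchical experiment config.
from dataclasses import dataclass, field

@dataclass
class AttentionParams:
    num_heads: int = 8
    model_dim: int = 512

@dataclass
class EncoderParams:
    num_layers: int = 6
    attention: AttentionParams = field(default_factory=AttentionParams)

@dataclass
class ExperimentConfig:
    encoder: EncoderParams = field(default_factory=EncoderParams)
    learning_rate: float = 1e-4

cfg = ExperimentConfig()                 # one centralized config object
cfg.encoder.attention.num_heads = 16     # experiment-specific override
```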