AITopics | filler word

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Word Clouds as Common Voices: LLM-Assisted Visualization of Participant-Weighted Themes in Qualitative Interviews

Colonel, Joseph T., Lin, Baihan

arXiv.org Artificial Intelligence | Aug-12-2025

Word clouds are a common way to summarize qualitative interviews, yet traditional frequency-based methods often fail in conversational contexts: they surface filler words, ignore paraphrase, and fragment semantically related ideas. This limits their usefulness in early-stage analysis, when researchers need fast, interpretable overviews of what participants actually said. We introduce ThemeClouds, an open-source visualization tool that uses large language models (LLMs) to generate thematic, participant-weighted word clouds from dialogue transcripts. The system prompts an LLM to identify concept-level themes across a corpus and then counts how many unique participants mention each topic, yielding a visualization grounded in breadth of mention rather than raw term frequency. Researchers can customize prompts and visualization parameters, providing transparency and control. Using interviews from a user study comparing five recording-device configurations (31 participants; 155 transcripts, Whisper ASR), our approach surfaces more actionable device concerns than frequency clouds and topic-modeling baselines (e.g., LDA, BERTopic). We discuss design trade-offs for integrating LLM assistance into qualitative workflows, implications for interpretability and researcher agency, and opportunities for interactive analyses such as per-condition contrasts ("diff clouds").
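The difference between raw term frequency and the participant-weighted counting described in this abstract can be illustrated with a small sketch. This is not the ThemeClouds implementation: the function names, the participant IDs, and the themes below are all hypothetical, and the LLM step that assigns themes to transcripts is assumed to have already produced (participant, theme) pairs.

```python
from collections import Counter


def raw_frequency_counts(mentions):
    """Count every mention of a theme, repeats included."""
    return Counter(theme for _, theme in mentions)


def participant_weighted_counts(mentions):
    """Count how many distinct participants mention each theme."""
    unique_pairs = set(mentions)  # keep one (participant, theme) pair per person
    return Counter(theme for _, theme in unique_pairs)


# Hypothetical toy data: (participant_id, theme) pairs from an upstream labelling step.
mentions = [
    ("p01", "battery life"),
    ("p01", "battery life"),
    ("p01", "battery life"),
    ("p02", "microphone comfort"),
    ("p03", "microphone comfort"),
    ("p03", "battery life"),
]

raw = raw_frequency_counts(mentions)
weighted = participant_weighted_counts(mentions)
print(raw["battery life"], weighted["battery life"])
```

In this toy data one talkative participant raises "battery life" three times, so raw frequency ranks it well above "microphone comfort" (4 versus 2), while the participant-weighted counts put both themes at 2.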

artificial intelligence, large language model, natural language, (18 more...)

2508.07517

Genre:

Questionnaire & Opinion Survey (0.87)
Research Report > Experimental Study (0.68)

Industry: Health & Medicine > Therapeutic Area (0.94)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Probing Experts' Perspectives on AI-Assisted Public Speaking Training

Fourati, Nesrine, Barkar, Alisa, Dragée, Marion, Danthon-Lefebvre, Liv, Chollet, Mathieu

arXiv.org Artificial Intelligence | Jul-14-2025

Background: Public speaking is a vital professional skill, yet it remains a source of significant anxiety for many individuals. Traditional training relies heavily on expert coaching, but recent advances in AI have led to novel types of commercial automated public speaking feedback tools. However, most research has focused on prototypes rather than commercial applications, and little is known about how public speaking experts perceive these tools. Objectives: This study aims to evaluate expert opinions on the efficacy and design of commercial AI-based public speaking training tools and to propose guidelines for their improvement. Methods: The research involved 16 semi-structured interviews and 2 focus groups with public speaking experts. Participants discussed their views on current commercial tools, their potential integration into traditional coaching, and suggestions for enhancing these systems. Results and Conclusions: Experts acknowledged the value of AI tools in handling repetitive, technical aspects of training, allowing coaches to focus on higher-level skills. However, they found key issues in current tools, emphasising the need for personalised, understandable, carefully selected feedback and clear instructional design. Overall, they supported a hybrid model combining traditional coaching with AI-supported exercises.

artificial intelligence, natural language, trainee, (19 more...)

2507.0793

Country:

Europe (0.68)
North America > United States (0.67)

Genre:

Research Report (1.00)
Questionnaire & Opinion Survey (1.00)
Instructional Material (1.00)
Personal > Interview (0.66)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.46)
Education > Educational Technology > Educational Software > Computer Based Training (0.46)
Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (0.67)

WhisperD: Dementia Speech Recognition and Filler Word Detection with Whisper

Akinrintoyo, Emmanuel, Abdelhalim, Nadine, Salomons, Nicole

arXiv.org Artificial Intelligence | May-29-2025

Whisper fails to correctly transcribe dementia speech because persons with dementia (PwDs) often exhibit irregular speech patterns and disfluencies such as pauses, repetitions, and fragmented sentences. It was trained on standard speech and may have had little or no exposure to dementia-affected speech. However, correct transcription of dementia speech is vital for cost-effective diagnosis and the development of assistive technology. In this work, we fine-tune Whisper with the open-source dementia speech dataset (DementiaBank) and our in-house dataset to improve its word error rate (WER). The fine-tuning also includes filler words to ascertain the filler inclusion rate (FIR) and F1 score. The fine-tuned models significantly outperformed the off-the-shelf models. The medium-sized model achieved a WER of 0.24, outperforming previous work. Similarly, there was notable generalisability to unseen data and speech patterns.
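For readers unfamiliar with the WER figure quoted in this abstract, the sketch below computes word error rate in its standard form: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. It is a minimal illustration, not the authors' evaluation code. The function name and the example sentences are assumptions, and no text normalisation (casing, punctuation) is applied.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: the transcript drops a filler word and a repetition.
reference = "i um went to the the store"
hypothesis = "i went to the store"
wer = word_error_rate(reference, hypothesis)
print(round(wer, 3))
```

Here the hypothesis drops two of the seven reference words, so the WER is 2/7. A transcript that omits fillers and repetitions is penalised under WER whenever the reference keeps them.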

artificial intelligence, machine learning, natural language, (19 more...)

2505.21551

Genre: Research Report (0.51)

Industry: Health & Medicine > Therapeutic Area > Neurology > Dementia (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

LearnerVoice: A Dataset of Non-Native English Learners' Spontaneous Speech

Kim, Haechan, Myung, Junho, Kim, Seoyoung, Lee, Sungpah, Kang, Dongyeop, Kim, Juho

arXiv.org Artificial Intelligence | Jul-5-2024

Prevalent ungrammatical expressions and disfluencies in spontaneous speech from second language (L2) learners pose unique challenges to Automatic Speech Recognition (ASR) systems. However, few datasets are tailored to L2 learner speech. We publicly release LearnerVoice, a dataset consisting of 50.04 hours of audio and transcriptions of L2 learners' spontaneous speech. Our linguistic analysis reveals that transcriptions in our dataset contain L2S (L2 learner's Spontaneous speech) features, consisting of ungrammatical expressions and disfluencies (e.g., filler words, word repetitions, self-repairs, false starts), significantly more than native speech datasets. Fine-tuning whisper-small.en with LearnerVoice achieves a WER of 10.26%, 44.2% lower than vanilla whisper-small.en. Furthermore, our qualitative analysis indicates that 54.2% of errors from the vanilla model on LearnerVoice are attributable to L2S features, with 48.1% of them being reduced in the fine-tuned model.
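The abstract reports the fine-tuned WER (10.26%) and a 44.2% reduction, but not the vanilla model's WER. The sketch below back-computes the baseline under the assumption that 44.2% is a relative reduction. The abstract does not say whether the figure is relative or absolute, so the result is an inference rather than a reported number. The function name is hypothetical.

```python
def implied_baseline(reduced_value, relative_reduction):
    """Baseline b such that reduced_value = b * (1 - relative_reduction)."""
    if not 0.0 <= relative_reduction < 1.0:
        raise ValueError("relative_reduction must be in [0, 1)")
    return reduced_value / (1.0 - relative_reduction)


fine_tuned_wer = 10.26     # percent, from the abstract
relative_reduction = 0.442  # assumed to be relative, not absolute

baseline_wer = implied_baseline(fine_tuned_wer, relative_reduction)
print(f"implied vanilla WER: {baseline_wer:.1f}%")
```

Under that reading, the vanilla whisper-small.en WER on LearnerVoice would be roughly 18.4%.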

dataset, learnervoice, speech, (14 more...)

2407.0428

Country:

North America > United States > Minnesota (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Russia (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry: Education (0.68)

Technology: Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)

The Self 2.0: How AI-Enhanced Self-Clones Transform Self-Perception and Improve Presentation Skills

Zheng, Qingxiao, Huang, Yun

arXiv.org Artificial Intelligence | Oct-23-2023

This study explores the impact of AI-generated digital self-clones on improving online presentation skills. We carried out a mixed-design experiment involving 44 international students, comparing self-recorded videos (control) with self-clone videos (AI group) for English presentation practice. The AI videos utilized voice cloning, face swapping, lip-sync, and body-language simulation to refine participants' original presentations in terms of repetition, filler words, and pronunciation. Machine-rated scores indicated enhancements in speech performance for both groups. Though the groups didn't significantly differ, the AI group exhibited a heightened depth of reflection, self-compassion, and a meaningful transition from a corrective to an enhancive approach to self-critique. Within the AI group, congruence between self-perception and AI self-clones resulted in diminished speech anxiety and increased enjoyment. Our findings recommend the ethical employment of digital self-clones to enhance the emotional and cognitive facets of skill development.

participant, role model, video, (13 more...)

2310.15112

Country:

North America > United States > Illinois (0.04)
North America > United States > Florida > Hillsborough County > Tampa (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(5 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Research Report > Strength Medium (0.95)

Industry:

Information Technology > Security & Privacy (1.00)
Education > Educational Setting > Higher Education (0.68)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.67)
Health & Medicine > Therapeutic Area > Oncology (0.67)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

InterviewBot: Real-Time End-to-End Dialogue System to Interview Students for College Admission

Wang, Zihao, Keyes, Nathan, Crawford, Terry, Choi, Jinho D.

arXiv.org Artificial Intelligence | Sep-5-2023

We present the InterviewBot that dynamically integrates conversation history and customized topics into a coherent embedding space to conduct 10-minute hybrid-domain (open and closed) conversations with foreign students applying to U.S. colleges for assessing their academic and cultural readiness. To build a neural-based end-to-end dialogue model, 7,361 audio recordings of human-to-human interviews are automatically transcribed, where 440 are manually corrected for finetuning and evaluation. To overcome the input/output size limit of a transformer-based encoder-decoder model, two new methods are proposed, context attention and topic storing, allowing the model to make relevant and consistent interactions. Our final model is tested both statistically by comparing its responses to the interview data and dynamically by inviting professional interviewers and various students to interact with it in real-time, finding it highly satisfactory in fluency and context awareness.

interview, proceedings, utterance, (14 more...)

doi: 10.3390/info14080460

2303.15049

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
North America > Canada (0.04)
(23 more...)

Genre:

Questionnaire & Opinion Survey (1.00)
Personal > Interview (0.69)

Industry: Education > Educational Setting > Higher Education (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Careful Whisper -- leveraging advances in automatic speech recognition for robust and interpretable aphasia subtype classification

Wagner, Laurin, Zusag, Mario, Bloder, Theresa

arXiv.org Artificial Intelligence | Aug-2-2023

This paper presents a fully automated approach for identifying speech anomalies from voice recordings to aid in the assessment of speech impairments. By combining Connectionist Temporal Classification (CTC) and encoder-decoder-based automatic speech recognition models, we generate rich acoustic and clean transcripts. We then apply several natural language processing methods to extract features from these transcripts to produce prototypes of healthy speech. Basic distance measures from these prototypes serve as input features for standard machine learning classifiers, yielding human-level accuracy for the distinction between recordings of people with aphasia and a healthy control group. Furthermore, the most frequently occurring aphasia types can be distinguished with 90% accuracy. The pipeline is directly applicable to other diseases and languages, showing promise for robustly extracting diagnostic speech biomarkers.
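The prototype-and-distance idea in this abstract can be sketched in a few lines. The abstract does not specify how prototypes or distances are defined, so everything below is an assumption for illustration: the prototype is taken to be the mean feature vector of healthy speakers, the distance features are per-feature absolute deviations plus one Euclidean distance, and the two numeric features are made up.

```python
import math


def prototype(vectors):
    """Mean feature vector of a group of recordings (assumed prototype definition)."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def distance_features(vector, proto):
    """Per-feature absolute deviations from the prototype, plus Euclidean distance."""
    deviations = [abs(value - center) for value, center in zip(vector, proto)]
    euclidean = math.sqrt(sum(d * d for d in deviations))
    return deviations + [euclidean]


# Hypothetical transcript features for two healthy control recordings.
healthy = [[2.0, 1.0], [4.0, 3.0]]
healthy_prototype = prototype(healthy)

# Hypothetical features for a new recording to be assessed.
features = distance_features([0.0, 6.0], healthy_prototype)
print(healthy_prototype, features)
```

The resulting distance vector, rather than the raw features, would then be passed to a standard classifier. Inspecting the per-feature deviations shows which feature departs most from the healthy prototype, which is one way such a pipeline can stay interpretable.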

machine learning, natural language, transcript, (18 more...)

2308.01327

Country:

Europe > Russia (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Europe > Austria (0.04)
Asia > Russia (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Transcription free filler word detection with Neural semi-CRFs

Zhu, Ge, Yan, Yujia, Caceres, Juan-Pablo, Duan, Zhiyao

arXiv.org Artificial Intelligence | Mar-11-2023

Non-linguistic filler words, such as "uh" or "um", are prevalent in spontaneous speech and serve as indicators for expressing hesitation or uncertainty. Previous works for detecting certain non-linguistic filler words are highly dependent on transcriptions from a well-established commercial automatic speech recognition (ASR) system. However, certain ASR systems are not universally accessible from many aspects, e.g., budget, target languages, and computational power. In this work, we investigate a filler word detection system that does not depend on ASR systems. We show that, by using the structured state space sequence model (S4) and neural semi-Markov conditional random fields (semi-CRFs), we achieve an absolute F1 improvement of 6.4% (segment level) and 3.1% (event level) on the PodcastFillers dataset. We also conduct a qualitative analysis on the detected results to analyze the limitations of our proposed system.
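As a reminder of what the F1 figures in this abstract measure, the sketch below computes F1 over binary per-segment labels (1 = filler, 0 = not filler). This is a simplified illustration only. The exact segment-level and event-level matching rules used on PodcastFillers are not given in the abstract, and the function name and label sequences are hypothetical.

```python
def f1_score(reference, predicted):
    """F1 for binary labels: harmonic mean of precision and recall."""
    if len(reference) != len(predicted):
        raise ValueError("label sequences must have the same length")

    true_pos = sum(1 for r, p in zip(reference, predicted) if r and p)
    false_pos = sum(1 for r, p in zip(reference, predicted) if not r and p)
    false_neg = sum(1 for r, p in zip(reference, predicted) if r and not p)

    if true_pos == 0:
        return 0.0  # no correct detections, so precision or recall is zero
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for eight consecutive audio segments.
reference = [0, 1, 1, 0, 0, 1, 0, 0]
predicted = [0, 1, 0, 0, 1, 1, 0, 0]
score = f1_score(reference, predicted)
print(round(score, 3))
```

With two correct detections, one false alarm, and one miss, precision and recall are both 2/3, so F1 is also 2/3. An "absolute" improvement of 6.4% means the score itself rises by 0.064, not by 6.4% of its previous value.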

detection, machine learning, natural language, (18 more...)

2303.06475

Country: North America > United States > New York (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

How Descript's generative AI makes video editing as easy as updating text

#artificialintelligence | Nov-15-2022, 22:06:57 GMT

A podcaster steps up to a mic to do a review of a new chicken nugget brand. As he begins talking and recording himself on his laptop, real-time speech-to-text transcribes his comments: "So these nuggets are, um, made from chicken, but they're made to um, um, um, um, emulate the taste of, like, like, non chicken nuggets." That doesn't sound very professional; on his screen, he strikes through those filler words -- and while he's at it, boosts the podcast's sound quality before publishing it for his audience. This is one use case for audio-video editing tool Descript, which today announced a significant product update and a $50 million series C round led by the OpenAI Startup Fund. "The whole concept of Descript -- editing video like a doc -- is only possible because of AI [artificial intelligence]," said Jay LeBoeuf, Descript's head of business and corporate development.

chandrasekaran, descript, video, (12 more...)

Industry: Banking & Finance > Capital Markets (0.55)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Chorus.ai Launches Outcome-Based Analytics in CI

#artificialintelligence | Sep-2-2020, 18:37:52 GMT

SAN FRANCISCO, Calif., Sept. 2, 2020 -- Chorus.ai, a Conversation Intelligence Platform for high growth Revenue teams, announced new advanced analytics capabilities in the platform that provide Revenue leaders with unparalleled insights into the most effective indication of revenue momentum: the health of their relationships with customers and what is being said in those interactions. The updated capabilities include a comprehensive collection of reports built on advanced AI that measures different aspects of customer interactions such as rep activity levels, conversation skills, sales skills, deal intelligence, and market intelligence. It also exclusively provides the ability to deeply drill down to listen to the specific moments behind the data. These advancements come after Chorus' $45m Series C, furthering the vision for a connected CI and the company's mission to help the enterprise bring their best to every interaction and the voice of the customer back to every decision. A connected CI platform weaves into an organization's systems and workflows to provide powerful data and insights both in the platform and directly to other applications where Sales & Customer Success reps and leaders already work, such as their CRM.

artificial intelligence, chorus, customer relationship management, (12 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.57)
North America > Canada > Ontario > Toronto (0.05)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.05)

Technology:

Information Technology > Artificial Intelligence (0.54)
Information Technology > Enterprise Applications > Customer Relationship Management (0.31)
