- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
- Asia > China > Beijing > Beijing (0.04)
The power of sound in a virtual world
In the digital age, sound is proving to be the greatest connector of all, say Erik Vaveris, vice president of product management and CMO at Shure, and Brian Scholl, director of the Perception and Cognition Laboratory at Yale University. In an era where business, education, and even casual conversations occur via screens, sound has become a differentiating factor. We obsess over lighting, camera angles, and virtual backgrounds, but how we sound can be just as critical to credibility, trust, and connection. Both see audio as more than a technical layer: it's a human factor shaping how people perceive intelligence, trustworthiness, and authority in virtual settings.

"If you're willing to take a little bit of time with your audio setup, you can really get across the full power of your message and the full power of who you are to your peers, to your employees, your boss, your suppliers, and of course, your customers," says Vaveris.

Scholl's research shows that poor audio quality can make a speaker seem less persuasive, less hireable, and even less credible. "We know that [poor] sound doesn't reflect the people themselves, but we really just can't stop ourselves from having those impressions," says Scholl. "We all understand intuitively that if we're having difficulty being understood while we're talking, then that's bad. But we sort of think that as long as you can make out the words I'm saying, then that's probably all fine. And this research showed in a somewhat surprising way, to a surprising degree, that this is not so."

For organizations navigating hybrid work, training, and marketing, the stakes have become high. Vaveris points out that the pandemic was a watershed moment for audio technology. As classrooms, boardrooms, and conferences shifted online almost overnight, demand accelerated for advanced noise suppression, echo cancellation, and AI-driven processing tools that make meetings more seamless.
Today, machine learning algorithms can strip away keyboard clicks or reverberation and isolate a speaker's voice in noisy environments. That clarity underpins the accuracy of AI meeting assistants that can step in to transcribe, summarize, and analyze discussions. The implications are rippling across industries: the technology empowers executives and creators alike to produce broadcast-quality content from the comfort of their home office, and it offers companies new ways to build credibility with customers and employees without the costly overhead of traditional production.
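The noise-removal idea described above can be sketched in its simplest classical form, spectral subtraction. This toy version (not Shure's or any vendor's actual algorithm, which would use a learned model) estimates a noise floor from the first few frames and subtracts it from each frame's magnitude spectrum:

```python
import numpy as np

def spectral_subtract(signal, frame_len=256, noise_frames=4):
    """Classic spectral subtraction: estimate a noise magnitude spectrum
    from the first few (assumed speech-free) frames, then subtract it
    from every frame's magnitude, keeping the original phase."""
    n = len(signal) // frame_len
    frames = signal[: n * frame_len].reshape(n, frame_len)
    spectra = np.fft.rfft(frames, axis=1)
    noise_mag = np.abs(spectra[:noise_frames]).mean(axis=0)  # noise floor estimate
    mag = np.maximum(np.abs(spectra) - noise_mag, 0.0)       # clamp negatives to zero
    cleaned = mag * np.exp(1j * np.angle(spectra))           # reuse original phase
    return np.fft.irfft(cleaned, n=frame_len, axis=1).reshape(-1)
```

Modern ML suppressors replace the fixed noise estimate with a per-frame mask predicted by a neural network, but the analyze-attenuate-resynthesize structure is the same.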
- North America > United States > Massachusetts (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > United Kingdom > England (0.04)
- (2 more...)
- Health & Medicine (0.68)
- Marketing (0.46)
- Education > Educational Setting (0.46)
CodeVaani: A Multilingual, Voice-Based Code Learning Assistant
Havare, Jayant, Tamilselvam, Srikanth, Mittal, Ashish, Thorat, Shalaka, Jadia, Soham, Apte, Varsha, Ramakrishnan, Ganesh
Programming education often assumes English proficiency and text-based interaction, creating barriers for students from multilingual regions such as India. We present CodeVaani, a multilingual speech-driven assistant for understanding code, built into Bodhitree [1], a Learning Management System developed at IIT Bombay. It is a voice-enabled assistant that helps learners explore programming concepts in their native languages. The system integrates Indic ASR, a code-aware transcription refinement module, and a code model for generating relevant answers. Responses are provided in both text and audio for natural interaction. In a study with 28 beginner programmers, CodeVaani achieved 75% response accuracy, with over 80% of participants rating the experience positively. Compared to classroom assistance, our framework offers on-demand availability, scalability to support many learners, and multilingual support that lowers the entry barrier for students with limited English proficiency. The demo will illustrate these capabilities and highlight how voice-based AI systems can make programming education more inclusive. Supplementary artifacts and demo video are also made available.
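A minimal sketch of the pipeline the abstract describes (ASR output, code-aware refinement, answer generation, audio response), using hypothetical function names and a toy correction table; none of this is the authors' actual code:

```python
# Illustrative correction table: spoken forms that Indic ASR might
# mis-transcribe, mapped back to canonical programming terms.
CODE_TERMS = {"for lup": "for loop", "wile": "while", "funcshun": "function"}

def refine_transcript(raw: str) -> str:
    """Code-aware refinement: map mis-recognised spoken forms of
    programming terms back to their canonical identifiers."""
    out = raw
    for heard, term in CODE_TERMS.items():
        out = out.replace(heard, term)
    return out

def answer_code_question(audio_transcript: str) -> dict:
    """End-to-end stub: refine the ASR output, query a code model,
    and return both text and (placeholder) audio responses."""
    question = refine_transcript(audio_transcript)
    text_answer = f"[code-model answer for: {question}]"  # stand-in for the LLM call
    return {"question": question,
            "text": text_answer,
            "audio": b""}  # a TTS engine would synthesise this

print(answer_code_question("what is a for lup")["question"])  # → "what is a for loop"
```

In the real system each stub would be backed by a model (Indic ASR, a code LLM, multilingual TTS); the sketch only shows how the stages compose.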
- Research Report (0.50)
- Questionnaire & Opinion Survey (0.49)
AI and the End of Accents
I sound Korean--because I am Korean. Can AI make me sound American? It all began, as these things often do, with an Instagram ad. "No one tells you this if you're an immigrant, but accent discrimination is a real thing," said a woman in the video. Her own accent is faintly Eastern European--so subtle it took me a few playbacks to notice.
- Asia > China (0.16)
- North America > United States > Ohio (0.05)
- North America > United States > New York (0.05)
- (8 more...)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.73)
- Information Technology > Communications > Social Media (0.71)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.49)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)
ADAM: A Diverse Archive of Mankind for Evaluating and Enhancing LLMs in Biographical Reasoning
Cekinmez, Jasin, Ghahroodi, Omid, Chandle, Saad Fowad, Gupta, Dhiman, Asgari, Ehsaneddin
We introduce ADAM (A Diverse Archive of Mankind), a framework for evaluating and improving multimodal large language models (MLLMs) in biographical reasoning. To the best of our knowledge, this is the first work to systematically examine LLM capabilities in biography, a critical yet underexplored dimension of factual knowledge. At its core, AdamDB is a multilingual and multimodal dataset covering over 4 million individuals across geography, time, and profession, while AdamBench provides cognitively structured evaluations based on Bloom's taxonomy, spanning six reasoning levels in both English and native languages. To address hallucinations, particularly for lesser-known individuals, we propose AdamRAG, a retrieval-augmented generation system tailored to biographical contexts. Experiments show that AdamRAG substantially improves open-source models and modestly benefits closed-source ones, with the largest gains on lower-order reasoning. Popularity strongly mediates accuracy, and multimodal input via face images offers smaller, less consistent improvements than retrieval. ADAM establishes the first benchmark and framework for cognitively, culturally, and multimodally grounded biographical evaluation, advancing the development of multilingual, accurate, and hallucination-resistant MLLMs.
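The retrieval-augmented setup behind a system like AdamRAG can be illustrated with a toy lexical retriever and prompt builder. The corpus, scoring function, and prompt template here are assumptions for illustration, not the paper's implementation:

```python
def score(query: str, passage: str) -> int:
    """Toy lexical-overlap score; a real system would use dense embeddings."""
    q = set(query.lower().split())
    return len(q & set(passage.lower().split()))

def build_rag_prompt(question: str, corpus: list[str], k: int = 2) -> str:
    """Retrieve the k best-matching biography snippets and ground the
    question in them, instructing the model to abstain when unsupported."""
    top = sorted(corpus, key=lambda p: score(question, p), reverse=True)[:k]
    context = "\n".join(f"- {p}" for p in top)
    return (f"Answer from the context only; say 'unknown' if it is absent.\n"
            f"Context:\n{context}\n"
            f"Question: {question}")
```

Grounding the prompt this way is what lets retrieval help most on lesser-known individuals, where the model's parametric knowledge is thinnest.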
- North America > United States > Virginia (0.04)
- Asia > Middle East > Qatar (0.04)
- South America (0.04)
- (5 more...)
Not All Visitors are Bilingual: A Measurement Study of the Multilingual Web from an Accessibility Perspective
Bhuiyan, Masudul Hasan Masud, Varvello, Matteo, Zaki, Yasir, Staicu, Cristian-Alexandru
English is the predominant language on the web, powering nearly half of the world's top ten million websites. Support for multilingual content is nevertheless growing, with many websites increasingly combining English with regional or native languages in both visible content and hidden metadata. This multilingualism introduces significant barriers for users with visual impairments, as assistive technologies like screen readers frequently lack robust support for non-Latin scripts and misrender or mispronounce non-English text, compounding accessibility challenges across diverse linguistic contexts. Yet, large-scale studies of this issue have been limited by the lack of comprehensive datasets on multilingual web content. To address this gap, we introduce LangCrUX, the first large-scale dataset of 120,000 popular websites across 12 languages that primarily use non-Latin scripts. Leveraging this dataset, we conduct a systematic analysis of multilingual web accessibility and uncover widespread neglect of accessibility hints. We find that these hints often fail to reflect the language diversity of visible content, reducing the effectiveness of screen readers and limiting web accessibility. Finally, we propose Kizuki, a language-aware automated accessibility-testing extension that accounts for the limited utility of language-inconsistent accessibility hints.
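The kind of language-consistency check a tool like Kizuki automates can be approximated by comparing a page's declared `lang` attribute with the dominant Unicode script of its visible text. The mapping below is a small illustrative subset and not the paper's tool:

```python
import unicodedata

def dominant_script(text: str) -> str:
    """Guess the dominant script of a text span from Unicode character
    names (e.g. 'CYRILLIC SMALL LETTER A' -> 'CYRILLIC')."""
    counts: dict[str, int] = {}
    for ch in text:
        if ch.isalpha():
            script = unicodedata.name(ch, "UNKNOWN").split()[0]
            counts[script] = counts.get(script, 0) + 1
    return max(counts, key=counts.get) if counts else "NONE"

def lang_hint_consistent(declared_lang: str, visible_text: str) -> bool:
    """Flag pages whose declared lang attribute disagrees with the
    script of the visible content (the mismatch LangCrUX measures)."""
    # Illustrative mapping; a real checker would cover far more languages.
    expected = {"hi": "DEVANAGARI", "ru": "CYRILLIC", "en": "LATIN"}
    return expected.get(declared_lang) == dominant_script(visible_text)
```

Screen readers pick a pronunciation voice from the declared language, so a mismatch like `lang="en"` over Devanagari text is exactly what causes the mispronunciations described above.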
- Asia > Russia (0.29)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Asia > India (0.07)
- (23 more...)
Bridging the Gap with Retrieval-Augmented Generation: Making Prosthetic Device User Manuals Available in Marginalised Languages
Ogbonna, Ikechukwu, Davidson, Lesley, Banerjee, Soumya, Dasgupta, Abhishek, Kenney, Laurence, Nagaraja, Vikranth Harthikote
Millions of people in African countries face barriers to accessing healthcare due to language and literacy gaps. This research tackles this challenge by transforming complex medical documents -- in this case, prosthetic device user manuals -- into accessible formats for underserved populations. This case study in cross-cultural translation is particularly relevant for communities that receive donated prosthetic devices but may not receive the accompanying user documentation, or for whom documentation available online exists only in formats (e.g., language and readability) that are inaccessible to local populations (e.g., English-language, high-resource settings/cultural contexts). The approach is demonstrated using the widely spoken Pidgin dialect, but our open-source framework has been designed to enable rapid and easy extension to other languages/dialects. This work presents an AI-powered framework designed to process and translate complex medical documents, e.g., user manuals for prosthetic devices, into marginalised languages. The system enables users -- such as healthcare workers or patients -- to upload English-language medical equipment manuals, pose questions in their native language, and receive accurate, localised answers in real time. Technically, the system integrates a Retrieval-Augmented Generation (RAG) pipeline for processing and semantic understanding of the uploaded manuals. It then employs advanced Natural Language Processing (NLP) models for generative question-answering and multilingual translation. Beyond simple translation, it ensures accessibility to device instructions, treatment protocols, and safety information, empowering patients and clinicians to make informed healthcare decisions.
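The retrieval step of a manual-QA pipeline like the one described can be sketched with simple word-overlap scoring over manual chunks. The helper names and chunking scheme are illustrative assumptions; in the full system a translation model would sit in front of `retrieve` to map native-language questions into English, and a generative model would phrase the final answer:

```python
def chunk_manual(text: str, size: int = 40) -> list[str]:
    """Split an uploaded manual into overlapping word chunks so that
    instructions spanning a boundary still appear whole in some chunk."""
    words = text.split()
    step = max(size // 2, 1)
    return [" ".join(words[i:i + size]) for i in range(0, len(words), step)]

def retrieve(question_en: str, chunks: list[str]) -> str:
    """Return the chunk sharing the most words with the (translated)
    question; a real RAG pipeline would use embedding similarity."""
    q = set(question_en.lower().split())
    return max(chunks, key=lambda c: len(q & set(c.lower().split())))
```

Keeping retrieval separate from generation is what lets the same uploaded manual serve questions in any supported language: only the question and answer are translated, not the source document.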
- Africa > Nigeria (0.06)
- North America > United States (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- (5 more...)
CASPER: A Large Scale Spontaneous Speech Dataset
Xiao, Cihan, Liang, Ruixing, Zhang, Xiangyu, Tiryaki, Mehmet Emre, Bae, Veronica, Shankar, Lavanya, Yang, Rong, Poon, Ethan, Dupoux, Emmanuel, Khudanpur, Sanjeev, Perera, Leibny Paola Garcia
The majority (67.79%) reported speaking US English, reflecting the dataset's primary demographic. However, a significant proportion of non-native and regionally influenced English varieties are also present, including Chinese Mandarin-influenced English (4.81%), UK English (5.29%), and Indian English (2.88%). Additionally, 14.42% of participants did not specify an accent, indicating either an omission or variability in self-identification. Accent and native language are based on participants' self-identification; for example, the number of speakers with an Arabic accent may differ from the number with Arabic as their native language. Age distribution reveals that younger speakers are over-represented, with 57.21% of participants in the 18-29 age range and 23.56% in the 30-39 range.
CausalAbstain: Enhancing Multilingual LLMs with Causal Reasoning for Trustworthy Abstention
Sun, Yuxi, Zuo, Aoqi, Gao, Wei, Ma, Jing
Large Language Models (LLMs) often exhibit knowledge disparities across languages. Encouraging LLMs to abstain when faced with knowledge gaps is a promising strategy to reduce hallucinations in multilingual settings. Current abstention strategies for multilingual scenarios primarily rely on generating feedback in various languages using LLMs and performing self-reflection. However, these methods can be adversely impacted by inaccuracies and biases in the generated feedback. To address this, from a causal perspective, we introduce CausalAbstain, a method that helps LLMs determine whether to utilize multiple generated feedback responses and how to identify the most useful ones. Extensive experiments demonstrate that CausalAbstain effectively selects helpful feedback and enhances abstention decisions with interpretability in both native-language (Causal-native) and multilingual (Causal-multi) settings, outperforming strong baselines on two benchmark datasets covering encyclopedic and commonsense knowledge QA tasks. Our code and data are open-sourced at https://github.com/peachch/CausalAbstain.
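A simple baseline for feedback-based abstention (answer only when enough multilingual feedback responses agree) helps situate what the method above improves on. This majority heuristic is a stand-in for comparison, not the paper's causal selection procedure:

```python
from collections import Counter

def abstain_decision(feedback_answers: list[str], threshold: float = 0.6) -> str:
    """Answer with the majority response only if a sufficient fraction of
    feedback responses (e.g. one per language) agree; otherwise abstain.
    A consistency heuristic, vulnerable to correlated feedback bias."""
    counts = Counter(feedback_answers)
    best, n = counts.most_common(1)[0]
    return best if n / len(feedback_answers) >= threshold else "ABSTAIN"
```

The weakness of this baseline, that biased or inaccurate feedback can still form a confident majority, is precisely the failure mode a causal selection of useful feedback is meant to address.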
- Asia > Singapore (0.04)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- (5 more...)