voice assistant
Amazon Alexa Is Now Available to Everyone. Here's How to Turn It Off (2026)
Alexa+ has been rolling out to everyone with a Prime membership, even if you didn't ask for it. Here's how to change it back. If Alexa is in your home, you might have been one of many users this month who were suddenly moved from the original Alexa to the new AI-powered Alexa+ voice assistant. Amazon announced in early January during CES that it would be rolling out the new assistant to all Alexa+ Early Access customers, and that turns out to also include all Prime members, even those who were never on the Early Access list. Alexa+ remains in Early Access, as it has been since launching in spring last year, which means the assistant isn't feature-complete and, for now, doesn't require you to pay the $20 monthly fee even if you don't have Prime.
- North America > United States > California > San Francisco County > San Francisco (0.04)
- Europe > Slovakia (0.04)
- Europe > Czechia (0.04)
Scarlett Johansson and Cate Blanchett back campaign accusing AI firms of theft
Johansson was dragged into the AI debate after OpenAI's voice assistant used her vocal likeness, prompting the actor to say she was 'angered' by the move. Scarlett Johansson, Cate Blanchett, REM and Jodi Picoult are among hundreds of Hollywood stars, musicians and authors backing a new campaign accusing AI companies of "theft" of their work. The "Stealing Isn't Innovation" drive launched on Thursday with the support of approximately 800 creative professionals and bands. It adds: "Artists, writers, and creators of all kinds are banding together with a simple message: Stealing our work is not innovation."
- North America > United States (0.33)
- Europe > United Kingdom (0.17)
- Europe > Ukraine (0.07)
- Oceania > Australia (0.05)
- Leisure & Entertainment > Sports (0.73)
- Media > Film (0.56)
- Government > Regional Government (0.53)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.79)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.64)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.52)
I Ditched Alexa and Upgraded My Smart Home
Here's how I cut down my family's reliance on Alexa. Until recently, my smart home setup was in chaos. After years of testing, buying, and upgrading to the latest smart home gadgets in an attempt to make my life easier, it became a bloated mess that was actually making it more complicated. My Alexa, Google Home, and Apple Home apps were awash with dead devices, duplicates, and automations that simply didn't work. My Hue Bridge, trying desperately to tie it all together, was creaking at the seams.
- Asia > Nepal (0.14)
- North America > United States > California (0.04)
- Europe > Slovakia (0.04)
- Europe > Czechia (0.04)
Gen AI in Automotive: Applications, Challenges, and Opportunities with a Case study on In-Vehicle Experience
Shinde, Chaitanya, Garikapati, Divya
Generative Artificial Intelligence is emerging as a transformative force in the automotive industry, enabling novel applications across vehicle design, manufacturing, autonomous driving, predictive maintenance, and in-vehicle user experience. This paper provides a comprehensive review of the current state of GenAI in automotive, highlighting enabling technologies such as Generative Adversarial Networks and Variational Autoencoders. Key opportunities include accelerating autonomous driving validation through synthetic data generation, optimizing component design, and enhancing human-machine interaction via personalized and adaptive interfaces. At the same time, the paper identifies significant technical, ethical, and safety challenges, including computational demands, bias, intellectual property concerns, and adversarial robustness, that must be addressed for responsible deployment. A case study on Mercedes-Benz's MBUX Virtual Assistant illustrates how GenAI-powered voice systems deliver more natural, proactive, and personalized in-car interactions compared to legacy rule-based assistants. Through this review and case study, the paper outlines both the promise and limitations of GenAI integration in the automotive sector and presents directions for future research and development aimed at achieving safer, more efficient, and user-centric mobility. Unlike prior reviews that focus solely on perception or manufacturing, this paper emphasizes generative AI in voice-based HMI, bridging safety and user experience perspectives.
- Transportation > Ground > Road (1.00)
- Information Technology > Security & Privacy (1.00)
- Automobiles & Trucks > Manufacturer (1.00)
- Government > Military (0.93)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)
MultiVox: A Benchmark for Evaluating Voice Assistants for Multimodal Interactions
Selvakumar, Ramaneswaran, Seth, Ashish, Anand, Nishit, Tyagi, Utkarsh, Kumar, Sonal, Ghosh, Sreyan, Manocha, Dinesh
The rapid progress of Large Language Models (LLMs) has empowered omni models to act as voice assistants capable of understanding spoken dialogues. These models can process multimodal inputs beyond text, such as speech and visual data, enabling more context-aware interactions. However, current benchmarks fall short in comprehensively evaluating how well these models generate context-aware responses, particularly when it comes to implicitly understanding fine-grained speech characteristics, such as pitch, emotion, timbre, and volume, or the environmental acoustic context, such as background sounds. Additionally, they inadequately assess the ability of models to align paralinguistic cues with complementary visual signals to inform their responses. To address these gaps, we introduce MultiVox, the first omni voice assistant benchmark designed to evaluate the ability of voice assistants to integrate spoken and visual cues, including paralinguistic speech features, for truly multimodal understanding. Specifically, MultiVox includes 1000 human-annotated and recorded speech dialogues that encompass diverse paralinguistic features and a range of visual cues such as images and videos. Our evaluation on 10 state-of-the-art models reveals that, although humans excel at these tasks, current models consistently struggle to produce contextually grounded responses.
- North America > United States > Maryland > Prince George's County > College Park (0.04)
- Asia > Singapore (0.04)
VoiceAssistant-Eval: Benchmarking AI Assistants across Listening, Speaking, and Viewing
Wang, Ke, Ren, Houxing, Lu, Zimu, Zhan, Mingjie, Li, Hongsheng
The growing capabilities of large language models and multimodal systems have spurred interest in voice-first AI assistants, yet existing benchmarks are inadequate for evaluating the full range of these systems' capabilities. We introduce VoiceAssistant-Eval, a comprehensive benchmark designed to assess AI assistants across listening, speaking, and viewing. VoiceAssistant-Eval comprises 10,497 curated examples spanning 13 task categories. These tasks include natural sounds, music, and spoken dialogue for listening; multi-turn dialogue, role-play imitation, and various scenarios for speaking; and highly heterogeneous images for viewing. To demonstrate its utility, we evaluate 21 open-source models and GPT-4o-Audio, measuring the quality of the response content and speech, as well as their consistency. The results reveal three key findings: (1) proprietary models do not universally outperform open-source models; (2) most models excel at speaking tasks but lag in audio understanding; and (3) well-designed smaller models can rival much larger ones. Notably, the mid-sized Step-Audio-2-mini (7B) achieves more than double the listening accuracy of LLaMA-Omni2-32B-Bilingual. However, challenges remain: multimodal (audio plus visual) input and role-play voice imitation tasks are difficult for current models, and significant gaps persist in robustness and safety alignment. VoiceAssistant-Eval identifies these gaps and establishes a rigorous framework for evaluating and guiding the development of next-generation AI assistants. Code and data will be released at https://mathllm.github.io/VoiceAssistantEval/.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Belgium (0.04)
- (6 more...)
- Research Report > New Finding (1.00)
- Personal > Interview (0.93)
- Research Report > Experimental Study (0.93)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
- Health & Medicine (1.00)
- (3 more...)
- Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- (2 more...)
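A benchmark like VoiceAssistant-Eval, with 13 uneven task categories, implies per-category scoring so that large categories don't swamp small ones. The sketch below shows that aggregation pattern; the category names and results are invented for illustration and are not from the paper.

```python
from collections import defaultdict

def macro_accuracy(results):
    """Mean of per-category accuracies, so that categories with many
    examples don't dominate the headline number."""
    by_cat = defaultdict(list)
    for category, correct in results:
        by_cat[category].append(correct)
    per_cat = {c: sum(v) / len(v) for c, v in by_cat.items()}
    return per_cat, sum(per_cat.values()) / len(per_cat)

# Toy results: (task category, was the model's response judged correct?)
results = [
    ("listening/natural_sounds", True),
    ("listening/natural_sounds", False),
    ("speaking/role_play", True),
    ("speaking/role_play", True),
    ("viewing/images", False),
    ("viewing/images", True),
]
per_cat, macro = macro_accuracy(results)
print(per_cat["speaking/role_play"])  # 1.0
print(round(macro, 3))                # 0.667
```

Macro-averaging is one common convention for multi-category benchmarks; the paper's exact scoring (which also covers speech quality and consistency) is richer than this accuracy-only sketch.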
Speechless: Speech Instruction Training Without Speech for Low Resource Languages
Dao, Alan, Vu, Dinh Bach, Ha, Huy Hoang, Anh, Tuan Le Duc, Gopal, Shreyas, Yeo, Yue Heng, Low, Warren Keng Hoong, Chng, Eng Siong, Yip, Jia Qi
The rapid growth of voice assistants powered by large language models (LLMs) has highlighted a need for speech instruction data to train these systems. Despite the abundance of speech recognition data, there is a notable scarcity of speech instruction data, which is essential for fine-tuning models to understand and execute spoken commands. Generating high-quality synthetic speech requires a good text-to-speech (TTS) model, which may not be available for low-resource languages. Our novel approach addresses this challenge by halting synthesis at the semantic representation level, bypassing the need for TTS. We achieve this by aligning synthetic semantic representations with the pre-trained Whisper encoder, enabling an LLM to be fine-tuned on text instructions while maintaining the ability to understand spoken instructions during inference. This simplified training process is a promising approach to building voice assistants for low-resource languages.
- Asia > Singapore (0.04)
- Asia > Middle East > Jordan (0.04)
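The core idea in the Speechless abstract, mapping text-derived semantic vectors into a frozen speech encoder's space so no TTS is needed, can be sketched as learning a simple adapter on paired vectors. Everything below is a toy stand-in: both "encoders" are random matrices and the adapter is a closed-form least-squares fit, whereas the paper trains against the actual Whisper encoder.

```python
import numpy as np

rng = np.random.default_rng(0)
text_dim, speech_dim, n_pairs = 16, 32, 200

# Synthetic semantic representations derived from text instructions.
text_vecs = rng.normal(size=(n_pairs, text_dim))
# Unknown relation standing in for the frozen speech encoder's geometry.
true_map = rng.normal(size=(text_dim, speech_dim))
# Targets in "Whisper-encoder space" for the paired examples.
speech_vecs = text_vecs @ true_map

# Fit a linear adapter by least squares -- a one-layer stand-in for the
# trained alignment module.
adapter, *_ = np.linalg.lstsq(text_vecs, speech_vecs, rcond=None)

aligned = text_vecs @ adapter
print(np.allclose(aligned, speech_vecs, atol=1e-6))  # True
```

Once text vectors land in the speech encoder's space, the LLM fine-tuned on text instructions can consume real encoder outputs at inference time, which is the bypass the abstract describes.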
TextOnly: A Unified Function Portal for Text-Related Functions on Smartphones
Tu, Minghao, Yu, Chun, Shen, Xiyuan, Zheng, Zhi, Chen, Li, Shi, Yuanchun
Text boxes serve as portals to diverse functionalities in today's smartphone applications. However, when it comes to specific functionalities, users always need to navigate through multiple steps to access particular text boxes for input. We propose TextOnly, a unified function portal that enables users to access text-related functions from various applications by simply inputting text into a sole text box. For instance, entering a restaurant name could trigger a Google Maps search, while a greeting could initiate a conversation in WhatsApp. Despite their brevity, TextOnly maximizes the utilization of these raw text inputs, which contain rich information, to interpret user intentions effectively. TextOnly integrates large language models (LLMs) and a BERT model. The LLM consistently provides general knowledge, while the BERT model can continuously learn user-specific preferences and enable quicker predictions. Real-world user studies demonstrated TextOnly's effectiveness with a top-1 accuracy of 71.35%, and its ability to continuously improve both its accuracy and inference speed. Participants perceived TextOnly as having satisfactory usability and expressed a preference for TextOnly over manual execution. Compared with voice assistants, TextOnly supports a greater range of text-related functions and allows for more concise inputs.
- North America > United States > Washington > King County > Seattle (0.14)
- Asia > China > Beijing > Beijing (0.04)
- Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
- Europe > Netherlands > North Brabant > Eindhoven (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Questionnaire & Opinion Survey (1.00)
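The TextOnly architecture described above, a fast personalized predictor answering first with an LLM providing general knowledge as a fallback, follows a familiar two-tier routing pattern. The sketch below illustrates that pattern only: the rule table stands in for the learned BERT model, the keyword check stands in for an LLM call, and all function names are invented.

```python
# Fast tier: stand-in for a user-specific learned predictor (the paper
# uses a BERT model that adapts to the user's habits over time).
FAST_RULES = {
    "hi": "whatsapp.send_greeting",
    "hello": "whatsapp.send_greeting",
}

def llm_fallback(text: str) -> str:
    """Stand-in for a general-knowledge LLM call; here just a keyword check."""
    if any(w in text.lower() for w in ("restaurant", "cafe", "pizza")):
        return "maps.search_place"
    return "notes.save_text"

def route(text: str) -> str:
    """One text box in, one target function out."""
    key = text.strip().lower()
    return FAST_RULES.get(key) or llm_fallback(text)

print(route("hello"))                # whatsapp.send_greeting
print(route("Tony's Pizza Palace"))  # maps.search_place
```

The design rationale mirrors the abstract: the cheap tier handles habitual inputs quickly and improves with use, while the expensive general model only runs when the cheap tier has no answer.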
Major Philips Hue leak reveals 'Pro' hub with a killer feature
Philips Hue appears to be teeing up a new, more powerful hub that can turn Hue bulbs into motion sensors, according to leaked details and images that briefly appeared on Philips Hue's own website. The unannounced products, which have since been yanked from the "New on Hue" page, included the "faster" Hue Bridge Pro as well as a wired video doorbell, a refreshed and more efficient A19 bulb, permanent and globe-style versions of Hue's Festavia outdoor string lights, a gradient light strip, and the ability to control your Hue lights with the Sonos voice assistant. No pricing details were included in the leak, which was live on the Hue website for several hours Wednesday. The leaked products were initially spotted by users on Reddit. Reached by TechHive, a Philips Hue spokesperson declined to comment.
- Information Technology > Communications > Networks (0.39)
- Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.37)
Garmin Forerunner 570 review: running watch stumbles just short of greatness
Garmin's latest mid-range running and multisport watch has smartened up with a very bright OLED screen, a voice assistant and upgraded sensors. The Guardian's journalism is independent. We will earn a commission if you buy something through an affiliate link. The Forerunner 570 continues the revamp of the company's running watches, which have all gained more accurate GPS chips and improved heart rate monitors. The new model replaces the popular 265 and sits under the 970. It offers a similar look and feel to the top watch but with a few key features removed for a lower price.
- Information Technology > Hardware (0.54)
- Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.36)