 Personal Assistant Systems


The Google Home Speaker is real, but we'll have to wait for it

PCWorld

When you purchase through links in our articles, we may earn a small commission. Teased during a Pixel demonstration back in August, Google's latest smart speaker is finally official, but it's not coming out right away. Weeks after sneaking it into a product demo and teasing it on social media, Google has finally taken the wraps off its first new smart speaker in five years, but we'll have to wait a little longer before we can get our hands on one. The $99 Google Home Speaker--yes, Google dropped the "Nest" branding for the new device--has been built for Gemini, Google says, and it boasts features like 360-degree audio and the ability to pair with the Google TV Streamer. But unlike Google's new Nest security cameras (Google is sticking with the "Nest" moniker for its smart cams, at least for now), which are available for purchase now, the Google Home Speaker won't go on sale until spring 2026.


Google just rebranded its Nest Aware subscription service

PCWorld

Say goodbye to Nest Aware, and hello to Google Home Premium. Lots of change is coming to Google Home in the coming weeks, from the replacement of Google Assistant with Gemini for Home to an overhaul of its smart home subscription service, complete with a new name. Google is rebranding its Nest Aware plans as Google Home Premium, adding a bevy of Gemini-powered features to the service's two tiers (Standard and Advanced), but it's not changing the prices for those tiers. So yes, the Google Home Premium reboot doesn't include any price hikes, but that shouldn't come as a surprise, given that Google just boosted the price of its former Nest Aware tiers back in July--perhaps in anticipation of the coming changes.


Google's Gemini Arrives in Google Home, Alongside New Speaker, Nest Cam, and Nest Doorbell

WIRED

The Google Home app has seen a host of small improvements over the last few months to enhance stability and polish. However, Google says with this new update, you can expect to see 70 percent faster startup with the app and 80 percent fewer crashes. There are battery life and memory optimizations, and scrubbing through the camera is six times smoother. The Home app is designed to be used easily one-handed, and you can use more gestures, such as swiping down to enter a camera view or swiping up to back out. You now get preview images from the last event before the live view loads, the ability to swipe between timeline and events, double-tap to fast forward or rewind, and better notifications with a static thumbnail expandable to a large animated preview. Google says the merger of features and devices from the old Nest app is now complete, and folks should be able to transition seamlessly, though the legacy app won't be disappearing yet.


Learning Unified User Quantized Tokenizers for User Representation

arXiv.org Artificial Intelligence

Multi-source user representation learning plays a critical role in enabling personalized services on web platforms (e.g., Alipay). While prior works have adopted late-fusion strategies to combine heterogeneous data sources, they suffer from three key limitations: lack of unified representation frameworks, scalability and storage issues in data compression, and inflexible cross-task generalization. To address these challenges, we propose U2QT (Unified User Quantized Tokenizers), a novel framework that integrates cross-domain knowledge transfer with early fusion of heterogeneous domains. Our framework employs a two-stage architecture: first, we use the Qwen3 Embedding model to derive a compact yet expressive feature representation; second, a multi-view RQ-VAE discretizes causal embeddings into compact tokens through shared and source-specific codebooks, enabling efficient storage while maintaining semantic coherence. Experimental results showcase U2QT's advantages across diverse downstream tasks, outperforming task-specific baselines in future behavior prediction and recommendation tasks while achieving efficiency gains in storage and computation. The unified tokenization framework enables seamless integration with language models and supports industrial-scale applications.
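The discretization step mentioned in the abstract can be pictured with a minimal residual-quantization sketch. This is not the U2QT implementation: the codebooks below are tiny hand-written lists (an assumption for illustration), nothing is learned, and the shared versus source-specific codebook split is not modeled. The names `nearest_code` and `residual_quantize` are hypothetical. It only shows the core idea of turning one embedding into a short sequence of token indices, one per quantization level.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def nearest_code(x: Sequence[float], codebook: Sequence[Sequence[float]]) -> int:
    """Return the index of the codebook entry closest to x (squared Euclidean)."""
    best_index = 0
    best_dist = float("inf")
    for index, code in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(x, code))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def residual_quantize(
    x: Sequence[float], codebooks: Sequence[Sequence[Sequence[float]]]
) -> Tuple[List[int], Vector]:
    """Quantize x level by level; each level encodes what the previous levels missed."""
    residual = list(x)
    reconstruction = [0.0] * len(x)
    tokens: List[int] = []
    for codebook in codebooks:
        index = nearest_code(residual, codebook)
        code = codebook[index]
        tokens.append(index)
        reconstruction = [r + c for r, c in zip(reconstruction, code)]
        residual = [r - c for r, c in zip(residual, code)]
    return tokens, reconstruction


# Hypothetical two-level codebooks for 2-dimensional embeddings.
CODEBOOKS = [
    [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]],        # coarse level
    [[0.0, 0.0], [0.25, -0.25], [-0.25, 0.25]],  # fine level
]

tokens, approx = residual_quantize([1.2, 0.8], CODEBOOKS)
print(tokens, approx)
```

Storing the token list instead of the full float vector is what makes this kind of representation compact; the reconstruction is only an approximation of the input.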


AudioTrust: Benchmarking the Multifaceted Trustworthiness of Audio Large Language Models

arXiv.org Artificial Intelligence

Audio Large Language Models (ALLMs) have gained widespread adoption, yet their trustworthiness remains underexplored. Existing evaluation frameworks, designed primarily for text, fail to address unique vulnerabilities introduced by audio's acoustic properties. We identify significant trustworthiness risks in ALLMs arising from non-semantic acoustic cues, including timbre, accent, and background noise, which can manipulate model behavior. We propose AudioTrust, a comprehensive framework for systematic evaluation of ALLM trustworthiness across audio-specific risks. AudioTrust encompasses six key dimensions: fairness, hallucination, safety, privacy, robustness, and authentication. The framework implements 26 distinct sub-tasks using a curated dataset of over 4,420 audio samples from real-world scenarios, including daily conversations, emergency calls, and voice assistant interactions. We conduct comprehensive evaluations across 18 experimental configurations using human-validated automated pipelines. Our evaluation of 14 state-of-the-art open-source and closed-source ALLMs reveals significant limitations when confronted with diverse high-risk audio scenarios, providing insights for secure deployment of audio models. Code and data are available at https://github.com/JusperLee/AudioTrust.


Hands on with Amazon's new AI-enhanced Echo smart speakers and displays

PCWorld

The Echo Studio is back, along with the revamped Echo Dot Max and two gorgeous Echo Show displays. Yes, round is still in when it comes to Amazon's refreshed Echo smart speakers, with the high-end Echo Studio and the smaller Echo Dot both getting big Alexa+ makeovers at Amazon's big fall hardware event in New York City. Also on tap were new versions of Amazon's Echo Show 8 and 11 displays, which chopped the chunky design of previous-generation Echo Shows in favor of slimmed-down screens mounted in front of oval-shaped rear speaker components. Available for pre-order now, the new Echo devices pack Amazon's new AZ3 and AZ3 Pro chips allowing for "on the edge" Alexa+ processing, ideal for getting speedier replies while enabling advanced sensors that allow the new Alexa to sense what's going on in the immediate area.


Everything Amazon Announced Today at Its Fall Hardware Event (2025)

WIRED

Amazon's next-gen Alexa+ chatbot is now available in four new Echo devices and a bevy of Ring cameras. The company also debuted three new Kindle Scribe tablets, one with a color screen. All products featured on WIRED are independently selected by our editors. However, we may receive compensation from retailers and/or from purchases of products through these links. Alexa got a large language model power-up earlier this year in the form of Alexa+ (a paid upgrade for non-Amazon Prime subscribers), and now, Amazon has fresh hardware to take advantage of the assistant's new capabilities.


Deep content-based music recommendation

Neural Information Processing Systems

Automatic music recommendation has become an increasingly relevant problem in recent years, since a lot of music is now sold and consumed digitally. Most recommender systems rely on collaborative filtering. However, this approach suffers from the cold start problem: it fails when no usage data is available, so it is not effective for recommending new and unpopular songs. In this paper, we propose to use a latent factor model for recommendation, and predict the latent factors from music audio when they cannot be obtained from usage data. We compare a traditional approach using a bag-of-words representation of the audio signals with deep convolutional neural networks, and evaluate the predictions quantitatively and qualitatively on the Million Song Dataset. We show that using predicted latent factors produces sensible recommendations, despite the fact that there is a large semantic gap between the characteristics of a song that affect user preference and the corresponding audio signal. We also show that recent advances in deep learning translate very well to the music recommendation setting, with deep convolutional neural networks significantly outperforming the traditional approach.
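The cold-start idea in this abstract, predicting a song's latent factors from its audio and then scoring it like any other item, can be sketched as follows. The paper uses deep convolutional networks for the prediction step; this sketch swaps in a plain linear map purely for illustration, and the weights, feature values, and function names are all hypothetical.

```python
from typing import Dict, List, Sequence


def predict_item_factors(
    audio_features: Sequence[float], weights: Sequence[Sequence[float]]
) -> List[float]:
    """Map audio features to latent factors with a linear model (one weight row per factor)."""
    return [sum(w * f for w, f in zip(row, audio_features)) for row in weights]


def score(user_factors: Sequence[float], item_factors: Sequence[float]) -> float:
    """Latent factor model: predicted preference is the dot product of the two factor vectors."""
    return sum(u * v for u, v in zip(user_factors, item_factors))


def rank_new_songs(
    user_factors: Sequence[float],
    songs: Dict[str, Sequence[float]],
    weights: Sequence[Sequence[float]],
) -> List[str]:
    """Rank songs that have no usage data, using factors predicted from audio alone."""
    return sorted(
        songs,
        key=lambda name: score(user_factors, predict_item_factors(songs[name], weights)),
        reverse=True,
    )


# Hypothetical numbers: 3 audio features mapped to 2 latent factors.
WEIGHTS = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
USER = [0.5, 2.0]
NEW_SONGS = {"song_a": [1.0, 0.0, 0.0], "song_b": [0.0, 1.0, 0.5]}

print(rank_new_songs(USER, NEW_SONGS, WEIGHTS))
```

Because the user factors come from usage data and only the item side is predicted, a brand-new song can be ranked for existing users before anyone has played it.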


A Latent Source Model for Online Collaborative Filtering

Neural Information Processing Systems

Despite the prevalence of collaborative filtering in recommendation systems, there has been little theoretical development on why and how well it works, especially in the "online" setting, where items are recommended to users over time. We address this theoretical gap by introducing a model for online recommendation systems, cast item recommendation under the model as a learning problem, and analyze the performance of a cosine-similarity collaborative filtering method. In our model, each of $n$ users either likes or dislikes each of $m$ items. We assume there to be $k$ types of users, and all the users of a given type share a common string of probabilities determining the chance of liking each item. At each time step, we recommend an item to each user, where a key distinction from related bandit literature is that once a user consumes an item (e.g., watches a movie), then that item cannot be recommended to the same user again. The goal is to maximize the number of likable items recommended to users over time. Our main result establishes that after nearly $\log(km)$ initial learning time steps, a simple collaborative filtering algorithm achieves essentially optimal performance without knowing $k$. The algorithm has an exploitation step that uses cosine similarity and two types of exploration steps, one to explore the space of items (standard in the literature) and the other to explore similarity between users (novel to this work).
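A rough sketch of what a cosine-similarity exploitation step could look like in this setting is given below. It is not the paper's algorithm: the two exploration steps are omitted, the `threshold` parameter and the function names are hypothetical, and ratings are encoded here as +1 (liked), -1 (disliked), and 0 (not yet consumed) as an assumption. It does respect the constraint that a consumed item is never recommended to the same user again.

```python
import math
from typing import List, Optional, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two rating vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(
    ratings: List[List[int]], user: int, threshold: float = 0.5
) -> Optional[int]:
    """Pick the unconsumed item that the user's cosine-similar neighbors liked most."""
    target = ratings[user]
    neighbors = [
        other
        for index, other in enumerate(ratings)
        if index != user and cosine_similarity(target, other) >= threshold
    ]
    best_item: Optional[int] = None
    best_score = float("-inf")
    for item, rating in enumerate(target):
        if rating != 0:
            continue  # already consumed, so it cannot be recommended again
        item_score = sum(neighbor[item] for neighbor in neighbors)
        if item_score > best_score:
            best_item, best_score = item, item_score
    return best_item


# Hypothetical ratings: rows are users, columns are items.
RATINGS = [
    [1, -1, 0, 0],    # user 0 has consumed items 0 and 1 only
    [1, -1, 1, -1],   # user 1 agrees with user 0 so far
    [-1, 1, -1, 1],   # user 2 has the opposite taste
]

print(recommend(RATINGS, user=0))
```

With no neighbor above the threshold, every unconsumed item scores zero and the first one is returned, which is roughly where the exploration steps described in the abstract would take over.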


Controlling privacy in recommender systems

Neural Information Processing Systems

Recommender systems involve an inherent trade-off between accuracy of recommendations and the extent to which users are willing to release information about their preferences. In this paper, we explore a two-tiered notion of privacy where there is a small set of "private" users who require privacy guarantees. We show theoretically and demonstrate empirically that a moderate number of public users with no access to private user information already suffices for reasonable accuracy. Moreover, we introduce a new privacy concept for gleaning relational information from private users while maintaining a first-order deniability. We demonstrate gains from controlled access to private user preferences.