 Personal Assistant Systems


Billion-scale Similarity Search Using a Hybrid Indexing Approach with Advanced Filtering

arXiv.org Artificial Intelligence

Similarity search, the task of finding similar vectors, has become a fundamental operation in machine learning, with applications in recommendation engines, semantic search systems, and more [1-3]. As datasets grow to billions of entries, the challenge of performing efficient searches on high-dimensional vectors becomes increasingly complex [4]. This is further compounded by the well-known curse of dimensionality [5], which affects the performance and accuracy of search algorithms as the number of dimensions increases. Approximate Nearest Neighbor (ANN) algorithms, such as Inverted File Index (IVF) [6] and Hierarchical Navigable Small World (HNSW) [7], have been developed to address scalability and performance issues. IVF segments the search space into smaller areas, called Voronoi cells [8], while HNSW constructs a navigable graph structure for efficient search space traversal. Despite their advancements, these methods often struggle to support complex, multi-dimensional filtering efficiently. Such filtering is crucial in practical scenarios where additional criteria beyond vector similarity are required to refine search results [6]. Examples of such scenarios include e-commerce product search, semantic search with filtering, and recommendation systems.
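
To make the IVF idea above concrete, here is a minimal sketch of an inverted file index with a boolean metadata filter applied to the probed candidates. It is not the hybrid method of the paper: the function names `build_ivf` and `search`, the `allowed` mask, and the parameters `n_cells` and `n_probe` are all hypothetical, and the plain k-means loop is only assumed to stand in for whatever partitioning a real system would use.

```python
import numpy as np

def build_ivf(vectors, n_cells, n_iters=10, seed=0):
    """Partition vectors into Voronoi cells with a few k-means steps (illustrative)."""
    rng = np.random.default_rng(seed)
    centroids = vectors[rng.choice(len(vectors), n_cells, replace=False)].copy()
    for _ in range(n_iters):
        # Assign every vector to its nearest centroid, i.e. its Voronoi cell.
        dists = ((vectors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for c in range(n_cells):
            members = vectors[assign == c]
            if len(members):
                centroids[c] = members.mean(axis=0)
    # Final assignment against the final centroids gives the inverted lists.
    dists = ((vectors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    assign = dists.argmin(axis=1)
    cells = [np.flatnonzero(assign == c) for c in range(n_cells)]
    return centroids, cells

def search(query, vectors, centroids, cells, allowed, k=5, n_probe=2):
    """Probe the n_probe nearest cells, drop ids failing the filter, rank the rest."""
    cell_order = ((centroids - query) ** 2).sum(axis=1).argsort()[:n_probe]
    cand = np.concatenate([cells[c] for c in cell_order])
    cand = cand[allowed[cand]]  # hypothetical boolean metadata filter
    if len(cand) == 0:
        return cand
    d = ((vectors[cand] - query) ** 2).sum(axis=1)
    return cand[d.argsort()[:k]]
```

Probing only a few cells is what makes the search approximate. A restrictive filter can leave fewer than `k` survivors in the probed cells, which is one simple way to see why filtering and ANN indexing interact poorly.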


Can Large Language Models Understand Preferences in Personalized Recommendation?

arXiv.org Artificial Intelligence

Large Language Models (LLMs) excel in various tasks, including personalized recommendations. Existing evaluation methods often focus on rating prediction, relying on regression errors between actual and predicted ratings. However, user rating bias and item quality, two influential factors behind rating scores, can obscure personal preferences in user-item pair data. To address this, we introduce PerRecBench, disassociating the evaluation from these two factors and assessing recommendation techniques on capturing the personal preferences in a grouped ranking manner. We find that the LLM-based recommendation techniques that are generally good at rating prediction fail to identify users' favored and disfavored items when the user rating bias and item quality are eliminated by grouping users. With PerRecBench and 19 LLMs, we find that while larger models generally outperform smaller ones, they still struggle with personalized recommendation. Our findings reveal the superiority of pairwise and listwise ranking approaches over pointwise ranking, PerRecBench's low correlation with traditional regression metrics, the importance of user profiles, and the role of pretraining data distributions. We further explore three supervised fine-tuning strategies, finding that merging weights from single-format training is promising but improving LLMs' understanding of user preferences remains an open research problem. Code and data are available at https://github.com/TamSiuhin/PerRecBench


GRETA: Modular Platform to Create Adaptive Socially Interactive Agents

arXiv.org Artificial Intelligence

The interaction between humans is very complex to describe since it is composed of different elements from different modalities such as speech, gaze, and gestures influenced by social attitudes and emotions. Furthermore, the interaction can be affected by some features which refer to the interlocutor's state. Current Socially Interactive Agents (SIAs) aim to adapt themselves to the state of the interaction partner. In this paper, we discuss this adaptation by describing the architecture of the GRETA platform, which considers external features while interacting with humans and/or another ECA and processes the dialogue incrementally. We illustrate the new architecture of GRETA, which deals with the external features, the adaptation, and the incremental approach to dialogue processing.


Reviews: Joint Optimization of Tree-based Index and Deep Model for Recommender Systems

Neural Information Processing Systems

The results presented for this work beat the benchmarks by a good deal, and in particular, the online test results are very good. It is an incremental improvement to an existing model (TDM), obtained by adding an optimization step. The resulting improvement is impressive, though, and it feels like this would be more applicable to an applied data science conference such as KDD or WWW. The explanation of TDM in Section 2.1 is helpful, but it would be even more helpful to have a direct comparison of the tree-building steps in TDM and the newly proposed method. For example, having a side-by-side comparison of Algorithms 1 & 2 with their TDM predecessor would go a long way in understanding detailed differences.


Reviews: Joint Optimization of Tree-based Index and Deep Model for Recommender Systems

Neural Information Processing Systems

The review scores were somewhat borderline, but overall slightly above the acceptance threshold. There was some disagreement among the reviewers, following which a discussion was initiated. The rebuttal largely addresses the concerns of R1 (the most negative review), and in the metareviewer's opinion does a reasonable job of addressing these concerns, which are mostly clarifications regarding the performance of the algorithm. Positively, the reviewers mostly concur that the method, while fairly straightforward, offers significant improvements over existing techniques. After discussion there was some positive movement in review scores resulting in a positive consensus among reviewers.


Paper Quality Assessment based on Individual Wisdom Metrics from Open Peer Review

arXiv.org Artificial Intelligence

This study proposes a data-driven framework for enhancing the accuracy and efficiency of scientific peer review through an open, bottom-up process that estimates reviewer quality. Traditional closed peer review systems, while essential for quality control, are often slow, costly, and subject to biases that can impede scientific progress. Here, we introduce a method that evaluates individual reviewer reliability by quantifying agreement with community consensus scores and applying Bayesian weighting to refine paper quality assessments. We analyze open peer review data from two major scientific conferences, and demonstrate that reviewer-specific quality scores significantly improve the reliability of paper quality estimation. Perhaps surprisingly, we find that reviewer quality scores are unrelated to authorship quality. Our model incorporates incentive structures to recognize high-quality reviewers and encourage broader coverage of submitted papers, thereby mitigating the common "rich-get-richer" pitfall of social media. These findings suggest that open peer review, with mechanisms for estimating and incentivizing reviewer quality, offers a scalable and equitable alternative for scientific publishing, with potential to enhance the speed, fairness, and transparency of the peer review process.
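
As a rough illustration of weighting reviewers by their agreement with a community consensus, the sketch below scores each reviewer by the posterior mean precision of their deviations from the plain per-paper mean, then recomputes paper scores as a precision-weighted average. This is not the model from the study: the Gaussian noise assumption, the Gamma(`a`, `b`) prior, and the name `reviewer_weighted_scores` are all assumptions made for the example.

```python
import numpy as np

def reviewer_weighted_scores(ratings, a=1.0, b=1.0):
    """ratings: (n_reviewers, n_papers) array, np.nan where a reviewer gave no score.

    Assumes every paper has at least one score. Returns (refined_scores, precision).
    """
    mask = ~np.isnan(ratings)
    filled = np.where(mask, ratings, 0.0)
    # Community consensus: unweighted mean of the available scores per paper.
    consensus = filled.sum(axis=0) / mask.sum(axis=0)
    # Squared disagreement of each reviewer with the consensus.
    sq_dev = np.where(mask, (filled - consensus) ** 2, 0.0).sum(axis=1)
    n = mask.sum(axis=1)
    # Posterior mean of a Gaussian precision under an assumed Gamma(a, b) prior.
    precision = (a + n / 2) / (b + sq_dev / 2)
    # Refined paper score: precision-weighted average of the available scores.
    w = precision[:, None] * mask
    refined = (w * filled).sum(axis=0) / w.sum(axis=0)
    return refined, precision
```

The prior keeps a reviewer with very few reviews from receiving an extreme weight. The sketch makes a single pass and does not iterate the weights back into the consensus.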


Stop talking to your phone: How to use Type to Siri

Popular Science

Among the changes ushered in with iOS 18.1, iPadOS 18.1, and macOS 15.1 Sequoia is a new Type to Siri option. This means you can carry on a conversation with Apple's digital assistant without having to talk out loud, which is helpful when you're in a quiet library, busy subway car, or anywhere else you can't really use voice control. The ability to type to Siri has actually been available on Apple devices for several years now, but previously it was hidden away in the Accessibility settings and not all that easy to find. Now Apple has given it much more prominence in its operating systems, so typing is just as straightforward as talking.


Review for NeurIPS paper: Simplify and Robustify Negative Sampling for Implicit Collaborative Filtering

Neural Information Processing Systems

Weaknesses: I don't think the proposed algorithm to leverage the variance is fully sound. The user memory set Mu is updated on each iteration by first uniformly sampling additional items and then retaining those with higher scores (with probability proportional to softmax of the score). First, for datasets that have lots of items, uniform sampling is very unlikely to produce hard negatives with high scores so this procedure can be highly inefficient. Second, since new samples are likely to have lower scores, one either has to increase the temperature or leave Mu relatively static between iterations. If Mu is static then training can saturate and the model can overfit to these negative examples.
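
The memory update the reviewer describes can be sketched roughly as follows, to make the concern about uniform sampling and temperature easier to follow. This is an illustration of the description above, not the paper's implementation: `update_memory`, `n_new`, and `temperature` are hypothetical names, sampling without replacement is an assumption, and the exclusion of a user's positive items is omitted.

```python
import numpy as np

def update_memory(memory, scores, n_new, temperature=1.0, rng=None):
    """One memory refresh for a single user (illustrative only).

    memory: 1-D array of distinct item ids currently held as candidate negatives.
    scores: 1-D array of model scores for every item, for this user.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Step 1: draw additional candidate items uniformly from the whole catalogue.
    new = rng.integers(0, len(scores), size=n_new)
    pool = np.unique(np.concatenate([memory, new]))
    # Step 2: retain items with probability proportional to softmax(score / T).
    logits = scores[pool] / temperature
    probs = np.exp(logits - logits.max())  # subtract the max for numerical stability
    probs /= probs.sum()
    keep = min(len(memory), len(pool))
    return rng.choice(pool, size=keep, replace=False, p=probs)
```

With a low `temperature` the retained set concentrates on the highest-scoring items, so freshly sampled low-scoring items rarely displace existing ones. A high `temperature` moves the retention step toward uniform sampling.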


Review for NeurIPS paper: Simplify and Robustify Negative Sampling for Implicit Collaborative Filtering

Neural Information Processing Systems

The initial reviews were mixed for this paper. However, during the discussion, a certain consensus emerged regarding the value of this contribution. In particular, the reviews agree that the proposed method is simple and effective. In the proposed study, the method seems to outperform (by a small margin) others consistently. The main negative is the high computational and memory cost of the approach.


The best smart home gadgets for 2025

Engadget

If it feels like every piece of home tech is now "smart," you're not far off. The smart home space has grown exponentially in the past few years to include speakers, cameras, locks, lights and even kitchen appliances. There are also different voice assistants and IoT standards to consider, all of which can make it confusing (to say the least) to build your smart home ecosystem from the ground up. Allow us at Engadget to help with that. We've tested dozens of smart home gadgets over the years and continue to test the latest offerings to see which work well and are worth your money. Before you even dive in, we recommend resisting the urge to outfit your whole home in one go.