Goto

Collaborating Authors: Pine, James


REGEN: A Dataset and Benchmarks with Natural Language Critiques and Narratives

arXiv.org Artificial Intelligence

This paper introduces REGEN (Reviews Enhanced with GEnerative Narratives), a novel dataset designed to benchmark the conversational capabilities of recommender Large Language Models (LLMs), addressing the limitations of existing datasets that focus primarily on sequential item prediction. REGEN extends the Amazon Product Reviews dataset by inpainting two key natural language features: (1) user critiques, representing user "steering" queries that lead to the selection of a subsequent item, and (2) narratives, rich textual outputs associated with each recommended item that take prior context into account. The narratives include product endorsements, purchase explanations, and summaries of user preferences. Further, we establish an end-to-end modeling benchmark for the task of conversational recommendation, where models are trained to generate both recommendations and corresponding narratives conditioned on user history (items and critiques). For this joint task, we introduce LUMEN (LLM-based Unified Multi-task Model with Critiques, Recommendations, and Narratives), a modeling framework that uses an LLM as a backbone for critiquing, retrieval, and generation. We also evaluate the dataset's quality using standard auto-rating techniques and benchmark it by training both traditional and LLM-based recommender models. Our results demonstrate that incorporating critiques enhances recommendation quality by enabling the recommender to learn language understanding and integrate it with recommendation signals. Furthermore, LLMs trained on our dataset effectively generate both recommendations and contextual narratives, achieving performance comparable to state-of-the-art recommenders and language models.
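
To make the joint "recommend and narrate" task concrete, here is a minimal sketch of how a REGEN-style training example (user history of items and critiques, with the next item and its narrative as the target) might be flattened into a prompt/target pair for an LLM. The field names, prompt template, and example data are illustrative assumptions, not the dataset's actual schema.

```python
# Hypothetical serialization of a REGEN-like example for a joint
# recommendation + narrative generation task. Field names and the
# prompt format are assumptions for illustration only.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HistoryStep:
    item_title: str              # item the user interacted with
    critique: Optional[str]      # natural-language "steering" query before the next item


@dataclass
class RegenExample:
    history: List[HistoryStep]
    next_item_title: str         # ground-truth recommended item
    narrative: str               # endorsement / explanation / preference summary


def to_prompt_and_target(ex: RegenExample) -> tuple[str, str]:
    """Serialize user history (items and critiques) into a text prompt,
    with the next item and its narrative as the generation target."""
    lines = []
    for step in ex.history:
        lines.append(f"item: {step.item_title}")
        if step.critique:
            lines.append(f"critique: {step.critique}")
    prompt = "\n".join(lines) + "\nrecommend:"
    target = f"{ex.next_item_title}\nnarrative: {ex.narrative}"
    return prompt, target


if __name__ == "__main__":
    ex = RegenExample(
        history=[
            HistoryStep("Acme Trail Running Shoes", "something with more ankle support"),
            HistoryStep("Acme Hiking Boots", None),
        ],
        next_item_title="Acme Waterproof Hiking Boots",
        narrative="Given your shift toward supportive, trail-ready footwear, these boots ...",
    )
    prompt, target = to_prompt_and_target(ex)
    print(prompt)
    print("---")
    print(target)
```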


Beyond Retrieval: Generating Narratives in Conversational Recommender Systems

arXiv.org Artificial Intelligence

Recent advances in Large Language Models' generation and reasoning capabilities present an opportunity to develop truly conversational recommendation systems. However, effectively integrating recommender system knowledge into LLMs for natural language generation tailored to recommendation tasks remains a challenge. This paper addresses this challenge by making two key contributions. First, we introduce a new dataset (REGEN) for natural language generation tasks in conversational recommendation. REGEN (Reviews Enhanced with GEnerative Narratives) extends the Amazon Product Reviews dataset with rich user narratives, including personalized explanations of product preferences, product endorsements for recommended items, and summaries of user purchase history. REGEN is made publicly available to facilitate further research. Furthermore, we establish benchmarks using well-known generative metrics and perform an automated evaluation of the new dataset using a rater LLM. Second, the paper introduces a fusion architecture (a CF model with an LLM) that serves as a baseline for REGEN and, to the best of our knowledge, represents the first attempt to analyze the capabilities of LLMs in understanding recommender signals and generating rich narratives. We demonstrate that LLMs can effectively learn from simple fusion architectures utilizing interaction-based CF embeddings, and that this can be further enhanced using the metadata and personalization data associated with items. Our experiments show that combining CF and content embeddings leads to improvements of 4-12% in key language metrics compared to using either type of embedding individually. We also provide an analysis to interpret how CF and content embeddings contribute to this new generative task.
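
As a rough illustration of the kind of fusion baseline described above, the sketch below projects a collaborative-filtering (CF) item embedding and a content/metadata embedding into an LLM's token-embedding space and prepends them to the text prompt as "soft prompt" vectors. The dimensions, module names, and prepending strategy are assumptions for illustration, not the paper's exact architecture.

```python
# Hypothetical CF + content embedding fusion into an LLM input sequence.
# All shapes and design choices are illustrative assumptions.
import torch
import torch.nn as nn


class CFContentFusion(nn.Module):
    def __init__(self, cf_dim: int, content_dim: int, llm_dim: int):
        super().__init__()
        self.cf_proj = nn.Linear(cf_dim, llm_dim)           # maps CF embedding into LLM space
        self.content_proj = nn.Linear(content_dim, llm_dim)  # maps content embedding into LLM space

    def forward(self, cf_emb: torch.Tensor, content_emb: torch.Tensor,
                token_embs: torch.Tensor) -> torch.Tensor:
        """Prepend one projected CF vector and one projected content vector
        to the sequence of prompt token embeddings.

        cf_emb:      (batch, cf_dim)
        content_emb: (batch, content_dim)
        token_embs:  (batch, seq_len, llm_dim)  -- embedded text prompt
        returns:     (batch, seq_len + 2, llm_dim)
        """
        cf_tok = self.cf_proj(cf_emb).unsqueeze(1)
        content_tok = self.content_proj(content_emb).unsqueeze(1)
        return torch.cat([cf_tok, content_tok, token_embs], dim=1)


if __name__ == "__main__":
    fusion = CFContentFusion(cf_dim=64, content_dim=256, llm_dim=1024)
    out = fusion(torch.randn(2, 64), torch.randn(2, 256), torch.randn(2, 10, 1024))
    print(out.shape)  # torch.Size([2, 12, 1024])
```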


Minimizing Live Experiments in Recommender Systems: User Simulation to Evaluate Preference Elicitation Policies

arXiv.org Artificial Intelligence

Evaluation of policies in recommender systems typically involves A/B testing using live experiments on real users to assess a new policy's impact on relevant metrics. This "gold standard" comes at a high cost, however, in terms of cycle time, user cost, and potential impact on user retention. In developing policies for "onboarding" new users, these costs can be especially problematic, since onboarding occurs only once. In this work, we describe a simulation methodology used to augment (and reduce) the use of live experiments. We illustrate its deployment for the evaluation of "preference elicitation" algorithms used to onboard new users of the YouTube Music platform. By developing counterfactually robust user behavior models, and a simulation service that couples such models with production infrastructure, we are able to test new algorithms in a way that reliably predicts their performance on key metrics when deployed live. We describe our domain, our simulation models and platform, results of experiments and deployment, and suggest future steps needed to further establish realistic simulation as a powerful complement to live experiments.
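
The sketch below shows, in heavily simplified form, the shape of an offline simulation loop like the one described above: a learned user-behavior model stands in for live users so that candidate preference-elicitation policies can be compared on a proxy metric before any A/B test. The user model, policy interface, categories, and metric here are illustrative assumptions, not the production system or the paper's models.

```python
# Hypothetical simulation loop for comparing preference-elicitation policies
# against a simulated user model. Everything here is a simplified stand-in.
import random
from typing import Callable, List

CATEGORIES = ["rock", "jazz", "pop", "classical", "hip-hop"]


class SimulatedUser:
    """Toy stand-in for a learned user-behavior model: the user has latent
    preferred categories and responds positively with higher probability
    when an elicitation prompt matches them."""
    def __init__(self, rng: random.Random):
        self.preferences = set(rng.sample(CATEGORIES, k=2))
        self.rng = rng

    def responds_positively(self, shown_category: str) -> bool:
        p = 0.8 if shown_category in self.preferences else 0.1
        return self.rng.random() < p


def random_policy(remaining: List[str], rng: random.Random) -> str:
    """Baseline policy: ask about a uniformly random remaining category."""
    return rng.choice(remaining)


def evaluate_policy(policy: Callable[[List[str], random.Random], str],
                    n_users: int = 10_000, n_questions: int = 3,
                    seed: int = 0) -> float:
    """Average number of positive responses per simulated onboarding session."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_users):
        user = SimulatedUser(rng)
        remaining = list(CATEGORIES)
        for _ in range(n_questions):
            category = policy(remaining, rng)
            remaining.remove(category)
            total += user.responds_positively(category)
    return total / n_users


if __name__ == "__main__":
    print(f"random policy: {evaluate_policy(random_policy):.3f} positives/session")
```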