Concept -- An Evaluation Protocol on Conversational Recommender Systems with System-centric and User-centric Factors

Huang, Chen, Qin, Peixin, Deng, Yang, Lei, Wenqiang, Lv, Jiancheng, Chua, Tat-Seng

May-6-2024–arXiv.org Artificial Intelligence

Conversational recommender systems (CRS) have been criticized for their user experience in real-world scenarios, despite significant recent progress in academia. Existing evaluation protocols for CRS may prioritize system-centric factors, such as effectiveness and conversational fluency, while neglecting user-centric aspects. We therefore propose a new and inclusive evaluation protocol, Concept, which integrates both system- and user-centric factors. We conceptualize three key characteristics that represent these factors and further divide them into six primary abilities. To implement Concept, we adopt an LLM-based user simulator and evaluator with scoring rubrics tailored to each primary ability. Concept serves a dual purpose. First, it provides an overview of the pros and cons of current CRS models. Second, it pinpoints the problem of low usability in the "omnipotent" ChatGPT and offers a comprehensive reference guide for evaluating CRS, thereby laying the foundation for CRS improvement.

Country:
- North America > United States
  - New York > New York County > New York City (0.04)
- Europe
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
  - Italy > Tuscany
    - Florence (0.04)
- Asia
  - Singapore (0.04)
  - Indonesia > Bali (0.04)
  - Myanmar > Tanintharyi Region
    - Dawei (0.04)
  - Middle East > UAE
    - Abu Dhabi Emirate > Abu Dhabi (0.04)

Genre:
- Research Report > New Finding (1.00)
- Overview (1.00)

Industry:
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
- Law (0.92)
- Information Technology (0.67)
- Government (0.67)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning
    - Personal Assistant Systems (1.00)
    - Agents (0.92)
  - Natural Language
    - Large Language Model (1.00)
    - Chatbot (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.89)
