



Multi-Objective Intrinsic Reward Learning for Conversational Recommender Systems

Neural Information Processing Systems

Conversational Recommender Systems (CRS) actively elicit user preferences to generate adaptive recommendations. Mainstream reinforcement learning-based CRS solutions heavily rely on handcrafted reward functions, which may not be aligned with user intent in CRS tasks.


A Related Work

Neural Information Processing Systems

When these weighting functions output a constant, we can infer that the cost function is a linear transformation of AUC.

W AUC. The idea of weighting thresholds in AUC is first described by [...]

Bilevel optimization is a classical problem class in operations research.

B.1 Main Idea of Experiments

Our experiments mainly explore the following three problems:

Traditional AUC is inconsistent with cost-related metrics and cannot be used in cost-sensitive learning scenarios. From the experimental results in our paper, we can see that most AUC optimization methods do not minimize the misclassification cost. Ultimately, the misclassification cost of the resulting decision is not acceptable.
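The statement that AUC can disagree with misclassification cost can be illustrated with a small sketch. This is not the paper's method: the labels, scores, and cost values (`c_fn`, `c_fp`) below are hypothetical toy data, and the helper names `auc` and `min_cost` are assumptions introduced only for illustration. It compares two scorers by pairwise-ranking AUC and by the lowest cost achievable over all decision thresholds.

```python
def auc(labels, scores):
    """Pairwise-ranking AUC: fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # A tie counts as half a correctly ranked pair.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def min_cost(labels, scores, c_fn, c_fp):
    """Lowest total cost over all thresholds t, predicting positive when score >= t."""
    # Every distinct score is a candidate threshold; +inf means "predict nothing positive".
    thresholds = sorted(set(scores)) + [float("inf")]
    best = float("inf")
    for t in thresholds:
        fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < t)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t)
        best = min(best, c_fn * fn + c_fp * fp)
    return best


# Hypothetical toy data: four positives followed by four negatives.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores_a = [0.9, 0.8, 0.7, 0.2, 0.6, 0.5, 0.4, 0.3]  # one positive ranked last
scores_b = [0.7, 0.6, 0.5, 0.4, 0.9, 0.8, 0.3, 0.2]  # two negatives ranked first

# Assumed costs: a missed positive is five times as expensive as a false alarm.
C_FN, C_FP = 5, 1

print("A: AUC =", auc(labels, scores_a), "min cost =", min_cost(labels, scores_a, C_FN, C_FP))
print("B: AUC =", auc(labels, scores_b), "min cost =", min_cost(labels, scores_b, C_FN, C_FP))
```

On this toy data, scorer A has the higher AUC (0.75 versus 0.5) but also the higher minimum cost (4 versus 2), because its single badly ranked positive can only be recovered by accepting every negative. This is one simple way the two criteria can rank models differently under asymmetric costs.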