Offline Reinforcement Learning for Mixture-of-Expert Dialogue Management

Oct-9-2024, 21:01:01 GMT–Neural Information Processing Systems

Reinforcement learning (RL) has shown great promise for developing dialogue management (DM) agents that are non-myopic, conduct rich conversations, and maximize overall user satisfaction. Despite advances in RL and language models (LMs), using RL to drive conversational chatbots still poses significant challenges. A primary issue is RL's dependence on online exploration for effective learning, a process that can be costly. Moreover, online interaction with humans during training can raise safety concerns, since the LM may generate unwanted outputs. These issues are exacerbated by the combinatorial action spaces the algorithms face, as most LM agents generate responses at the word level. We develop several RL algorithms, specialized for dialogue planning, that leverage recent Mixture-of-Expert Language Models (MoE-LMs): models that capture diverse semantics, generate utterances reflecting different intents, and are amenable to multi-turn DM.
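To make the action-space argument concrete, the following is a minimal illustrative sketch, not the paper's algorithm. It assumes a simplified setting in which each expert proposes one candidate utterance per turn, so the agent's action is just the index of an expert, and it learns from a fixed log of transitions with plain tabular Q-learning. The function names, the toy states, and the rewards are all hypothetical, and the sketch omits the corrections for distribution shift that practical offline RL methods typically need.

```python
from collections import defaultdict


def action_space_sizes(vocab_size, max_len, num_experts):
    """Compare per-turn action counts: word-level generation vs. picking an expert."""
    word_level = vocab_size ** max_len  # every token sequence of length max_len
    expert_level = num_experts          # assumption: one candidate utterance per expert
    return word_level, expert_level


def offline_q_learning(dataset, num_experts, gamma=0.9, lr=0.5, epochs=200):
    """Tabular Q-learning over logged transitions only (no environment interaction)."""
    q = defaultdict(lambda: [0.0] * num_experts)
    for _ in range(epochs):
        for state, action, reward, next_state, done in dataset:
            # Bootstrapped target; terminal transitions use the reward alone.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += lr * (target - q[state][action])
    return q


# Hypothetical logged dialogue data: (state, expert index, reward, next state, done).
# Expert 0 is imagined as "empathetic", expert 1 as "inquisitive".
dataset = [
    ("complaint", 0, 1.0, "end", True),
    ("complaint", 1, 0.0, "end", True),
    ("greeting", 1, 0.0, "complaint", False),
    ("greeting", 0, 0.0, "end", True),
]

word_level, expert_level = action_space_sizes(vocab_size=1000, max_len=10, num_experts=4)
q = offline_q_learning(dataset, num_experts=2)
best_for_greeting = max(range(2), key=lambda a: q["greeting"][a])
print(word_level, expert_level, best_for_greeting)
```

In this toy log, the learned values prefer the "inquisitive" expert at the greeting because it leads to a state where a later reward is available, which is the non-myopic behavior the abstract refers to. Selecting among a handful of experts also keeps the per-turn action count small, whereas the word-level count grows exponentially with utterance length.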

action space, mixture-of-expert dialogue management, offline reinforcement learning

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Reinforcement Learning (0.99)
  - Natural Language
    - Discourse & Dialogue (0.65)
    - Chatbot (0.63)