PyTOD: Programmable Task-Oriented Dialogue with Execution Feedback
Coca, Alexandru; Tseng, Bo-Hsiang; Boothroyd, Pete; Cheng, Jianpeng; Gaynor, Mark; Zhang, Zhenxing; Stacey, Joe; Guigue, Tristan; Alonso, Héctor Martinez; Séaghdha, Diarmuid Ó; Johannsen, Anders
arXiv.org Artificial Intelligence
Programmable task-oriented dialogue (TOD) agents enable language models to follow structured dialogue policies, but their effectiveness hinges on accurate state tracking. We present PyTOD, an agent that generates executable code to track dialogue state and uses policy and execution feedback for efficient error correction. To this end, PyTOD employs a simple constrained decoding approach, using a language model instead of grammar rules to follow API schemata. This leads to state-of-the-art state tracking performance on the challenging SGD benchmark. Our experiments show that PyTOD surpasses strong baselines in both accuracy and robust user goal estimation as the dialogue progresses, demonstrating the effectiveness of execution-aware state tracking.
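The idea of tracking dialogue state by running generated code can be illustrated with a minimal sketch. It is not the PyTOD implementation: the `find_restaurants` API, the `execute_turn` helper, and the format of the feedback string are all hypothetical names chosen for illustration. The sketch only assumes that each turn yields a short Python snippet, that executing it updates the state, and that an execution error is returned as feedback rather than silently corrupting the state.

```python
from typing import Optional, Tuple


def find_restaurants(city: str, cuisine: str) -> dict:
    """Hypothetical API standing in for a schema-defined service call."""
    return {"city": city, "cuisine": cuisine}


def execute_turn(program: str, state: dict) -> Tuple[dict, Optional[str]]:
    """Run one generated snippet against the current state.

    Returns the updated state and a feedback string, which is None on success.
    On failure the previous state is returned unchanged.
    """
    # Expose the (hypothetical) API and the existing state variables to the snippet.
    env = {"find_restaurants": find_restaurants, **state}
    try:
        # Built-ins are disabled so the snippet can only call the exposed API.
        exec(program, {"__builtins__": {}}, env)
    except Exception as exc:
        # Execution feedback: the error could be handed back for correction.
        return state, f"{type(exc).__name__}: {exc}"
    new_state = {k: v for k, v in env.items() if k != "find_restaurants"}
    return new_state, None


# A well-formed snippet updates the state.
state, feedback = execute_turn(
    'search = find_restaurants(city="Cambridge", cuisine="italian")', {}
)

# A snippet that violates the API signature produces feedback instead.
state_after_error, error_feedback = execute_turn(
    'search = find_restaurants(town="Cambridge")', state
)
```

In this sketch a schema violation surfaces as a `TypeError` from the Python runtime, which is one simple way that execution can act as a check on the generated state update.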
Aug-22-2025
- Country:
- Asia
- Japan > Honshū
- Kansai > Kyoto Prefecture > Kyoto (0.04)
- Laos (0.04)
- Middle East > UAE
- Abu Dhabi Emirate > Abu Dhabi (0.04)
- Singapore (0.04)
- Thailand > Bangkok
- Bangkok (0.04)
- Europe
- Croatia > Dubrovnik-Neretva County
- Dubrovnik (0.04)
- Czechia > Prague (0.04)
- France (0.04)
- Italy (0.04)
- Middle East > Malta
- Eastern Region > Northern Harbour District > St. Julian's (0.04)
- United Kingdom > England
- Cambridgeshire > Cambridge (0.04)
- North America
- Canada > Ontario
- Toronto (0.04)
- Dominican Republic (0.04)
- Mexico > Mexico City
- Mexico City (0.04)
- United States
- California > San Diego County
- San Diego (0.04)
- Florida > Miami-Dade County
- Miami (0.04)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Washington > King County
- Seattle (0.04)
- Genre:
- Research Report (1.00)
- Technology: