Rule-Based Moral Principles for Explaining Uncertainty in Natural Language Generation
As large language models (LLMs) are increasingly used in high-stakes applications, the challenge of explaining uncertainty in natural language generation has become both a technical and moral imperative. Traditional approaches rely on probabilistic methods that are often opaque, difficult to interpret, and misaligned with human expectations of transparency and accountability. In response to these limitations, this paper introduces a novel framework based on rule-based moral principles (simple, human-inspired ethical guidelines) for responding to uncertainty in LLM-generated text. Drawing on insights from experimental moral psychology and virtue ethics, we define a set of symbolic behavioral rules such as precaution, deference, and responsibility to guide system responses under conditions of epistemic or aleatoric uncertainty. These rules are implemented declaratively and are designed to generate adaptive, context-sensitive explanations even in the absence of precise confidence metrics. The moral principles are encoded as symbolic rules within a lightweight Prolog-based engine, where each uncertainty tag (low, medium, high) activates an ethically aligned system action along with an automatically generated, plain-language rationale. We evaluate the framework through scenario-based simulations that benchmark rule coverage, assess fairness implications, and analyze trust calibration. An interpretive explanation module is integrated to reveal both the assigned uncertainty level and its underlying justification in a transparent and accessible way. We illustrate the framework through hypothetical yet plausible use cases in clinical and legal domains, demonstrating how rule-based moral reasoning can enhance user trust, promote fairness, and improve the interpretability of AI-generated language. By offering a lightweight, philosophically grounded alternative to probabilistic uncertainty modeling, our approach paves the way for more ethical, human-aligned, and socially responsible natural language generation.
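The tag-to-action mapping described in this abstract can be pictured with a minimal sketch. The abstract reports a Prolog-based engine; the Python below is only an illustration, and the pairing of tags with principles, the action names, and the rationale wording are all assumptions rather than the authors' rules.

```python
from dataclasses import dataclass

# Hypothetical rule table. The abstract names the tags (low, medium, high) and
# the principles (precaution, deference, responsibility); which tag triggers
# which principle, and the action names, are assumptions for illustration.
RULES = {
    "low": ("responsibility", "answer",
            "Uncertainty is low, so the system answers directly."),
    "medium": ("deference", "answer_with_caveat",
               "Uncertainty is moderate, so the system answers and defers to expert review."),
    "high": ("precaution", "abstain_and_refer",
             "Uncertainty is high, so the system withholds an answer and refers the user onward."),
}


@dataclass(frozen=True)
class Decision:
    tag: str
    principle: str
    action: str
    rationale: str


def decide(tag: str) -> Decision:
    """Look up the action and plain-language rationale for an uncertainty tag."""
    key = tag.strip().lower()
    if key not in RULES:
        raise ValueError(f"unknown uncertainty tag: {tag!r}")
    principle, action, rationale = RULES[key]
    return Decision(key, principle, action, rationale)
```

Calling `decide("high")` returns a `Decision` whose `rationale` field can be shown to the user alongside the chosen action, which is the role the abstract assigns to the explanation module.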
PyTOD: Programmable Task-Oriented Dialogue with Execution Feedback
Coca, Alexandru, Tseng, Bo-Hsiang, Boothroyd, Pete, Cheng, Jianpeng, Gaynor, Mark, Zhang, Zhenxing, Stacey, Joe, Guigue, Tristan, Alonso, Héctor Martinez, Séaghdha, Diarmuid Ó, Johannsen, Anders
Programmable task-oriented dialogue (TOD) agents enable language models to follow structured dialogue policies, but their effectiveness hinges on accurate state tracking. We present PyTOD, an agent that generates executable code to track dialogue state and uses policy and execution feedback for efficient error correction. To this end, PyTOD employs a simple constrained decoding approach, using a language model instead of grammar rules to follow API schemata. This leads to state-of-the-art state tracking performance on the challenging SGD benchmark. Our experiments show that PyTOD surpasses strong baselines in both accuracy and robust user goal estimation as the dialogue progresses, demonstrating the effectiveness of execution-aware state tracking.
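As a rough illustration of state tracking with execution feedback, the sketch below applies a generated API call to a dialogue state and returns an error message when the call does not match the schema. The schema, the `execute` function, and the feedback strings are hypothetical and are not taken from PyTOD.

```python
from typing import Dict, Optional, Tuple

# Hypothetical API schema: API name -> allowed slot names.
SCHEMA = {"find_restaurant": {"city", "cuisine"}}

State = Dict[str, Dict[str, str]]


def execute(call_name: str, kwargs: Dict[str, str], state: State) -> Tuple[State, Optional[str]]:
    """Apply one generated API call to the state, or return feedback on failure."""
    if call_name not in SCHEMA:
        return state, f"unknown API: {call_name}"
    unknown = set(kwargs) - SCHEMA[call_name]
    if unknown:
        return state, f"unknown slots for {call_name}: {sorted(unknown)}"
    # Merge the new slot values into a copy of the state for this API.
    merged = {**state.get(call_name, {}), **kwargs}
    return {**state, call_name: merged}, None
```

In a setup like this, a non-empty feedback string could be passed back to the language model so that it regenerates the call; how PyTOD actually uses its policy and execution feedback is not specified in the abstract.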
Reasoning about Actual Causes in Nondeterministic Domains -- Extended Version
Khan, Shakil M., Lespérance, Yves, Rostamigiv, Maryam
Reasoning about the causes behind observations is crucial to the formalization of rationality. While extensive research has been conducted on root cause analysis, most studies have focused on deterministic settings. In this paper, we investigate causation in more realistic nondeterministic domains, where the agent has no control over, and may not know, the choices made by the environment. We build on recent preliminary work on actual causation in the nondeterministic situation calculus to formalize more sophisticated forms of reasoning about actual causes in such domains. We investigate the notions of "Certainly Causes" and "Possibly Causes" that enable the representation of actual cause for agent actions in these domains. We then show how regression in the situation calculus can be extended to reason about such notions of actual causes.
PerSHOP -- A Persian dataset for shopping dialogue systems modeling
Mahmoudi, Keyvan, Faili, Heshaam
Dialogue systems are now used in many fields of industry and research, with successful examples such as Apple Siri, Google Assistant, and IBM Watson. Task-oriented dialogue systems are a category of these systems designed for specific tasks, such as booking plane tickets or making restaurant reservations. Shopping is one of the most popular application areas for these systems: the bot replaces the human salesperson and interacts with customers by speaking. Training the models behind these systems requires annotated data. In this paper, we develop a dataset of Persian-language dialogues through crowd-sourcing and annotate these dialogues for model training. The dataset contains nearly 22k utterances across 1061 dialogues in 15 different domains. It is the largest Persian dataset in this field and is provided freely so that future researchers can use it. We also propose baseline models for two natural language understanding (NLU) tasks: intent classification and entity extraction. The F1 score is around 91% for intent classification and around 93% for entity extraction, which can serve as a baseline for future research.
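For readers unfamiliar with the metric, the sketch below computes an exact-match F1 score for entity extraction over sets of labeled spans. The abstract does not say how its F1 scores are averaged or matched, so the span representation and the exact-match criterion here are assumptions.

```python
from typing import Set, Tuple

# An entity is represented as (start, end, label); this layout is an assumption.
Entity = Tuple[int, int, str]


def entity_f1(predicted: Set[Entity], gold: Set[Entity]) -> float:
    """Exact-match F1: harmonic mean of precision and recall over entity spans."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)
```

With one of two predicted entities correct and one of two gold entities recovered, precision and recall are both 0.5, so the F1 score is also 0.5.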
Abstraction of Nondeterministic Situation Calculus Action Theories -- Extended Version
Banihashemi, Bita, De Giacomo, Giuseppe, Lespérance, Yves
We develop a general framework for abstracting the behavior of an agent that operates in a nondeterministic domain, i.e., where the agent does not control the outcome of the nondeterministic actions, based on the nondeterministic situation calculus and the ConGolog programming language. We assume that we have both an abstract and a concrete nondeterministic basic action theory, and a refinement mapping which specifies how abstract actions, decomposed into agent actions and environment reactions, are implemented by concrete ConGolog programs. This new setting supports strategic reasoning and strategy synthesis, by allowing us to quantify separately on agent actions and environment reactions. We show that if the agent has a (strong FOND) plan/strategy to achieve a goal/complete a task at the abstract level, and it can always execute the nondeterministic abstract actions to completion at the concrete level, then there exists a refinement of it that is a (strong FOND) plan/strategy to achieve the refinement of the goal/task at the concrete level.
Zero-Shot Generalizable End-to-End Task-Oriented Dialog System using Context Summarization and Domain Schema
Mosharrof, Adib, Maqbool, M. H., Siddique, A. B.
Task-oriented dialog systems empower users to accomplish their goals by facilitating intuitive and expressive natural language interactions. State-of-the-art approaches in task-oriented dialog systems formulate the problem as a conditional sequence generation task and fine-tune pre-trained causal language models in the supervised setting. This requires labeled training data for each new domain or task, and acquiring such data is prohibitively laborious and expensive, making it a bottleneck for scaling systems to a wide range of domains. To overcome this challenge, we introduce a novel Zero-Shot generalizable end-to-end Task-oriented Dialog system, ZS-ToD, that leverages domain schemas to allow for robust generalization to unseen domains and exploits effective summarization of the dialog history. We employ GPT-2 as a backbone model and introduce a two-step training process where the goal of the first step is to learn the general structure of the dialog data and the second step optimizes the response generation as well as intermediate outputs, such as dialog state and system actions. As opposed to state-of-the-art systems that are trained to fulfill certain intents in the given domains and memorize task-specific conversational patterns, ZS-ToD learns generic task-completion skills by comprehending domain semantics via domain schemas and generalizing to unseen domains seamlessly. We conduct an extensive experimental evaluation on the SGD and SGD-X datasets, which span up to 20 unique domains, and ZS-ToD outperforms state-of-the-art systems on key metrics, with an improvement of +17% on joint goal accuracy and +5 on inform. Additionally, we present a detailed ablation study to demonstrate the effectiveness of the proposed components and training mechanism.
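Joint goal accuracy, one of the metrics reported above, is commonly defined as the fraction of turns whose predicted dialog state matches the gold state on every slot. The sketch below uses strict exact matching; benchmark evaluation scripts may apply value normalization or fuzzy matching, so it should not be read as the evaluation used in this paper.

```python
from typing import Dict, List

State = Dict[str, str]  # slot name -> value for one dialog turn


def joint_goal_accuracy(predicted: List[State], gold: List[State]) -> float:
    """Fraction of turns where the whole predicted state equals the gold state."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must cover the same turns")
    if not gold:
        return 0.0
    correct = sum(1 for p, g in zip(predicted, gold) if p == g)
    return correct / len(gold)
```

A turn counts as correct only if all of its slots are right, so a single wrong value (for example a wrong date next to a correct city) makes the whole turn incorrect.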
AnyTOD: A Programmable Task-Oriented Dialog System
Zhao, Jeffrey, Cao, Yuan, Gupta, Raghav, Lee, Harrison, Rastogi, Abhinav, Wang, Mingqiu, Soltau, Hagen, Shafran, Izhak, Wu, Yonghui
We propose AnyTOD, an end-to-end, zero-shot task-oriented dialog (TOD) system capable of handling unseen tasks without task-specific training. We view TOD as a program executed by a language model (LM), where program logic and ontology are provided by a designer as a schema. To enable generalization to unseen schemas and programs without prior training, AnyTOD adopts a neuro-symbolic approach. A neural LM keeps track of events occurring during a conversation, and a symbolic program implementing the dialog policy is executed to recommend the next actions AnyTOD should take. This approach drastically reduces data annotation and model training requirements, addressing the enduring challenge of rapidly adapting a TOD system to unseen tasks and domains. We demonstrate state-of-the-art results on the STAR, ABCD and SGD benchmarks. We also demonstrate strong zero-shot transfer ability in low-resource settings, such as zero-shot on MultiWOZ. In addition, we release STARv2, an updated version of the STAR dataset with richer annotations, for benchmarking zero-shot end-to-end TOD models.
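The division of labor described above, where a neural model tracks events and a symbolic program recommends actions, can be sketched as follows. The event names, the booking task, and the `recommend_actions` function are hypothetical and only illustrate the idea of a designer-written policy program.

```python
from typing import List, Set

# Hypothetical policy for a booking task: each required slot maps to the
# system action that requests it.
REQUIRED = {"date": "ask_date", "time": "ask_time"}


def recommend_actions(events: Set[str]) -> List[str]:
    """Recommend next system actions from the events tracked so far."""
    asks = [action for slot, action in REQUIRED.items()
            if f"user_informed_{slot}" not in events]
    if asks:
        return asks
    if "user_confirmed" not in events:
        return ["ask_confirmation"]
    return ["book"]
```

In this sketch the set of events would be produced by the language model, while the policy itself stays a plain program that a designer can change without retraining.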
Doc2Bot: Accessing Heterogeneous Documents via Conversational Bots
Fu, Haomin, Zhang, Yeqin, Yu, Haiyang, Sun, Jian, Huang, Fei, Si, Luo, Li, Yongbin, Nguyen, Cam-Tu
This paper introduces Doc2Bot, a novel dataset for building machines that help users seek information via conversations. This is of particular interest for companies and organizations that own a large number of manuals or instruction books. Despite its potential, the nature of our task poses several challenges: (1) documents contain various structures that hinder the ability of machines to comprehend, and (2) user information needs are often underspecified. Compared to prior datasets that either focus on a single structural type or overlook the role of questioning to uncover user needs, the Doc2Bot dataset is developed to target such challenges systematically. Our dataset contains over 100,000 turns based on Chinese documents from five domains, larger than any prior document-grounded dialog dataset for information seeking. We propose three tasks in Doc2Bot: (1) dialog state tracking to track user intentions, (2) dialog policy learning to plan system actions and contents, and (3) response generation which generates responses based on the outputs of the dialog policy. Baseline methods based on the latest deep learning models are presented, indicating that our proposed tasks are challenging and worthy of further research.
Learning Interpretable Latent Dialogue Actions With Less Supervision
Hudeček, Vojtěch, Dušek, Ondřej
We present a novel architecture for explainable modeling of task-oriented dialogues with discrete latent variables to represent dialogue actions. Our model is based on variational recurrent neural networks (VRNN) and requires no explicit annotation of semantic information. Unlike previous works, our approach models the system and user turns separately and performs database query modeling, which makes the model applicable to task-oriented dialogues while producing easily interpretable action latent variables. We show that our model outperforms previous approaches with less supervision in terms of perplexity and BLEU on three datasets, and we propose a way to measure dialogue success without the need for expert annotation. Finally, we propose a novel way to explain semantics of the latent variables with respect to system actions.
GenTUS: Simulating User Behaviour and Language in Task-oriented Dialogues with Generative Transformers
Lin, Hsien-Chin, Geishauser, Christian, Feng, Shutong, Lubis, Nurul, van Niekerk, Carel, Heck, Michael, Gašić, Milica
User simulators (USs) are commonly used to train task-oriented dialogue systems (DSs) via reinforcement learning. The interactions often take place at the semantic level for efficiency, but there is still a gap between semantic actions and natural language, which causes a mismatch between the training and deployment environments. Incorporating a natural language generation (NLG) module with USs during training can partly deal with this problem. However, since the policy and NLG of USs are optimised separately, these simulated user utterances may not be natural enough in a given context. In this work, we propose a generative transformer-based user simulator (GenTUS). GenTUS consists of an encoder-decoder structure, which means it can optimise both the user policy and natural language generation jointly. GenTUS generates both semantic actions and natural language utterances, preserving interpretability and enhancing language variation. In addition, by representing the inputs and outputs as word sequences and by using a large pre-trained language model, we can achieve generalisability in feature representation. We evaluate GenTUS with automatic metrics and human evaluation. Our results show that GenTUS generates more natural language and is able to transfer to an unseen ontology in a zero-shot fashion. In addition, its behaviour can be further shaped with reinforcement learning, opening the door to training specialised user simulators.