Lee, Gary Geunbae
Semantic Parsing with Candidate Expressions for Knowledge Base Question Answering
Nam, Daehwan, Lee, Gary Geunbae
Semantic parsers convert natural language to logical forms, which can be evaluated on knowledge bases (KBs) to produce denotations. Recent semantic parsers have been developed with sequence-to-sequence (seq2seq) pre-trained language models (PLMs) or large language models, where the models treat logical forms as sequences of tokens. For syntactic and semantic validity, the semantic parsers use grammars that enable constrained decoding. However, the grammars lack the ability to utilize large information of KBs, although logical forms contain representations of KB elements, such as entities or relations. In this work, we propose a grammar augmented with candidate expressions for semantic parsing on a large KB with a seq2seq PLM. The grammar defines actions as production rules, and our semantic parser predicts actions during inference under the constraints by types and candidate expressions. We apply the grammar to knowledge base question answering, where the constraints by candidate expressions assist a semantic parser to generate valid KB elements. In experiments on two benchmarks, KQA Pro and Overnight, the constraints by candidate expressions increased the accuracy of our semantic parser, whether it was trained with strong supervision or weak supervision. Our semantic parser achieved state-of-the-art accuracies on KQA Pro and Overnight, and its implementation is publicly available at https://github.com/daehwannam/candexpr-sp.git.
Cross-lingual Transfer for Automatic Question Generation by Learning Interrogative Structures in Target Languages
Hwang, Seonjeong, Kim, Yunsu, Lee, Gary Geunbae
Automatic question generation (QG) serves a wide range of purposes, such as augmenting question-answering (QA) corpora, enhancing chatbot systems, and developing educational materials. Despite its importance, most existing datasets predominantly focus on English, resulting in a considerable gap in data availability for other languages. Cross-lingual transfer for QG (XLT-QG) addresses this limitation by allowing models trained on high-resource language datasets to generate questions in low-resource languages. In this paper, we propose a simple and efficient XLT-QG method that operates without the need for monolingual, parallel, or labeled data in the target language, utilizing a small language model. Our model, trained solely on English QA datasets, learns interrogative structures from a limited set of question exemplars, which are then applied to generate questions in the target language. Experimental results show that our method outperforms several XLT-QG baselines and achieves performance comparable to GPT-3.5-turbo across different languages. Additionally, the synthetic data generated by our model proves beneficial for training multilingual QA models. With significantly fewer parameters than large language models and without requiring additional training for target languages, our approach offers an effective solution for QG and QA tasks across various languages.
Cross-lingual Back-Parsing: Utterance Synthesis from Meaning Representation for Zero-Resource Semantic Parsing
Kang, Deokhyung, Hwang, Seonjeong, Kim, Yunsu, Lee, Gary Geunbae
Recent efforts have aimed to utilize multilingual pretrained language models (mPLMs) to extend semantic parsing (SP) across multiple languages without requiring extensive annotations. However, achieving zero-shot cross-lingual transfer for SP remains challenging, leading to a performance gap between source and target languages. In this study, we propose Cross-Lingual Back-Parsing (CBP), a novel data augmentation methodology designed to enhance cross-lingual transfer for SP. Leveraging the representation geometry of the mPLMs, CBP synthesizes target language utterances from source meaning representations. Our methodology effectively performs cross-lingual data augmentation in challenging zero-resource settings, by utilizing only labeled data in the source language and monolingual corpora. Extensive experiments on two cross-language SP benchmarks (Mschema2QA and Xspider) demonstrate that CBP brings substantial gains in the target language. Further analysis of the synthesized utterances shows that our method successfully generates target language utterances with high slot value alignment rates while preserving semantic integrity. Our codes and data are publicly available at https://github.com/deokhk/CBP.
Autoregressive Multi-trait Essay Scoring via Reinforcement Learning with Scoring-aware Multiple Rewards
Do, Heejin, Ryu, Sangwon, Lee, Gary Geunbae
Recent advances in automated essay scoring (AES) have shifted towards evaluating multiple traits to provide enriched feedback. Like typical AES systems, multi-trait AES employs the quadratic weighted kappa (QWK) to measure agreement with human raters, aligning closely with the rating schema; however, its non-differentiable nature prevents its direct use in neural network training. In this paper, we propose Scoring-aware Multi-reward Reinforcement Learning (SaMRL), which integrates actual evaluation schemes into the training process by designing QWK-based rewards with a mean-squared error penalty for multi-trait AES. Existing reinforcement learning (RL) applications in AES are limited to classification models despite associated performance degradation, as RL requires probability distributions; instead, we adopt an autoregressive score generation framework to leverage token generation probabilities for robust multi-trait score predictions. Empirical analyses demonstrate that SaMRL facilitates model training, notably enhancing scoring of previously inferior prompts.
Key-Element-Informed sLLM Tuning for Document Summarization
Ryu, Sangwon, Do, Heejin, Kim, Yunsu, Lee, Gary Geunbae, Ok, Jungseul
Remarkable advances in large language models (LLMs) have enabled high-quality text summarization. However, this capability is currently accessible only through LLMs of substantial size or proprietary LLMs with usage fees. In response, smaller-scale LLMs (sLLMs) of easy accessibility and low costs have been extensively studied, yet they often suffer from missing key information and entities, i.e., low relevance, in particular, when input documents are long. We hence propose a key-element-informed instruction tuning for summarization, so-called KEITSum, which identifies key elements in documents and instructs sLLM to generate summaries capturing these key elements. Experimental results on dialogue and news datasets demonstrate that sLLM with KEITSum indeed provides high-quality summarization with higher relevance and less hallucinations, competitive to proprietary LLM.
Multi-Dimensional Optimization for Text Summarization via Reinforcement Learning
Ryu, Sangwon, Do, Heejin, Kim, Yunsu, Lee, Gary Geunbae, Ok, Jungseul
The evaluation of summary quality encompasses diverse dimensions such as consistency, coherence, relevance, and fluency. However, existing summarization methods often target a specific dimension, facing challenges in generating well-balanced summaries across multiple dimensions. In this paper, we propose multi-objective reinforcement learning tailored to generate balanced summaries across all four dimensions. We introduce two multi-dimensional optimization (MDO) strategies for adaptive learning: 1) MDO_min, rewarding the current lowest dimension score, and 2) MDO_pro, optimizing multiple dimensions similar to multi-task learning, resolves conflicting gradients across dimensions through gradient projection. Unlike prior ROUGE-based rewards relying on reference summaries, we use a QA-based reward model that aligns with human preferences. Further, we discover the capability to regulate the length of summaries by adjusting the discount factor, seeking the generation of concise yet informative summaries that encapsulate crucial points. Our approach achieved substantial performance gains compared to baseline models on representative summarization datasets, particularly in the overlooked dimensions.
Adversarial DPO: Harnessing Harmful Data for Reducing Toxicity with Minimal Impact on Coherence and Evasiveness in Dialogue Agents
Kim, San, Lee, Gary Geunbae
Recent advancements in open-domain dialogue systems have been propelled by the emergence of high-quality large language models (LLMs) and various effective training methodologies. Nevertheless, the presence of toxicity within these models presents a significant challenge that can potentially diminish the user experience. In this study, we introduce an innovative training algorithm, an improvement upon direct preference optimization (DPO), called adversarial DPO (ADPO). The ADPO algorithm is designed to train models to assign higher probability distributions to preferred responses and lower distributions to unsafe responses, which are self-generated using the toxic control token. We demonstrate that ADPO enhances the model's resilience against harmful conversations while minimizing performance degradation. Furthermore, we illustrate that ADPO offers a more stable training procedure compared to the traditional DPO. To the best of our knowledge, this is the first adaptation of the DPO algorithm that directly incorporates harmful data into the generative model, thereby reducing the need to artificially create safe dialogue data.
Leveraging the Interplay Between Syntactic and Acoustic Cues for Optimizing Korean TTS Pause Formation
Jeon, Yejin, Kim, Yunsu, Lee, Gary Geunbae
Contemporary neural speech synthesis models have indeed demonstrated remarkable proficiency in synthetic speech generation as they have attained a level of quality comparable to that of human-produced speech. Nevertheless, it is important to note that these achievements have predominantly been verified within the context of high-resource languages such as English. Furthermore, the Tacotron and FastSpeech variants show substantial pausing errors when applied to the Korean language, which affects speech perception and naturalness. In order to address the aforementioned issues, we propose a novel framework that incorporates comprehensive modeling of both syntactic and acoustic cues that are associated with pausing patterns. Remarkably, our framework possesses the capability to consistently generate natural speech even for considerably more extended and intricate out-of-domain (OOD) sentences, despite its training on short audio clips. Architectural design choices are validated through comparisons with baseline models and ablation studies using subjective and objective metrics, thus confirming model performance.
Explainable Multi-hop Question Generation: An End-to-End Approach without Intermediate Question Labeling
Hwang, Seonjeong, Kim, Yunsu, Lee, Gary Geunbae
In response to the increasing use of interactive artificial intelligence, the demand for the capacity to handle complex questions has increased. Multi-hop question generation aims to generate complex questions that requires multi-step reasoning over several documents. Previous studies have predominantly utilized end-to-end models, wherein questions are decoded based on the representation of context documents. However, these approaches lack the ability to explain the reasoning process behind the generated multi-hop questions. Additionally, the question rewriting approach, which incrementally increases the question complexity, also has limitations due to the requirement of labeling data for intermediate-stage questions. In this paper, we introduce an end-to-end question rewriting model that increases question complexity through sequential rewriting. The proposed model has the advantage of training with only the final multi-hop questions, without intermediate questions. Experimental results demonstrate the effectiveness of our model in generating complex questions, particularly 3- and 4-hop questions, which are appropriately paired with input answers. We also prove that our model logically and incrementally increases the complexity of questions, and the generated multi-hop questions are also beneficial for training question answering models.
Denoising Table-Text Retrieval for Open-Domain Question Answering
Kang, Deokhyung, Jung, Baikjin, Kim, Yunsu, Lee, Gary Geunbae
In table-text open-domain question answering, a retriever system retrieves relevant evidence from tables and text to answer questions. Previous studies in table-text open-domain question answering have two common challenges: firstly, their retrievers can be affected by false-positive labels in training datasets; secondly, they may struggle to provide appropriate evidence for questions that require reasoning across the table. To address these issues, we propose Denoised Table-Text Retriever (DoTTeR). Our approach involves utilizing a denoised training dataset with fewer false positive labels by discarding instances with lower question-relevance scores measured through a false positive detection model. Subsequently, we integrate table-level ranking information into the retriever to assist in finding evidence for questions that demand reasoning across the table. To encode this ranking information, we fine-tune a rank-aware column encoder to identify minimum and maximum values within a column. Experimental results demonstrate that DoTTeR significantly outperforms strong baselines on both retrieval recall and downstream QA tasks. Our code is available at https://github.com/deokhk/DoTTeR.