AITopics | Kim, Takyoung

Kim, Takyoung

Premise-Augmented Reasoning Chains Improve Error Identification in Math reasoning with LLMs

Mukherjee, Sagnik, Chinta, Abhinav, Kim, Takyoung, Sharma, Tarun Anoop, Hakkani-Tür, Dilek

arXiv.org Artificial Intelligence, Feb-12-2025

Chain-of-Thought (CoT) prompting enhances mathematical reasoning in large language models (LLMs) by enabling detailed step-by-step solutions. However, due to the verbosity of LLMs, the resulting reasoning chains can be long, making it harder to verify the reasoning steps and to trace issues arising from dependencies between steps that may be far apart in the sequence. Importantly, mathematical reasoning allows each step to be derived from a small set of premises, which are a subset of the preceding steps in the reasoning chain. In this paper, we present a framework that identifies the premises for each step, to improve the evaluation of reasoning. We restructure conventional linear reasoning chains into Premise-Augmented Reasoning Chains (PARC) by introducing premise links, resulting in a directed acyclic graph where the nodes are the steps and the edges are the premise links. Through experiments with a PARC-based dataset that we built, namely PERL (Premises and ERrors identification in LLMs), we demonstrate that LLMs can reliably identify premises within complex reasoning chains. In particular, even open-source LLMs achieve 90% recall in premise identification. We also show that PARC helps to identify errors in reasoning chains more reliably. The accuracy of error identification improves by 6% to 16% absolute when step-by-step verification is carried out in PARC under the premises. Our findings highlight the utility of premise-centric representations in addressing complex problem-solving tasks and open new avenues for improving the reliability of LLM-based reasoning evaluations.
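As a rough illustration of the structure the abstract describes, the sketch below stores each reasoning step together with the indices of the earlier steps it depends on, which yields a directed acyclic graph by construction. The names `Step`, `build_parc`, `verification_context`, and `premise_recall` are hypothetical and do not come from the paper; the recall function is the standard set-based definition, assumed here to match the reported metric.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One reasoning step plus the indices of the earlier steps it relies on."""
    text: str
    premises: list[int] = field(default_factory=list)


def build_parc(steps):
    """Map each step index to its premise indices.

    Premises must point to strictly earlier steps, so the graph cannot
    contain a cycle.
    """
    graph = {}
    for i, step in enumerate(steps):
        for p in step.premises:
            if not 0 <= p < i:
                raise ValueError(f"step {i} has invalid premise index {p}")
        graph[i] = sorted(set(step.premises))
    return graph


def verification_context(steps, graph, i):
    """Texts of the premises a verifier would see when checking step i."""
    return [steps[p].text for p in graph[i]]


def premise_recall(predicted, gold):
    """Fraction of gold premises that appear among the predicted premises."""
    gold_set = set(gold)
    if not gold_set:
        return 1.0
    return len(set(predicted) & gold_set) / len(gold_set)


# Toy chain (illustrative data, not taken from PERL).
steps = [
    Step("x + 2 = 5"),
    Step("x = 3", premises=[0]),
    Step("y = 2x"),
    Step("y = 6", premises=[1, 2]),
]
graph = build_parc(steps)
context = verification_context(steps, graph, 3)  # ["x = 3", "y = 2x"]
recall = premise_recall(predicted=[1], gold=[1, 2])  # 0.5
```

Checking step 3 against only its two premises, rather than against the full chain, is the kind of premise-restricted verification the abstract refers to; how the premises are predicted in practice is left out of this sketch.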

large language model, machine learning, natural language, (14 more...)

arXiv:2502.02362

Country:

North America > United States (0.14)
North America > Canada (0.14)
Asia > Thailand (0.14)

Genre:

Workflow (1.00)
Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Learning to Explore and Select for Coverage-Conditioned Retrieval-Augmented Generation

Kim, Takyoung, Lee, Kyungjae, Jang, Young Rok, Cho, Ji Yong, Kim, Gangwoo, Cho, Minseok, Lee, Moontae

arXiv.org Artificial Intelligence, Jul-1-2024

Interactions with billion-scale large language models typically yield long-form responses due to their extensive parametric capacities, along with retrieval-augmented features. While detailed responses provide insightful viewpoints on a specific subject, they frequently contain redundant and less engaging content that does not meet user interests. In this work, we focus on the role of query outlining (i.e., a selected sequence of queries) in scenarios where users request a specific range of information, namely coverage-conditioned ($C^2$) scenarios. To simulate $C^2$ scenarios, we construct QTree, 10K sets of information-seeking queries decomposed with various perspectives on certain topics. By utilizing QTree, we train QPlanner, a 7B language model generating customized query outlines that follow coverage-conditioned queries. We analyze the effectiveness of generated outlines through automatic and human evaluation, targeting retrieval-augmented generation (RAG). Moreover, the experimental results demonstrate that QPlanner with alignment training can further provide outlines satisfying diverse user interests. Our resources are available at https://github.com/youngerous/qtree.
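To make the idea of a query outline concrete, here is a small sketch of a tree of decomposed queries and an outline chosen under a coverage condition. It is only a toy: `QueryNode`, `flatten`, and `select_outline` are hypothetical names, and the keyword filter is a deliberately simple stand-in for the trained QPlanner model, whose actual behavior is not specified in the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class QueryNode:
    """A query with the more specific sub-queries it decomposes into."""
    query: str
    children: list["QueryNode"] = field(default_factory=list)


def flatten(node):
    """Depth-first list of every query below `node` (the root topic is skipped)."""
    queries = []
    for child in node.children:
        queries.append(child.query)
        queries.extend(flatten(child))
    return queries


def select_outline(root, exclude_terms, max_queries=3):
    """Toy planner: keep queries that avoid the excluded terms, in tree order.

    This keyword filter is an assumption made for illustration only.
    """
    lowered = [term.lower() for term in exclude_terms]
    kept = [
        q for q in flatten(root)
        if not any(term in q.lower() for term in lowered)
    ]
    return kept[:max_queries]


# Illustrative topic tree (made-up data).
root = QueryNode("electric vehicles", [
    QueryNode("How do EV batteries work?", [
        QueryNode("What limits battery lifespan?"),
    ]),
    QueryNode("What do EVs cost to own?", [
        QueryNode("Which purchase subsidies exist?"),
    ]),
])

# Coverage condition: the user does not want cost-related content.
outline = select_outline(root, exclude_terms=["cost", "subsid"])
```

The resulting outline is the ordered list of queries that a retrieval-augmented generator would then answer one by one.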

large language model, machine learning, natural language, (19 more...)

arXiv:2407.01158

Country:

Europe (1.00)
North America > United States > Illinois (0.28)

Genre: Research Report > New Finding (0.66)

Industry:

Media > Film (0.68)
Leisure & Entertainment (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

DSTEA: Improving Dialogue State Tracking via Entity Adaptive Pre-training

Lee, Yukyung, Kim, Takyoung, Yoon, Hoonsang, Kang, Pilsung, Bang, Junseong, Kim, Misuk

arXiv.org Artificial Intelligence, Jul-23-2023

Dialogue State Tracking (DST) is critical for comprehensively interpreting user and system utterances, thereby forming the cornerstone of efficient dialogue systems. Although past research has focused on enhancing DST performance through alterations to the model structure or by integrating additional features like graph relations, these approaches often require additional pre-training with external dialogue corpora. In this study, we propose DSTEA, improving Dialogue State Tracking via Entity Adaptive pre-training, which can enhance the encoder by intensively training key entities in dialogue utterances. DSTEA identifies these pivotal entities from input dialogues utilizing four different methods: ontology information, named-entity recognition, spaCy, and the flair library. Subsequently, it employs selective knowledge masking to train the model effectively. Remarkably, DSTEA only requires pre-training without the direct infusion of extra knowledge into the DST model. This approach resulted in substantial performance improvements of four robust DST models on MultiWOZ 2.0, 2.1, and 2.2, with joint goal accuracy witnessing an increase of up to 2.69 percentage points (from 52.41% to 55.10%). Further validation of DSTEA's efficacy was provided through comparative experiments considering various entity types and different entity adaptive pre-training configurations such as masking strategy and masking rate.
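The abstract does not spell out how selective knowledge masking chooses tokens, so the following is only one plausible reading: under a fixed masking rate, positions that belong to identified entities are masked before any other position. The function name `selective_mask`, its parameters, and the entity-first rule are all assumptions made for illustration.

```python
import random


def selective_mask(tokens, entity_positions, masking_rate=0.15,
                   mask_token="[MASK]", seed=0):
    """Mask roughly `masking_rate` of the tokens, preferring entity positions.

    Assumed rule (not confirmed by the source): entity positions fill the
    masking budget first; other positions are used only if budget remains.
    """
    rng = random.Random(seed)
    budget = max(1, round(len(tokens) * masking_rate))

    entity_set = set(entity_positions)
    entities = sorted(entity_set)
    others = [i for i in range(len(tokens)) if i not in entity_set]
    rng.shuffle(entities)
    rng.shuffle(others)

    chosen = set((entities + others)[:budget])
    masked = [mask_token if i in chosen else tok
              for i, tok in enumerate(tokens)]
    return masked, sorted(chosen)


# Toy utterance; the entity positions would come from an ontology lookup or
# an NER tagger in practice, and are hard-coded here.
tokens = "i need a cheap hotel in cambridge".split()
entity_positions = {3, 4, 6}  # "cheap", "hotel", "cambridge"
masked, chosen = selective_mask(tokens, entity_positions, masking_rate=0.3)
```

With seven tokens and a rate of 0.3 the budget is two positions, so both masked tokens are drawn from the three entity positions; which two depends on the seed.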

computational linguistic, machine learning, natural language, (18 more...)

arXiv:2207.03858

Country:

Europe (0.67)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.87)

KoSBi: A Dataset for Mitigating Social Bias Risks Towards Safer Large Language Model Application

Lee, Hwaran, Hong, Seokhee, Park, Joonsuk, Kim, Takyoung, Kim, Gunhee, Ha, Jung-Woo

arXiv.org Artificial Intelligence, May-29-2023

Large language models (LLMs) learn not only natural text generation abilities but also social biases against different demographic groups from real-world data. This poses a critical risk when deploying LLM-based applications. Existing research and resources are not readily applicable in South Korea due to the differences in language and culture, both of which significantly affect the biases and targeted demographic groups. This limitation requires localized social bias datasets to ensure the safe and effective deployment of LLMs. To this end, we present KoSBi, a new social bias dataset of 34k pairs of contexts and sentences in Korean covering 72 demographic groups in 15 categories. We find that through filtering-based moderation, social biases in generated content can be reduced by 16.47%p on average for HyperCLOVA (30B and 82B) and GPT-3.
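A minimal sketch of what filtering-based moderation could look like, assuming it means sampling several candidate generations per prompt and keeping the first one that a safety classifier accepts. The functions `first_safe` and `unsafe_rate`, the placeholder classifier `toy_is_unsafe`, and the sample data are all hypothetical; a real setup would use a classifier trained on a dataset such as KoSBi.

```python
from typing import Callable, Iterable, Optional


def first_safe(candidates: Iterable[str],
               is_unsafe: Callable[[str], bool]) -> Optional[str]:
    """Return the first candidate the classifier accepts, or None if all fail."""
    for text in candidates:
        if not is_unsafe(text):
            return text
    return None


def unsafe_rate(texts, is_unsafe) -> float:
    """Percentage of outputs flagged unsafe; None (nothing emitted) counts as safe."""
    texts = list(texts)
    flagged = sum(1 for t in texts if t is not None and is_unsafe(t))
    return 100.0 * flagged / len(texts)


def toy_is_unsafe(text: str) -> bool:
    """Placeholder classifier: flags text carrying an explicit marker."""
    return "[BIASED]" in text


# Made-up candidate generations, two per prompt.
samples_per_prompt = [
    ["[BIASED] sentence 1", "neutral sentence 1"],
    ["neutral sentence 2", "[BIASED] sentence 2"],
    ["[BIASED] sentence 3", "[BIASED] sentence 4"],
    ["neutral sentence 3", "neutral sentence 4"],
]

unfiltered = [cands[0] for cands in samples_per_prompt]
filtered = [first_safe(cands, toy_is_unsafe) for cands in samples_per_prompt]

before = unsafe_rate(unfiltered, toy_is_unsafe)
after = unsafe_rate(filtered, toy_is_unsafe)
reduction_in_points = before - after  # difference in percentage points (%p)
```

The reduction is expressed as a difference between two percentages, which is the sense of the "%p" unit used in the abstract.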

computational linguistic, machine learning, natural language, (18 more...)

arXiv:2305.17701

Country:

Asia > South Korea (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Government (0.67)
Law (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

SQuARe: A Large-Scale Dataset of Sensitive Questions and Acceptable Responses Created Through Human-Machine Collaboration

Lee, Hwaran, Hong, Seokhee, Park, Joonsuk, Kim, Takyoung, Cha, Meeyoung, Choi, Yejin, Kim, Byoung Pil, Kim, Gunhee, Lee, Eun-Ju, Lim, Yong, Oh, Alice, Park, Sangchul, Ha, Jung-Woo

arXiv.org Artificial Intelligence, May-28-2023

The potential social harms that large language models pose, such as generating offensive content and reinforcing biases, are steeply rising. Existing works focus on coping with this concern while interacting with ill-intentioned users, such as those who explicitly make hate speech or elicit harmful responses. However, discussions on sensitive issues can become toxic even if the users are well-intentioned. For safer models in such scenarios, we present the Sensitive Questions and Acceptable Responses (SQuARe) dataset, a large-scale Korean dataset of 49k sensitive questions with 42k acceptable and 46k non-acceptable responses. The dataset was constructed leveraging HyperCLOVA in a human-in-the-loop manner based on real news headlines. Experiments show that acceptable response generation significantly improves for HyperCLOVA and GPT-3, demonstrating the efficacy of this dataset.

computational linguistic, machine learning, natural language, (19 more...)

arXiv:2305.17696

Country:

Asia > South Korea (1.00)
Asia > Middle East > UAE (0.14)
North America > United States > California (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Government (1.00)
Education (1.00)
Media > News (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Revealing User Familiarity Bias in Task-Oriented Dialogue via Interactive Evaluation

Kim, Takyoung, Shin, Jamin, Kim, Young-Ho, Bae, Sanghwan, Kim, Sungdong

arXiv.org Artificial Intelligence, May-23-2023

Most task-oriented dialogue (TOD) benchmarks assume that users know exactly how to use the system, constraining user behaviors to the system's capabilities via strict user goals; we call this "user familiarity" bias. This data bias deepens when combined with data-driven TOD systems, as its effect cannot be measured with existing static evaluations. Hence, we conduct an interactive user study to unveil how vulnerable TOD systems are against realistic scenarios. In particular, we compare users with 1) detailed goal instructions that conform to the system boundaries (closed-goal) and 2) vague goal instructions that are often unsupported but realistic (open-goal). Our study reveals that conversations in open-goal settings lead to catastrophic failures of the system, in which 92% of the dialogues had significant issues. Moreover, we conduct a thorough analysis to identify distinctive features between the two settings through error annotation. From this, we discover a novel "pretending" behavior, in which the system pretends to handle the user requests even though they are beyond the system's capabilities. We discuss its characteristics and toxicity while emphasizing transparency and a fallback strategy for robust TOD systems.

computational linguistic, machine learning, natural language, (19 more...)

arXiv:2305.13857

Country:

North America > United States (1.00)
Europe (1.00)
Asia (0.68)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry: Consumer Products & Services > Restaurants (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
