Park, Seohyun
Evaluating Large Language Models on Understanding Korean Indirect Speech Acts
Koo, Youngeun, Lee, Jiwoo, Park, Dojun, Park, Seohyun, Lee, Sungeun
Accurately understanding the intention of an utterance is crucial in conversational communication. As conversational artificial intelligence models are rapidly developed and applied across various fields, it is important to evaluate LLMs' ability to understand the intention behind a user's utterance. This study evaluates whether current LLMs can infer the intention of an utterance from the given conversational context, particularly in cases where the actual intention differs from the surface-level, literal meaning of the sentence, i.e., indirect speech acts. Our findings reveal that Claude3-Opus outperformed the competing models, scoring 71.94% on multiple-choice questions (MCQs) and 65% on open-ended questions (OEQs), a clear advantage. In general, proprietary models performed better than open-source models. Nevertheless, no LLM reached the level of human performance. Most LLMs, except for Claude3-Opus, performed significantly worse on indirect speech acts than on direct speech acts, where the intention is explicitly conveyed by the utterance. Beyond an overall pragmatic evaluation of each LLM's language use through the analysis of OEQ response patterns, this study underscores the need for further research to improve LLMs' understanding of indirect speech acts for more natural communication with humans.
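As a concrete illustration of the MCQ setup, the sketch below shows how accuracy could be tallied separately for direct and indirect speech-act items; the item schema and examples are hypothetical, not the authors' data or evaluation code.

```python
# A minimal sketch (not the authors' code) of scoring MCQ items by speech-act
# type; field names and example items are hypothetical.
items = [
    {"utterance": "Could you open the window?", "act_type": "indirect",
     "gold": "B", "model_answer": "B"},
    {"utterance": "Open the window.", "act_type": "direct",
     "gold": "A", "model_answer": "A"},
]

def accuracy(items, act_type=None):
    """Return percentage accuracy, optionally restricted to one act type."""
    subset = [i for i in items if act_type is None or i["act_type"] == act_type]
    correct = sum(i["model_answer"] == i["gold"] for i in subset)
    return 100.0 * correct / len(subset)

print(f"overall:  {accuracy(items):.2f}%")
print(f"direct:   {accuracy(items, 'direct'):.2f}%")
print(f"indirect: {accuracy(items, 'indirect'):.2f}%")
```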
MultiPragEval: Multilingual Pragmatic Evaluation of Large Language Models
Park, Dojun, Lee, Jiwoo, Park, Seohyun, Jeong, Hyeyun, Koo, Youngeun, Hwang, Soonha, Park, Seonwoo, Lee, Sungeun
As the capabilities of LLMs expand, it becomes increasingly important to evaluate them beyond basic knowledge assessment, focusing on higher-level language understanding. This study introduces MultiPragEval, a robust test suite designed for the multilingual pragmatic evaluation of LLMs across English, German, Korean, and Chinese. Comprising 1200 question units categorized according to Grice's Cooperative Principle and its four conversational maxims, MultiPragEval enables an in-depth assessment of LLMs' contextual awareness and their ability to infer implied meanings. Our findings demonstrate that Claude3-Opus significantly outperforms the other models in all tested languages, establishing the state of the art in the field. Among open-source models, Solar-10.7B and Qwen1.5-14B emerge as strong competitors. This study not only leads the way in the multilingual evaluation of LLMs' pragmatic inference but also provides valuable insights into the nuanced capabilities necessary for advanced language comprehension in AI systems.
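For illustration, tabulating scores per language and per Gricean maxim over MultiPragEval-style units might look like the sketch below; the unit schema is an assumption, not the released dataset format.

```python
from collections import defaultdict

# Hypothetical MultiPragEval-style units; the actual dataset schema may differ.
units = [
    {"lang": "ko", "maxim": "quantity", "gold": "C", "pred": "C"},
    {"lang": "de", "maxim": "relation", "gold": "A", "pred": "B"},
    {"lang": "en", "maxim": "manner",   "gold": "D", "pred": "D"},
]

# Tally (correct, total) per (language, maxim) pair.
scores = defaultdict(lambda: [0, 0])
for u in units:
    key = (u["lang"], u["maxim"])
    scores[key][0] += int(u["pred"] == u["gold"])
    scores[key][1] += 1

for (lang, maxim), (correct, total) in sorted(scores.items()):
    print(f"{lang} / {maxim}: {100 * correct / total:.1f}%")
```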
Enhancing Clinical Efficiency through LLM: Discharge Note Generation for Cardiac Patients
Jung, HyoJe, Kim, Yunha, Choi, Heejung, Seo, Hyeram, Kim, Minkyoung, Han, JiYe, Kee, Gaeun, Park, Seohyun, Ko, Soyoung, Kim, Byeolhee, Kim, Suyeon, Jun, Tae Joon, Kim, Young-Hak
Medical documentation, including discharge notes, is crucial for ensuring patient care quality, continuity, and effective medical communication. However, the manual creation of these documents is not only time-consuming but also prone to inconsistencies and potential errors. Automating this documentation process with artificial intelligence (AI) represents a promising area of innovation in healthcare. This study directly addresses the inefficiencies and inaccuracies of creating discharge notes manually, particularly for cardiac patients, by employing AI techniques, specifically large language models (LLMs). Utilizing a substantial dataset from a cardiology center, encompassing wide-ranging medical records and physician assessments, our research evaluates the capability of LLMs to enhance the documentation process. Among the various models assessed, Mistral-7B distinguished itself by accurately generating discharge notes that significantly improve both documentation efficiency and the continuity of care for patients. These notes underwent rigorous qualitative evaluation by medical experts, receiving high marks for their clinical relevance, completeness, readability, and contribution to informed decision-making and care planning. Coupled with quantitative analyses, these results confirm Mistral-7B's efficacy in distilling complex medical information into concise, coherent summaries. Overall, our findings illuminate the considerable promise of specialized LLMs, such as Mistral-7B, in refining healthcare documentation workflows and advancing patient care. This study lays the groundwork for further integration of advanced AI technologies in healthcare, demonstrating their potential to revolutionize patient documentation and support better care outcomes.
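As a rough sketch of how such a generation pipeline could be driven (the study's actual prompts, preprocessing, and any fine-tuning are not reproduced here), one might prompt an instruction-tuned Mistral-7B checkpoint via Hugging Face transformers; the model variant, prompt, and toy record below are illustrative assumptions.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical setup: an instruction-tuned Mistral-7B; the checkpoint and
# prompt wording are assumptions for illustration, not the study's pipeline.
model_id = "mistralai/Mistral-7B-Instruct-v0.2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

record = "Admission Dx: NSTEMI. PCI to LAD with DES. EF 45%. ..."  # toy record
messages = [{
    "role": "user",
    "content": "Write a concise discharge note for the following cardiology record:\n" + record,
}]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
output = model.generate(input_ids, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```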
Pragmatic Competence Evaluation of Large Language Models for Korean
Park, Dojun, Lee, Jiwoo, Jeong, Hyeyun, Park, Seohyun, Lee, Sungeun
The current evaluation of Large Language Models (LLMs) predominantly relies on benchmarks that test their embedded knowledge through multiple-choice questions (MCQs), a format inherently suited to automated evaluation. Our study extends this evaluation to explore LLMs' pragmatic competence, a facet largely unexamined before the advent of sophisticated LLMs, specifically in the context of Korean. We employ two distinct evaluation setups: the conventional MCQ format, adapted for automatic evaluation, and Open-Ended Questions (OEQs), assessed by human experts, to examine LLMs' narrative response capabilities without predefined options. Our findings reveal that GPT-4 excels, scoring 81.11 and 85.69 in the MCQ and OEQ setups, respectively, with HyperCLOVA X, an LLM optimized for Korean, following closely, especially in the OEQ setup, where it scores 81.56, a marginal 4.13 points behind GPT-4. Furthermore, while few-shot learning strategies generally enhance LLM performance, Chain-of-Thought (CoT) prompting introduces a bias toward literal interpretations, hindering accurate pragmatic inference. Given the growing expectation that LLMs understand and produce language aligned with human communicative norms, our findings emphasize the importance of advancing LLMs' abilities to grasp and convey sophisticated meanings beyond mere literal interpretations.
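To make the two prompting strategies concrete, a toy prompt builder could look like the following; the exemplar, field layout, and trigger phrase are hypothetical, not the study's actual templates.

```python
# Toy few-shot / Chain-of-Thought prompt builder; the exemplar and CoT trigger
# are illustrative assumptions, not the study's templates.
FEW_SHOT = [
    ("A says 'It's cold in here.' What is A's most likely intention?",
     "A is indirectly requesting that the window be closed."),
]

def build_prompt(question, shots=FEW_SHOT, cot=False):
    """Prepend few-shot exemplars; optionally append a CoT trigger."""
    parts = [f"Q: {q}\nA: {a}" for q, a in shots]
    # The study found that a step-by-step trigger can bias models toward
    # literal readings of the utterance.
    trigger = "Let's think step by step. " if cot else ""
    parts.append(f"Q: {question}\nA: {trigger}")
    return "\n\n".join(parts)

print(build_prompt("B says 'Lovely weather!' during a storm. What does B mean?", cot=True))
```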
German Phoneme Recognition with Text-to-Phoneme Data Augmentation
Park, Dojun, Park, Seohyun
In this study, we examined the effect of adding the n most frequent phoneme bigrams to the base vocabulary of a German phoneme recognition model, using a text-to-phoneme data augmentation strategy. Compared to the baseline, the vowel30 and const20 models improved the BLEU score by more than 1 point, while the total30 model showed a significant drop of more than 20 points, indicating that phoneme bigrams can affect model performance either positively or negatively. In addition, through error analysis we identified the types of errors that the models repeatedly made.
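A minimal sketch of the augmentation idea follows, assuming a toy phoneme-transcribed corpus; the actual corpus, tokenizer, and recognition model are not shown, and the merged-token format is an assumption.

```python
from collections import Counter

# Toy phoneme-transcribed corpus; the real training data is assumed.
corpus = [
    ["d", "a", "s", "h", "aʊ", "s"],
    ["ʃ", "oː", "n", "ə"],
]

# Count phoneme bigrams across the corpus.
bigram_counts = Counter()
for seq in corpus:
    bigram_counts.update(zip(seq, seq[1:]))

# Merge the n most frequent bigrams into the base vocabulary,
# e.g., n = 30 for vowel30/total30-style settings.
n = 30
new_tokens = ["".join(bg) for bg, _ in bigram_counts.most_common(n)]
base_vocab = sorted({ph for seq in corpus for ph in seq})
vocab = base_vocab + [t for t in new_tokens if t not in base_vocab]
print(vocab)
```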