AITopics | Lee, Hanwool

Plotting

Lee, Hanwool

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

(G)I-DLE: Generative Inference via Distribution-preserving Logit Exclusion with KL Divergence Minimization for Constrained Decoding

Lee, Hanwool

arXiv.org Artificial IntelligenceMar-23-2025

We propose (G)I-DLE, a new approach to constrained decoding that leverages KL divergence minimization to preserve the intrinsic conditional probability distribution of autoregressive language models while excluding undesirable tokens. Unlike conventional methods that naively set banned tokens' logits to $-\infty$, which can distort the conversion from raw logits to posterior probabilities and increase output variance, (G)I-DLE re-normalizes the allowed token probabilities to minimize such distortion. We validate our method on the K2-Eval dataset, specifically designed to assess Korean language fluency, logical reasoning, and cultural appropriateness. Experimental results on Qwen2.5 models (ranging from 1.5B to 14B) demonstrate that G-IDLE not only boosts mean evaluation scores but also substantially reduces the variance of output quality.

large language model, natural language, qwen2, (16 more...)

arXiv.org Artificial Intelligence

2503.1805

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.37)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.30)

Add feedback

TWICE: What Advantages Can Low-Resource Domain-Specific Embedding Model Bring? - A Case Study on Korea Financial Texts

Hwang, Yewon, Jung, Sungbum, Lee, Hanwool, Yu, Sara

arXiv.org Artificial IntelligenceFeb-10-2025

Domain specificity of embedding models is critical for effective performance. However, existing benchmarks, such as FinMTEB, are primarily designed for high-resource languages, leaving low-resource settings, such as Korean, under-explored. Directly translating established English benchmarks often fails to capture the linguistic and cultural nuances present in low-resource domains. In this paper, titled TWICE: What Advantages Can Low-Resource Domain-Specific Embedding Models Bring? A Case Study on Korea Financial Texts, we introduce KorFinMTEB, a novel benchmark for the Korean financial domain, specifically tailored to reflect its unique cultural characteristics in low-resource languages. Our experimental results reveal that while the models perform robustly on a translated version of FinMTEB, their performance on KorFinMTEB uncovers subtle yet critical discrepancies, especially in tasks requiring deeper semantic understanding, that underscore the limitations of direct translation. This discrepancy highlights the necessity of benchmarks that incorporate language-specific idiosyncrasies and cultural nuances. The insights from our study advocate for the development of domain-specific evaluation frameworks that can more accurately assess and drive the progress of embedding models in low-resource settings.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2502.07131

Genre: Research Report (0.85)

Industry: Banking & Finance (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.73)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

ML-Promise: A Multilingual Dataset for Corporate Promise Verification

Seki, Yohei, Shu, Hakusen, Lhuissier, Anaïs, Lee, Hanwool, Kang, Juyeon, Day, Min-Yuh, Chen, Chung-Chi

arXiv.org Artificial IntelligenceNov-7-2024

Promises made by politicians, corporate leaders, and public figures have a significant impact on public perception, trust, and institutional reputation. However, the complexity and volume of such commitments, coupled with difficulties in verifying their fulfillment, necessitate innovative methods for assessing their credibility. This paper introduces the concept of Promise Verification, a systematic approach involving steps such as promise identification, evidence assessment, and the evaluation of timing for verification. We propose the first multilingual dataset, ML-Promise, which includes English, French, Chinese, Japanese, and Korean, aimed at facilitating in-depth verification of promises, particularly in the context of Environmental, Social, and Governance (ESG) reports. Given the growing emphasis on corporate environmental contributions, this dataset addresses the challenge of evaluating corporate promises, especially in light of practices like greenwashing. Our findings also explore textual and image-based baselines, with promising results from retrieval-augmented generation (RAG) approaches. This work aims to foster further discourse on the accountability of public commitments across multiple languages and domains.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2411.04473

Country:

Asia > Japan (0.16)
Asia > Taiwan (0.15)
North America > United States (0.15)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry: Social Sector (0.72)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.36)

Add feedback

KMMLU: Measuring Massive Multitask Language Understanding in Korean

Son, Guijin, Lee, Hanwool, Kim, Sungdong, Kim, Seungone, Muennighoff, Niklas, Choi, Taekyoon, Park, Cheonbok, Yoo, Kang Min, Biderman, Stella

arXiv.org Artificial IntelligenceJun-6-2024

We propose KMMLU, a new Korean benchmark with 35,030 expert-level multiple-choice questions across 45 subjects ranging from humanities to STEM. While prior Korean benchmarks are translated from existing English benchmarks, KMMLU is collected from original Korean exams, capturing linguistic and cultural aspects of the Korean language. We test 27 public and proprietary LLMs and observe the best public model to score 50.5%, leaving significant room for improvement. This model was primarily trained for English and Chinese, not Korean. Current LLMs tailored to Korean, such as Polyglot-Ko, perform far worse. Surprisingly, even the most capable proprietary LLMs, e.g., GPT-4 and HyperCLOVA X do not exceed 60%. This suggests that further work is needed to improve LLMs for Korean, and we believe KMMLU offers the appropriate tool to track this progress. We make our dataset publicly available on the Hugging Face Hub and integrate the benchmark into EleutherAI's Language Model Evaluation Harness.

arxiv preprint arxiv, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2402.11548

Country: Asia (0.28)

Genre: Research Report > New Finding (0.67)

Industry: Education (0.87)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Add feedback

HAE-RAE Bench: Evaluation of Korean Knowledge in Language Models

Son, Guijin, Lee, Hanwool, Kim, Suwan, Kim, Huiseo, Lee, Jaecheol, Yeom, Je Won, Jung, Jihyu, Kim, Jung Woo, Kim, Songseong

arXiv.org Artificial IntelligenceSep-23-2023

Large Language Models (LLMs) trained on massive corpora demonstrate impressive capabilities in a wide range of tasks. While there are ongoing efforts to adapt these models to languages beyond English, the attention given to their evaluation methodologies remains limited. Current multilingual benchmarks often rely on back translations or re-implementations of English tests, limiting their capacity to capture unique cultural and linguistic nuances. To bridge this gap for the Korean language, we introduce HAE-RAE Bench, a dataset curated to challenge models lacking Korean cultural and contextual depth. The dataset encompasses six downstream tasks across four domains: vocabulary, history, general knowledge, and reading comprehension. Contrary to traditional evaluation suites focused on token or sequence classification and specific mathematical or logical reasoning, HAE-RAE Bench emphasizes a model's aptitude for recalling Korean-specific knowledge and cultural contexts. Comparative analysis with prior Korean benchmarks indicates that the HAE-RAE Bench presents a greater challenge to non-native models, by disturbing abilities and knowledge learned from English being transferred.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2309.02706

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Industry: Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

EaSyGuide : ESG Issue Identification Framework leveraging Abilities of Generative Large Language Models

Lee, Hanwool, Choi, Jonghyun, Kwon, Sohyeon, Jung, Sungbum

arXiv.org Artificial IntelligenceJun-13-2023

This paper presents our participation in the FinNLP-2023 shared task on multi-lingual environmental, social, and corporate governance issue identification (ML-ESG). The task's objective is to classify news articles based on the 35 ESG key issues defined by the MSCI ESG rating guidelines. Our approach focuses on the English and French subtasks, employing the CerebrasGPT, OPT, and Pythia models, along with the zero-shot and GPT3Mix Augmentation techniques. We utilize various encoder models, such as RoBERTa, DeBERTa, and FinBERT, subjecting them to knowledge distillation and additional training. Our approach yielded exceptional results, securing the first position in the English text subtask with F1-score 0.69 and the second position in the French text subtask with F1-score 0.78. These outcomes underscore the effectiveness of our methodology in identifying ESG issues in news articles across different languages. Our findings contribute to the exploration of ESG topics and highlight the potential of leveraging advanced language models for ESG issue identification.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2306.06662

Country: Asia > Middle East > UAE (0.14)

Genre: Research Report > New Finding (0.48)

Industry: Law (0.66)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Removing Non-Stationary Knowledge From Pre-Trained Language Models for Entity-Level Sentiment Classification in Finance

Son, Guijin, Lee, Hanwool, Kang, Nahyeon, Hahm, Moonjeong

arXiv.org Artificial IntelligenceJan-24-2023

Extraction of sentiment signals from news text, stock message boards, and business reports, for stock movement prediction, has been a rising field of interest in finance. Building upon past literature, the most recent works attempt to better capture sentiment from sentences with complex syntactic structures by introducing aspect-level sentiment classification (ASC). Despite the growing interest, however, fine-grained sentiment analysis has not been fully explored in non-English literature due to the shortage of annotated finance-specific data. Accordingly, it is necessary for non-English languages to leverage datasets and pre-trained language models (PLM) of different domains, languages, and tasks to best their performance. To facilitate finance-specific ASC research in the Korean language, we build KorFinASC, a Korean aspect-level sentiment classification dataset for finance consisting of 12,613 human-annotated samples, and explore methods of intermediate transfer learning. Our experiments indicate that past research has been ignorant towards the potentially wrong knowledge of financial entities encoded during the training phase, which has overestimated the predictive power of PLMs. In our work, we use the term "non-stationary knowledge'' to refer to information that was previously correct but is likely to change, and present "TGT-Masking'', a novel masking pattern to restrict PLMs from speculating knowledge of the kind. Finally, through a series of transfer learning with TGT-Masking applied we improve 22.63% of classification accuracy compared to standalone models on KorFinASC.

artificial intelligence, natural language, text classification, (16 more...)

arXiv.org Artificial Intelligence

2301.03136

Country: Asia > China (0.28)

Genre: Research Report (1.00)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.91)

Add feedback