AITopics | Lim, Woohyung

Collaborating Authors

Lim, Woohyung

EXAONE 3.5: Series of Large Language Models for Real-world Use Cases

LG AI Research, An, Soyoung, Bae, Kyunghoon, Choi, Eunbi, Choi, Kibong, Choi, Stanley Jungkyu, Hong, Seokhee, Hwang, Junwon, Jeon, Hyojin, Jo, Gerrard Jeongwon, Jo, Hyunjik, Jung, Jiyeon, Jung, Yountae, Kim, Hyosang, Kim, Joonkee, Kim, Seonghwan, Kim, Soyeon, Kim, Sunkyoung, Kim, Yireun, Kim, Yongil, Kim, Youchul, Lee, Edward Hwayoung, Lee, Haeju, Lee, Honglak, Lee, Jinsik, Lee, Kyungmin, Lim, Woohyung, Park, Sangha, Park, Sooyoun, Park, Yongmin, Yang, Sihoon, Yeen, Heuiyeen, Yun, Hyeongu

arXiv.org Artificial Intelligence, Dec-9-2024

This technical report introduces the EXAONE 3.5 instruction-tuned language models, developed and released by LG AI Research. The EXAONE 3.5 language models are offered in three configurations: 32B, 7.8B, and 2.4B. These models feature several standout capabilities: 1) exceptional instruction following capabilities in real-world scenarios, achieving the highest scores across seven benchmarks, 2) outstanding long-context comprehension, attaining the top performance in four benchmarks, and 3) competitive results compared to state-of-the-art open models of similar sizes across nine general benchmarks. The EXAONE 3.5 language models are open to anyone for research purposes and can be downloaded from https://huggingface.co/LGAI-EXAONE. For commercial use, please reach out to the official contact point of LG AI Research: contact_us@lgresearch.ai.

exaone 3, large language model, machine learning, (19 more...)

arXiv: 2412.04862

Country:

Europe > France (0.14)
North America > United States (0.14)
Asia > Middle East (0.14)
(4 more...)

Genre:

Overview (0.46)
Research Report (0.40)

Industry:

Law (1.00)
Government (1.00)
Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Binning as a Pretext Task: Improving Self-Supervised Learning in Tabular Domains

Lee, Kyungeun, Sim, Ye Seul, Cho, Hye-Seung, Eo, Moonjung, Yoon, Suhee, Yoon, Sanghyu, Lim, Woohyung

arXiv.org Artificial Intelligence, May-13-2024

The ability of deep networks to learn superior representations hinges on leveraging the proper inductive biases, considering the inherent properties of datasets. In tabular domains, it is critical to effectively handle heterogeneous features (both categorical and numerical) in a unified manner and to grasp irregular functions like piecewise constant functions. To address these challenges in the self-supervised learning framework, we propose a novel pretext task based on the classical binning method. The idea is straightforward: reconstructing the bin indices (either orders or classes) rather than the original values. This pretext task provides the encoder with an inductive bias to capture the irregular dependencies, mapping from continuous inputs to discretized bins, and mitigates the feature heterogeneity by setting all features to have category-type targets. Our empirical investigations identify several advantages of binning: capturing the irregular function, compatibility with encoder architecture and additional modifications, standardizing all features into equal sets, grouping similar values within a feature, and providing ordering information. Comprehensive evaluations across diverse tabular datasets corroborate that our method consistently improves tabular representation learning performance for a wide range of downstream tasks. The code is available at https://github.com/kyungeun-lee/tabularbinning.
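The target construction described in this abstract can be illustrated with a minimal sketch. It only shows how bin indices could replace raw values as reconstruction targets; the equal-frequency (quantile) binning, the helper names `quantile_edges` and `bin_indices`, and the sample column are assumptions for illustration, not the authors' implementation.

```python
from bisect import bisect_right


def quantile_edges(values, n_bins):
    """Return n_bins - 1 interior cut points at equal-frequency positions."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(k * n) // n_bins] for k in range(1, n_bins)]


def bin_indices(values, edges):
    """Map each value to the index of the bin it falls into (0 .. len(edges))."""
    return [bisect_right(edges, v) for v in values]


# Hypothetical numerical feature column (not from the paper).
ages = [22, 35, 47, 51, 29, 63, 38, 44]

edges = quantile_edges(ages, n_bins=4)
# Pretext targets: the encoder would be trained to predict these indices
# instead of reconstructing the raw values.
targets = bin_indices(ages, edges)
```

Because every feature ends up with integer targets in the same range, numerical and categorical columns can share one category-type objective, which is the unification the abstract points to.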

artificial intelligence, deep learning, machine learning, (14 more...)

arXiv: 2405.07414

Country: Europe > Austria > Vienna (0.14)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.46)
Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Can We Utilize Pre-trained Language Models within Causal Discovery Algorithms?

Lee, Chanhui, Kim, Juhyeon, Jeong, Yongjun, Lyu, Juhyun, Kim, Junghee, Lee, Sangmin, Han, Sangjun, Choe, Hyeokjun, Park, Soyeon, Lim, Woohyung, Lim, Sungbin, Lee, Sanghack

arXiv.org Artificial Intelligence, Nov-18-2023

Scaling laws have brought Pre-trained Language Models (PLMs) into the field of causal reasoning. Causal reasoning with PLMs relies solely on text-based descriptions, in contrast to causal discovery, which aims to determine the causal relationships between variables from data. Recent research has proposed a method that mimics causal discovery by aggregating the outcomes of repeated causal reasoning, achieved through specifically designed prompts. It highlights the usefulness of PLMs in discovering cause and effect, which is often limited by a lack of data, especially when dealing with multiple variables. However, PLMs do not analyze data and are highly dependent on prompt design, which is a crucial limitation for using PLMs directly in causal discovery. Accordingly, PLM-based causal reasoning carries the risk of overconfidence and false predictions in determining causal relationships. In this paper, we empirically demonstrate the aforementioned limitations of PLM-based causal reasoning through experiments on physics-inspired synthetic data. Then, we propose a new framework that integrates prior knowledge obtained from PLMs with a causal discovery algorithm. This is accomplished by initializing an adjacency matrix for causal discovery and incorporating regularization using prior knowledge. Our proposed framework not only demonstrates improved performance through the integration of PLMs and causal discovery but also suggests how to leverage PLM-extracted prior knowledge with existing causal discovery algorithms.
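The two integration points named in the abstract (initializing the adjacency matrix and adding a prior-based regularizer) could look roughly like the sketch below. The abstract does not specify the exact form of either, so everything here is an assumption: the prior is taken to be a matrix of edge beliefs in [0, 1], the penalty is a weighted L1 term on edges the prior considers absent, and the names `init_adjacency`, `prior_penalty`, `scale`, and `lam` are hypothetical.

```python
def init_adjacency(prior, scale=1.0):
    """Start the weighted adjacency matrix from the prior edge beliefs."""
    return [[scale * p for p in row] for row in prior]


def prior_penalty(weights, prior, lam=0.1):
    """Weighted L1 penalty: edges the prior considers unlikely cost more."""
    return lam * sum(
        abs(w) * (1.0 - p)
        for w_row, p_row in zip(weights, prior)
        for w, p in zip(w_row, p_row)
    )


# Hypothetical prior for three variables: X0 -> X1 -> X2 (1.0 = edge believed present).
prior = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0],
]

weights = init_adjacency(prior)
penalty_at_init = prior_penalty(weights, prior)

# Adding an edge X2 -> X0 that the prior does not support increases the penalty.
weights[2][0] = 0.5
penalty_after = prior_penalty(weights, prior)
```

In a full pipeline this penalty would be added to the data-fit objective of whichever causal discovery algorithm is used, so the data can still override the prior when the evidence is strong.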

large language model, machine learning, natural language, (16 more...)

arXiv: 2311.11212

Country: North America (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)

Geometrically Aligned Transfer Encoder for Inductive Transfer in Regression Tasks

Ko, Sung Moon, Lee, Sumin, Jeong, Dae-Woong, Lim, Woohyung, Han, Sehui

arXiv.org Artificial Intelligence, Oct-10-2023

Transfer learning is a crucial technique for handling a small amount of data that is potentially related to other abundant data. However, most of the existing methods are focused on classification tasks using images and language datasets. Therefore, in order to expand the transfer learning scheme to regression tasks, we propose a novel transfer technique based on differential geometry, namely the Geometrically Aligned Transfer Encoder (GATE). In this method, we interpret the latent vectors from the model to exist on a Riemannian curved manifold. We find a proper diffeomorphism between pairs of tasks to ensure that every arbitrary point maps to a locally flat coordinate in the overlapping region, allowing the transfer of knowledge from the source to the target data. This also serves as an effective regularizer for the model to behave in extrapolation regions. In this article, we demonstrate that GATE outperforms conventional methods and exhibits stable behavior in both the latent space and extrapolation regions for various molecular graph datasets.

artificial intelligence, machine learning, manifold, (20 more...)

arXiv: 2310.06369

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Gradient Surgery for One-shot Unlearning on Generative Model

Bae, Seohui, Kim, Seoyoon, Jung, Hyemin, Lim, Woohyung

arXiv.org Artificial Intelligence, Jul-18-2023

Recent regulations on the right to be forgotten have generated considerable interest in unlearning for pre-trained machine learning models. To approximate the straightforward yet expensive approach of retraining from scratch, recent machine unlearning methods unlearn a sample by updating the weights to remove its influence on the weight parameters. In this paper, we introduce a simple yet effective approach to remove the influence of data on a deep generative model. Inspired by works in multi-task learning, we propose to manipulate gradients to regularize the interplay of influence among samples by projecting gradients onto the normal plane of the gradients to be retained. Our work is agnostic to statistics of the removal samples, outperforming existing baselines while providing theoretical analysis for the first time in unlearning a generative model.
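The projection step mentioned in this abstract can be sketched with plain vectors. This is the standard orthogonal projection that removes the component of one gradient along another; how the paper computes the two gradients and applies the update is not stated in the abstract, so the function name `project_onto_normal_plane` and the toy gradients are assumptions for illustration.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_onto_normal_plane(g_forget, g_retain):
    """Remove from g_forget its component along g_retain."""
    denom = dot(g_retain, g_retain)
    if denom == 0.0:
        # A zero retain gradient imposes no constraint.
        return list(g_forget)
    scale = dot(g_forget, g_retain) / denom
    return [gf - scale * gr for gf, gr in zip(g_forget, g_retain)]


# Toy gradients (hypothetical values, flattened into vectors).
g_forget = [1.0, 2.0, 0.0]  # gradient computed on the samples to remove
g_retain = [0.0, 1.0, 0.0]  # gradient computed on the samples to keep

g_proj = project_onto_normal_plane(g_forget, g_retain)
```

The projected gradient is orthogonal to the retain gradient, so to first order a step along it changes the unlearning objective without moving the loss on the retained samples.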

artificial intelligence, generative model, machine learning, (12 more...)

arXiv: 2307.0455

Country: North America > United States > Hawaii (0.14)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.35)
