Nonlinear Bayesian Update via Ensemble Kernel Regression with Clustering and Subsampling
Lee, Yoonsang
A nonlinear Bayesian update for a prior ensemble is proposed to extend traditional ensemble Kalman filtering to settings characterized by non-Gaussian priors and nonlinear measurement operators. In this framework, the observed component is first denoised via a standard Kalman update, while the unobserved component is estimated using a nonlinear regression approach based on kernel density estimation. The method incorporates a subsampling strategy to ensure stability and, when necessary, employs unsupervised clustering to refine the conditional estimate. Numerical experiments on Lorenz systems and a PDE-constrained inverse problem illustrate that the proposed nonlinear update can reduce estimation errors compared to standard linear updates, especially in highly nonlinear scenarios.
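A minimal sketch of the kernel-regression step described above, assuming a Gaussian kernel and a Nadaraya-Watson-style conditional estimate of the unobserved components given the Kalman-updated observed components; the function names, the rule-of-thumb bandwidth, and the absence of the subsampling/clustering refinements are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def nonlinear_ensemble_update(y_ens, x_ens, y_updated, bandwidth=None):
    """Estimate unobserved components conditioned on Kalman-updated observed components.

    y_ens     : (N, dy) prior ensemble of observed components
    x_ens     : (N, dx) prior ensemble of unobserved components
    y_updated : (N, dy) observed components after the (linear) Kalman update
    Returns   : (N, dx) updated unobserved components via kernel regression.
    """
    N = y_ens.shape[0]
    if bandwidth is None:
        # Simple rule-of-thumb bandwidth; a stand-in for a proper KDE bandwidth choice.
        bandwidth = np.std(y_ens, axis=0).mean() * N ** (-1.0 / 5.0)
    x_updated = np.empty_like(x_ens)
    for i, y_star in enumerate(y_updated):
        # Gaussian kernel weights of each prior member relative to the updated observation.
        d2 = np.sum((y_ens - y_star) ** 2, axis=1)
        w = np.exp(-0.5 * d2 / bandwidth ** 2)
        w /= w.sum() + 1e-12
        # Nadaraya-Watson estimate of E[x | y = y_star] from the prior ensemble.
        x_updated[i] = w @ x_ens
    return x_updated
```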
Entropy stable conservative flux form neural networks
Liu, Lizuo, Li, Tongtong, Gelb, Anne, Lee, Yoonsang
We propose an entropy-stable conservative flux form neural network (CFN) that integrates classical numerical conservation laws into a data-driven framework using the entropy-stable, second-order, and non-oscillatory Kurganov-Tadmor (KT) scheme. The proposed entropy-stable CFN uses slope limiting as a denoising mechanism, ensuring accurate predictions in both noisy and sparse observation environments, as well as in both smooth and discontinuous regions. Numerical experiments demonstrate that the entropy-stable CFN achieves both stability and conservation while maintaining accuracy over extended time domains. Furthermore, it successfully predicts shock propagation speeds in long-term simulations, without oracle knowledge of later-time profiles in the training data.
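A minimal sketch of a conservative flux-form update with a learned numerical flux and minmod slope limiting, in the spirit of the KT-based construction described above; the network interface (a map from stacked left/right interface states to a scalar flux), the periodic boundary, and the explicit Euler step are illustrative assumptions, not the paper's architecture.

```python
import torch

def minmod(a, b):
    # Minmod slope limiter: zero at sign changes, otherwise the smaller-magnitude slope.
    return 0.5 * (torch.sign(a) + torch.sign(b)) * torch.minimum(a.abs(), b.abs())

def conservative_step(u, flux_net, dx, dt):
    """One conservative update u^{n+1}_i = u^n_i - dt/dx * (F_{i+1/2} - F_{i-1/2}).

    u        : (batch, n_cells) cell averages, periodic boundary assumed here
    flux_net : network mapping limited left/right interface states (last dim 2) to a flux
    """
    # Limited slopes from neighboring differences (periodic shifts).
    du_plus = torch.roll(u, -1, dims=-1) - u
    du_minus = u - torch.roll(u, 1, dims=-1)
    slope = minmod(du_plus, du_minus)
    # Reconstructed states at the i+1/2 interfaces.
    u_left = u + 0.5 * slope
    u_right = torch.roll(u - 0.5 * slope, -1, dims=-1)
    # Learned numerical flux at each interface.
    F = flux_net(torch.stack([u_left, u_right], dim=-1)).squeeze(-1)
    return u - dt / dx * (F - torch.roll(F, 1, dims=-1))
```

Because the update is written in flux form, the scheme conserves the total of `u` up to boundary effects regardless of what the network predicts, which is the structural property the CFN exploits.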
RARe: Retrieval Augmented Retrieval with In-Context Examples
Tejaswi, Atula, Lee, Yoonsang, Sanghavi, Sujay, Choi, Eunsol
We investigate whether in-context examples, widely used in decoder-only language models (LLMs), can improve embedding model performance in retrieval tasks. Unlike in LLMs, naively prepending in-context examples (query-document pairs) to the target query at inference time does not work out of the box. We introduce a simple approach to enable retrievers to use in-context examples. Our approach, RARe, finetunes a pre-trained model with in-context examples whose query is semantically similar to the target query. This can be applied to adapt various base architectures (i.e., decoder-only language models, retriever models) and consistently achieves performance gains of up to +2.72% nDCG across various open-domain retrieval datasets (BeIR, RAR-b). In particular, we find RARe exhibits stronger out-of-domain generalization compared to models using queries without in-context examples, similar to what is seen for in-context learning in LLMs. We further provide analysis on the design choices of in-context example augmentation and lay the foundation for future work in this space.

In-context learning (ICL) (Brown et al., 2020) has emerged as a powerful paradigm enabling diverse applications without parameter updates in large language models (LLMs). By conditioning on input-output examples that demonstrate a specific task, LLMs can generate predictions while maintaining fixed parameters. While in-context learning has been extensively studied for LLMs (Xu et al., 2023; Min et al., 2022a; Dong et al., 2024), its potential for retriever models remains unexplored.
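A minimal sketch of how in-context examples could be prepended to a target query before encoding with a retriever; the prompt template, separator strings, and the commented encoder call are illustrative assumptions, not RARe's exact format or checkpoint.

```python
def format_query_with_icl(target_query, icl_examples):
    """Prepend semantically similar (query, document) pairs to the target query.

    icl_examples : list of (example_query, example_document) pairs, assumed to be
                   selected by similarity to the target query.
    """
    parts = []
    for q, d in icl_examples:
        parts.append(f"query: {q}\ndocument: {d}")
    parts.append(f"query: {target_query}")
    return "\n\n".join(parts)

# Usage: encode the augmented query with any embedding model, e.g.
# from sentence_transformers import SentenceTransformer
# model = SentenceTransformer("some-retriever-checkpoint")  # hypothetical checkpoint name
# q_emb = model.encode(format_query_with_icl(query, examples))
```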
Disentangling Questions from Query Generation for Task-Adaptive Retrieval
Lee, Yoonsang, Kim, Minsoo, Hwang, Seung-won
This paper studies the problem of information retrieval, with the goal of adapting to unseen tasks. Existing work generates synthetic queries from domain-specific documents to jointly train the retriever. However, the conventional query generator assumes the query is a question, thus failing to accommodate general search intents. A more lenient approach incorporates task-adaptive elements, such as few-shot learning with a 137B LLM. In this paper, we challenge the trend of equating queries with questions, and instead conceptualize the query generation task as a "compilation" of high-level intent into a task-adaptive query. Specifically, we propose EGG, a query generator that better adapts to the wide search intents expressed in the BeIR benchmark. Our method outperforms baselines and existing models on four tasks with underexplored intents, while utilizing a query generator 47 times smaller than the previous state-of-the-art. Our findings reveal that instructing the LM with explicit search intent is a key aspect of modeling an effective query generator.
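A minimal sketch of instructing an LM with an explicit search intent when generating a synthetic query from a document, which is the key idea the abstract highlights; the prompt wording, field names, and few-shot format are illustrative assumptions, not EGG's actual template.

```python
def build_query_generation_prompt(document, search_intent, few_shot_pairs=()):
    """Condition the generator on the task's search intent rather than assuming questions.

    search_intent  : short description of what a query looks like for this task,
                     e.g. "a claim to be verified" or "an argument that the document counters".
    few_shot_pairs : optional (document, query) examples for the same intent.
    """
    lines = [f"Search intent: {search_intent}", ""]
    for doc, query in few_shot_pairs:
        lines += [f"Document: {doc}", f"Query: {query}", ""]
    lines += [f"Document: {document}", "Query:"]
    return "\n".join(lines)

# The resulting prompt can be passed to an instruction-following LM to produce a
# task-adaptive synthetic query for retriever training.
```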
AmbigDocs: Reasoning across Documents on Different Entities under the Same Name
Lee, Yoonsang, Ye, Xi, Choi, Eunsol
Different entities with the same name can be difficult to distinguish. Handling confusing entity mentions is a crucial skill for language models (LMs). For example, given the question "Where was Michael Jordan educated?" and a set of documents discussing different people named Michael Jordan, can LMs distinguish entity mentions to generate a cohesive answer to the question? To test this ability, we introduce a new benchmark, AmbigDocs. By leveraging Wikipedia's disambiguation pages, we identify sets of documents belonging to different entities that share an ambiguous name. From these documents, we generate questions containing an ambiguous name and their corresponding sets of answers. Our analysis reveals that current state-of-the-art models often yield ambiguous answers or incorrectly merge information belonging to different entities. We establish an ontology categorizing four types of incomplete answers, along with automatic evaluation metrics to identify such categories. We lay the foundation for future work on reasoning across multiple documents with ambiguous entities.
Stochastic approach for elliptic problems in perforated domains
Han, Jihun, Lee, Yoonsang
A wide range of applications in science and engineering involve a PDE model in a domain with perforations, such as perforated metals or air filters. Solving such perforated-domain problems poses computational challenges related to resolving the scales imposed by the geometries of the perforations. We propose a neural network-based, mesh-free approach for perforated domain problems. The method is robust and efficient in capturing various configuration scales, including the averaged macroscopic behavior of the solution, which has a multiscale nature induced by small perforations. The new approach incorporates the derivative-free loss method, which uses a stochastic representation or the Feynman-Kac formulation. In particular, we implement the Neumann boundary condition for the derivative-free loss method to handle the interface between the domain and the perforations. A suite of stringent numerical tests is provided to support the proposed method's efficacy in handling various perforation scales.
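A minimal sketch of a derivative-free, Feynman-Kac-style training loss for the interior of the domain, using the relation u(x) ≈ E[u(x + sqrt(2Δt) ξ)] − Δt f(x) for the Poisson problem Δu = f; the network interface, sampling choices, and the bootstrapped target are illustrative assumptions, and the paper's Neumann interface treatment at the perforations is not reproduced here.

```python
import torch

def derivative_free_loss(u_net, x, f, dt=1e-3, n_samples=32):
    """Derivative-free residual at interior collocation points x, for Delta u = f.

    Uses the stochastic (Feynman-Kac-type) relation
        u(x) ~ E[ u(x + sqrt(2*dt) * xi) ] - dt * f(x),   xi ~ N(0, I),
    so no spatial derivatives of the network are required.

    u_net : network mapping (N, d) points to (N, 1) values
    x     : (B, d) collocation points
    f     : callable mapping (B, d) points to the source term
    """
    xi = torch.randn(n_samples, *x.shape, device=x.device)
    neighbors = x.unsqueeze(0) + (2.0 * dt) ** 0.5 * xi          # (S, B, d)
    with torch.no_grad():
        # Bootstrapped target: evaluate the current network without gradients.
        target = u_net(neighbors.reshape(-1, x.shape[-1])).reshape(n_samples, -1).mean(0)
    target = target - dt * f(x).reshape(-1)
    return ((u_net(x).reshape(-1) - target) ** 2).mean()
```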
Adaptive Tracking of a Single-Rigid-Body Character in Various Environments
Kwon, Taesoo, Gu, Taehong, Ahn, Jaewon, Lee, Yoonsang
Since the introduction of DeepMimic [Peng et al. 2018], subsequent research has focused on expanding the repertoire of simulated motions across various scenarios. In this study, we propose an alternative approach to this goal: a deep reinforcement learning method based on the simulation of a single-rigid-body character. Using the centroidal dynamics model (CDM) to express the full-body character as a single rigid body (SRB) and training a policy to track a reference motion, we obtain a policy capable of adapting to various unobserved environmental changes and controller transitions without requiring any additional learning. Due to the reduced dimensions of the state and action spaces, the learning process is sample-efficient. The final full-body motion is kinematically generated in a physically plausible way, based on the state of the simulated SRB character. The SRB simulation is formulated as a quadratic programming (QP) problem, and the policy outputs an action that allows the SRB character to follow the reference motion. We demonstrate that our policy, efficiently trained within 30 minutes on an ultraportable laptop, can cope with environments that have not been experienced during learning, such as running on uneven terrain or pushing a box, as well as with transitions between learned policies, without any additional learning.
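A generic sketch of a tracking QP for a single rigid body, shown only to illustrate the kind of formulation the abstract refers to; the objective terms, weights, and constraint set of the paper's controller are not reproduced, so this is a representative form rather than the actual method.

```latex
% Generic single-rigid-body tracking QP (illustrative; not the paper's exact formulation).
\begin{aligned}
\min_{\ddot{\mathbf{q}},\,\mathbf{f}_c}\quad
  & \bigl\| \ddot{\mathbf{q}} - \ddot{\mathbf{q}}^{\mathrm{ref}} \bigr\|_{W_q}^2
  + \bigl\| \mathbf{f}_c \bigr\|_{W_f}^2 \\
\text{s.t.}\quad
  & \mathbf{M}\,\ddot{\mathbf{q}} + \mathbf{h}
    = \textstyle\sum_{i} \mathbf{J}_i^{\top} \mathbf{f}_{c,i}
    && \text{(Newton--Euler dynamics of the SRB)} \\
  & \mathbf{f}_{c,i} \in \mathcal{F}_i
    && \text{(friction-cone constraint at each contact)}
\end{aligned}
```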
Crafting In-context Examples according to LMs' Parametric Knowledge
Lee, Yoonsang, Atreya, Pranav, Ye, Xi, Choi, Eunsol
In-context learning has been applied to knowledge-rich tasks such as question answering. In such scenarios, in-context examples are used to trigger a behaviour in the language model: namely, it should surface information stored in its parametric knowledge. We study the construction of in-context example sets, with a focus on the model's parametric knowledge of those examples. We identify 'known' examples, which the model can correctly answer from its parametric knowledge, and 'unknown' ones. Our experiments show that prompting with 'unknown' examples decreases performance, potentially because it encourages hallucination rather than retrieval from parametric knowledge. Constructing an in-context example set that presents both known and unknown information performs the best across diverse settings. We perform analysis on three multi-answer question answering datasets, which allows us to further study answer set ordering strategies based on the LM's knowledge about each answer. Together, our study sheds light on how to best construct in-context example sets for knowledge-rich tasks.
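A minimal sketch of splitting candidate in-context examples into 'known' and 'unknown' by checking whether the model's greedy answer matches a gold answer; the callable interface and the simple containment-based match are illustrative assumptions, not the paper's evaluation protocol.

```python
def split_known_unknown(model_answer_fn, examples):
    """Partition QA examples by whether the model already answers them correctly.

    model_answer_fn : callable(question) -> model's greedy answer string (assumed interface)
    examples        : list of dicts with "question" and "answers" (list of gold strings)
    """
    def is_correct(prediction, golds):
        # Simple containment-based match; a real evaluation may use normalization or exact match.
        pred = prediction.lower()
        return any(g.lower() in pred for g in golds)

    known, unknown = [], []
    for ex in examples:
        prediction = model_answer_fn(ex["question"])
        (known if is_correct(prediction, ex["answers"]) else unknown).append(ex)
    return known, unknown

# In-context example sets can then mix 'known' and 'unknown' items, the combination the
# abstract reports to perform best across settings.
```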
Learning In-between Imagery Dynamics via Physical Latent Spaces
Han, Jihun, Lee, Yoonsang, Gelb, Anne
Understanding image dynamics from a set of complex measurement data is important in many applications, from the diagnosis or monitoring of a disease by analyzing a series of medical (e.g., MRI or ultrasound) images [28], to the interpretation of a sequence of satellite images used to study climate change, natural disasters, or environmental conditions [2]. Here an "image" refers to a high-dimensional data frame that contains complex and condensed information within each pixel, with the pixels also being spatially correlated. To understand the underlying dynamics between sequential images, therefore, it is essential to simultaneously decipher the intertwined relationship among their spatial and temporal features. A common approach for understanding such spatio-temporal dynamics involves the employment of physical models such as differential equations (DEs). By using the observed data to estimate the parameters in the corresponding DEs, it is possible to gain physical insights regarding their evolution [12, 20]. However, directly applying such techniques to image dynamics is of limited use due to the intricate description that would be required by a suitable prior model, the highly nonlinear relationship among pixels, and the computational complexities arising from the high dimensionality of the images.
An analysis of the derivative-free loss method for solving PDEs
Han, Jihun, Lee, Yoonsang
Neural networks are well known for their flexibility in representing complicated functions in a high-dimensional space [3, 9]. In recent years, this strong property has naturally led to their use in representing solutions of partial differential equations (PDEs). Physics-informed neural networks [16] and the Deep Galerkin method [17] use the strong form of the PDE to define the training loss, while the Deep Ritz method [4] uses a weak (or variational) formulation of PDEs to train the network. Also, a class of methods uses a stochastic representation of PDEs to train the neural network [5, 8]. All these methods have shown successful results in a wide range of problems in science and engineering, particularly for high-dimensional problems where standard numerical PDE methods have limitations [5, 17, 2]. The goal of the current study is an analysis of the derivative-free loss method (DFLM; [8]). DFLM employs a stochastic representation of the solution for a certain class of PDEs, averaging stochastic samples as a generalized Feynman-Kac formulation. The loss formulation of DFLM directly guides a neural network to learn the point-to-neighborhood relationships of the solution. DFLM adopts bootstrapping in the context of reinforcement learning, where the neural network's target function is computed based on its current state through the point-to-neighborhood relation.
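As a concrete illustration of the point-to-neighborhood relation mentioned above, a Feynman-Kac-type relation for a Poisson-type equation Δu = f can be written as below; the generalized formulation used by DFLM may differ in form, so this is only a representative sketch. The derivative-free loss then penalizes the mismatch between the two sides, with the right-hand side evaluated on a bootstrapped (target) copy of the network.

```latex
% Representative point-to-neighborhood relation for \Delta u = f (illustrative form).
u(\mathbf{x})
  \;=\; \mathbb{E}\!\left[\, u\!\left(\mathbf{x} + \sqrt{2\Delta t}\,\boldsymbol{\xi}\right) \right]
  \;-\; \Delta t\, f(\mathbf{x}) \;+\; \mathcal{O}(\Delta t^{2}),
\qquad \boldsymbol{\xi} \sim \mathcal{N}(\mathbf{0}, \mathbf{I}).
```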