AITopics | Yue, Yisong

Collaborating Authors

Yue, Yisong

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

InverseBench: Benchmarking Plug-and-Play Diffusion Priors for Inverse Problems in Physical Sciences

Zheng, Hongkai, Chu, Wenda, Zhang, Bingliang, Wu, Zihui, Wang, Austin, Feng, Berthy T., Zou, Caifeng, Sun, Yu, Kovachki, Nikola, Ross, Zachary E., Bouman, Katherine L., Yue, Yisong

arXiv.org Artificial IntelligenceMar-13-2025

Plug-and-play diffusion priors (PnPDP) have emerged as a promising research direction for solving inverse problems. However, current studies primarily focus on natural image restoration, leaving the performance of these algorithms in scientific inverse problems largely unexplored. To address this gap, we introduce \textsc{InverseBench}, a framework that evaluates diffusion models across five distinct scientific inverse problems. These problems present unique structural challenges that differ from existing benchmarks, arising from critical scientific applications such as optical tomography, medical imaging, black hole imaging, seismology, and fluid dynamics. With \textsc{InverseBench}, we benchmark 14 inverse problem algorithms that use plug-and-play diffusion priors against strong, domain-specific baselines, offering valuable new insights into the strengths and weaknesses of existing algorithms. To facilitate further research and development, we open-source the codebase, along with datasets and pre-trained models, at https://devzhk.github.io/InverseBench/.

artificial intelligence, inverse problem, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2503.11043

Country:

North America > United States (0.14)
Oceania > Australia (0.14)
North America > Canada (0.14)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Energy > Oil & Gas > Upstream (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Data Science (0.93)
(3 more...)

Add feedback

Split Gibbs Discrete Diffusion Posterior Sampling

Chu, Wenda, Song, Yang, Yue, Yisong

arXiv.org Artificial IntelligenceMar-2-2025

We study the problem of posterior sampling in discrete-state spaces using discrete diffusion models. While posterior sampling methods for continuous diffusion models have achieved remarkable progress, analogous methods for discrete diffusion models remain challenging. In this work, we introduce a principled plug-and-play discrete diffusion posterior sampling algorithm based on split Gibbs sampling, which we call SG-DPS. Our algorithm enables reward-guided generation and solving inverse problems in discrete-state spaces. We demonstrate that SG-DPS converges to the true posterior distribution on synthetic benchmarks, and enjoys state-of-the-art posterior sampling performance on a range of benchmarks for discrete data, achieving up to 2x improved performance compared to existing baselines.

artificial intelligence, diffusion model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2503.01161

Genre: Research Report (0.65)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)

Add feedback

DataSciBench: An LLM Agent Benchmark for Data Science

Zhang, Dan, Zhoubian, Sining, Cai, Min, Li, Fengzu, Yang, Lekang, Wang, Wei, Dong, Tianjiao, Hu, Ziniu, Tang, Jie, Yue, Yisong

arXiv.org Artificial IntelligenceFeb-19-2025

This paper presents DataSciBench, a comprehensive benchmark for evaluating Large Language Model (LLM) capabilities in data science. Recent related benchmarks have primarily focused on single tasks, easily obtainable ground truth, and straightforward evaluation metrics, which limits the scope of tasks that can be evaluated. In contrast, DataSciBench is constructed based on a more comprehensive and curated collection of natural and challenging prompts for uncertain ground truth and evaluation metrics. We develop a semi-automated pipeline for generating ground truth (GT) and validating evaluation metrics. This pipeline utilizes and implements an LLM-based self-consistency and human verification strategy to produce accurate GT by leveraging collected prompts, predefined task types, and aggregate functions (metrics). Furthermore, we propose an innovative Task - Function - Code (TFC) framework to assess each code execution outcome based on precisely defined metrics and programmatic rules. Our experimental framework involves testing 6 API-based models, 8 open-source general models, and 9 open-source code generation models using the diverse set of prompts we have gathered. This approach aims to provide a more comprehensive and rigorous evaluation of LLMs in data science, revealing their strengths and weaknesses. Experimental results demonstrate that API-based models outperform open-sourced models on all metrics and Deepseek-Coder-33B-Instruct achieves the highest score among open-sourced models. We release all code and data at https://github.com/THUDM/DataSciBench.

arxiv preprint arxiv, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2502.13897

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Morphological-Symmetry-Equivariant Heterogeneous Graph Neural Network for Robotic Dynamics Learning

Xie, Fengze, Wei, Sizhe, Song, Yue, Yue, Yisong, Gan, Lu

arXiv.org Artificial IntelligenceDec-2-2024

These structural priors are embedded into the learning architecture as constraints, ensuring high generalizability, sample and model efficiency. The proposed MS-HGNN is a versatile and general architecture that is applicable to various multi-body dynamic systems and a wide range of dynamics learning problems.

artificial intelligence, machine learning, symmetry, (17 more...)

arXiv.org Artificial Intelligence

2412.01297

Country: North America > United States > New York (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

PIANIST: Learning Partially Observable World Models with LLMs for Multi-Agent Decision Making

Light, Jonathan, Xing, Sixue, Liu, Yuanzhe, Chen, Weiqin, Cai, Min, Chen, Xiusi, Wang, Guanzhi, Cheng, Wei, Yue, Yisong, Hu, Ziniu

arXiv.org Artificial IntelligenceNov-24-2024

Effective extraction of the world knowledge in LLMs for complex decision-making tasks remains a challenge. We propose a framework PIANIST for decomposing the world model into seven intuitive components conducive to zero-shot LLM generation. Given only the natural language description of the game and how input observations are formatted, our method can generate a working world model for fast and efficient MCTS simulation. We show that our method works well on two different games that challenge the planning and decision making skills of the agent for both language and non-language based action taking, without any training on domain-specific training data or explicitly defined world model.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2411.15998

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Add feedback

Practical Bayesian Algorithm Execution via Posterior Sampling

Cheng, Chu Xin, Astudillo, Raul, Desautels, Thomas, Yue, Yisong

arXiv.org Machine LearningOct-27-2024

We consider Bayesian algorithm execution (BAX), a framework for efficiently selecting evaluation points of an expensive function to infer a property of interest encoded as the output of a base algorithm. Since the base algorithm typically requires more evaluations than are feasible, it cannot be directly applied. Instead, BAX methods sequentially select evaluation points using a probabilistic numerical approach. Current BAX methods use expected information gain to guide this selection. However, this approach is computationally intensive. Observing that, in many tasks, the property of interest corresponds to a target set of points defined by the function, we introduce PS-BAX, a simple, effective, and scalable BAX method based on posterior sampling. PS-BAX is applicable to a wide range of problems, including many optimization variants and level set estimation. Experiments across diverse tasks demonstrate that PS-BAX performs competitively with existing baselines while being significantly faster, simpler to implement, and easily parallelizable, setting a strong baseline for future research. Additionally, we establish conditions under which PS-BAX is asymptotically convergent, offering new insights into posterior sampling as an algorithm design paradigm.

algorithm, artificial intelligence, machine learning, (14 more...)

arXiv.org Machine Learning

2410.20596

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Government > Regional Government > North America Government > United States Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Unsupervised Representation Learning from Sparse Transformation Analysis

Song, Yue, Keller, Thomas Anderson, Yue, Yisong, Perona, Pietro, Welling, Max

arXiv.org Artificial IntelligenceOct-7-2024

There is a vast literature on representation learning based on principles such as coding efficiency, statistical independence, causality, controllability, or symmetry. In this paper we propose to learn representations from sequence data by factorizing the transformations of the latent variables into sparse components. Input data are first encoded as distributions of latent activations and subsequently transformed using a probability flow model, before being decoded to predict a future input state. The flow model is decomposed into a number of rotational (divergence-free) vector fields and a number of potential flow (curl-free) fields. Our sparsity prior encourages only a small number of these fields to be active at any instant and infers the speed with which the probability flows along these fields. Training this model is completely unsupervised using a standard variational objective and results in a new form of disentangled representations where the input is not only represented by a combination of independent factors, but also by a combination of independent transformation primitives given by the learned flow fields. When viewing the transformations as symmetries one may interpret this as learning approximately equivariant representations. Empirically we demonstrate that this model achieves state of the art in terms of both data likelihood and unsupervised approximate equivariance errors on datasets composed of sequence transformations.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2410.05564

Country:

Europe > Netherlands (0.14)
Europe > Italy (0.14)
North America > United States (0.14)
Europe > Belgium (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Natural Language (0.68)

Add feedback

End-to-End Conformal Calibration for Optimization Under Uncertainty

Yeh, Christopher, Christianson, Nicolas, Wu, Alan, Wierman, Adam, Yue, Yisong

arXiv.org Artificial IntelligenceSep-30-2024

Machine learning can significantly improve performance for decision-making under uncertainty in a wide range of domains. However, ensuring robustness guarantees requires well-calibrated uncertainty estimates, which can be difficult to achieve in high-capacity prediction models such as deep neural networks. Moreover, in high-dimensional settings, there may be many valid uncertainty estimates, each with their own performance profile - i.e., not all uncertainty is equally valuable for downstream decision-making. To address this problem, this paper develops an end-to-end framework to learn the uncertainty estimates for conditional robust optimization, with robustness and calibration guarantees provided by conformal prediction. In addition, we propose to represent arbitrary convex uncertainty sets with partially input-convex neural networks, which are learned as part of our framework. Our approach consistently improves upon two-stage estimate-then-optimize baselines on concrete applications in energy storage arbitrage and portfolio optimization.

artificial intelligence, machine learning, optimization problem, (16 more...)

arXiv.org Artificial Intelligence

2409.20534

Country: North America > United States > California > Los Angeles County (0.14)

Genre: Research Report (0.65)

Industry:

Energy > Power Industry (0.93)
Energy > Energy Storage (0.67)
Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Ensemble Kalman Diffusion Guidance: A Derivative-free Method for Inverse Problems

Zheng, Hongkai, Chu, Wenda, Wang, Austin, Kovachki, Nikola, Baptista, Ricardo, Yue, Yisong

arXiv.org Machine LearningSep-30-2024

When solving inverse problems, it is increasingly popular to use pre-trained diffusion models as plug-and-play priors. This framework can accommodate different forward models without re-training while preserving the generative capability of diffusion models. Despite their success in many imaging inverse problems, most existing methods rely on privileged information such as derivative, pseudo-inverse, or full knowledge about the forward model. This reliance poses a substantial limitation that restricts their use in a wide range of problems where such information is unavailable, such as in many scientific applications. To address this issue, we propose Ensemble Kalman Diffusion Guidance (EnKG) for diffusion models, a derivative-free approach that can solve inverse problems by only accessing forward model evaluations and a pre-trained diffusion model prior. We study the empirical effectiveness of our method across various inverse problems, including scientific settings such as inferring fluid flows and astronomical objects, which are highly non-linear inverse problems that often only permit black-box access to the forward model.

artificial intelligence, inverse problem, machine learning, (18 more...)

arXiv.org Machine Learning

2409.20175

Country: North America > United States > Virginia (0.14)

Genre: Research Report (0.64)

Industry:

Energy (1.00)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Add feedback

Beyond Numeric Awards: In-Context Dueling Bandits with LLM Agents

Xia, Fanzeng, Liu, Hao, Yue, Yisong, Li, Tongxin

arXiv.org Artificial IntelligenceJul-1-2024

In-context decision-making is an important capability of artificial general intelligence, which Large Language Models (LLMs) have effectively demonstrated in various scenarios. However, LLMs often face challenges when dealing with numerical contexts, and limited attention has been paid to evaluating their performance through preference feedback generated by the environment. This paper investigates the performance of LLMs as decision-makers in the context of Dueling Bandits (DB). We first evaluate the performance of LLMs by comparing GPT-3.5-Turbo, GPT-4, and GPT-4-Turbo against established DB algorithms. Our results reveal that LLMs, particularly GPT-4 Turbo, quickly identify the Condorcet winner, thus outperforming existing state-of-the-art algorithms in terms of weak regret. Nevertheless, LLMs struggle to converge even when explicitly prompted to do so, and are sensitive to prompt variations. To overcome these issues, we introduce an LLM-augmented algorithm, IF-Enhanced LLM, which takes advantage of both in-context decision-making capabilities of LLMs and theoretical guarantees inherited from classic DB algorithms. The design of such an algorithm sheds light on how to enhance trustworthiness for LLMs used in decision-making tasks where performance robustness matters. We show that IF-Enhanced LLM has theoretical guarantees on both weak and strong regret. Our experimental results validate that IF-Enhanced LLM is robust even with noisy and adversarial prompts.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2407.01887

Country:

North America > United States (0.14)
Asia > China (0.14)

Genre: Research Report > New Finding (0.88)

Industry: Government (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback