A Unified Approach to Analysis and Design of Denoising Markov Models
Ren, Yinuo, Rotskoff, Grant M., Ying, Lexing
Probabilistic generative models based on measure transport, such as diffusion and flow-based models, are often formulated in the language of Markovian stochastic dynamics, where the choice of the underlying process impacts both algorithmic design choices and theoretical analysis. In this paper, we aim to establish a rigorous mathematical foundation for denoising Markov models, a broad class of generative models that postulate a forward process transitioning from the target distribution to a simple, easy-to-sample distribution, alongside a backward process specifically constructed to enable efficient sampling in the reverse direction. Leveraging deep connections with nonequilibrium statistical mechanics and the generalized Doob $h$-transform, we propose a minimal set of assumptions that ensure: (1) explicit construction of the backward generator, (2) a unified variational objective directly minimizing the measure transport discrepancy, and (3) adaptations of the classical score-matching approach across diverse dynamics. Our framework unifies existing formulations of continuous and discrete diffusion models, identifies the most general form of denoising Markov models under certain regularity assumptions on forward generators, and provides a systematic recipe for designing denoising Markov models driven by arbitrary L\'evy-type processes. We illustrate the versatility and practical effectiveness of our approach through novel denoising Markov models employing geometric Brownian motion and jump processes as forward dynamics, highlighting the framework's flexibility and capacity for modeling complex distributions.
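As a minimal illustration of the denoising-Markov-model pattern (a sketch, not code from the paper): with an Ornstein-Uhlenbeck forward process and a one-dimensional Gaussian target, the forward marginals and hence the score are available in closed form, so the backward process can be integrated directly. All parameter values below are illustrative.

```python
import numpy as np

# Forward process dX_t = -X_t dt + sqrt(2) dW_t pushes any target toward N(0, 1);
# the backward process dY_s = [Y_s + 2 * score(Y_s, T - s)] ds + sqrt(2) dW_s
# transports N(0, 1) back to the target. Here the target N(mu0, sig0^2) keeps
# the score analytic, standing in for a learned score network.
T, n_steps, n_samples = 2.0, 200, 10_000
dt = T / n_steps
mu0, sig0 = 3.0, 0.5

def score(x, t):
    # Score of the forward marginal N(mu0 e^{-t}, sig0^2 e^{-2t} + 1 - e^{-2t}).
    m = mu0 * np.exp(-t)
    v = sig0**2 * np.exp(-2 * t) + 1.0 - np.exp(-2 * t)
    return -(x - m) / v

rng = np.random.default_rng(0)
y = rng.standard_normal(n_samples)        # start from the easy-to-sample prior
for k in range(n_steps):
    t = T - k * dt
    y = y + (y + 2.0 * score(y, t)) * dt + np.sqrt(2 * dt) * rng.standard_normal(n_samples)

print(y.mean(), y.std())                  # should approach mu0 and sig0
```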
Fast Solvers for Discrete Diffusion Models: Theory and Applications of High-Order Algorithms
Ren, Yinuo, Chen, Haoxuan, Zhu, Yuchen, Guo, Wei, Chen, Yongxin, Rotskoff, Grant M., Tao, Molei, Ying, Lexing
Discrete diffusion models have emerged as a powerful generative modeling framework for discrete data, with successful applications spanning from text generation to image synthesis. However, their deployment faces challenges due to the high dimensionality of the state space, necessitating the development of efficient inference algorithms. Current inference approaches mainly fall into two categories: exact simulation and approximate methods such as $\tau$-leaping. While exact methods suffer from unpredictable inference time and redundant function evaluations, $\tau$-leaping is limited by its first-order accuracy. In this work, we advance the latter category by developing the first high-order numerical inference schemes tailored to discrete diffusion models, enabling larger step sizes while reducing error. We rigorously analyze the proposed schemes and establish the second-order accuracy of the $\theta$-trapezoidal method in KL divergence. Empirical evaluations on GPT-2 level text and ImageNet-level image generation tasks demonstrate that our method achieves superior sample quality compared to existing approaches under equivalent computational constraints.
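For reference, here is a sketch of the first-order $\tau$-leaping baseline that the high-order schemes refine, specialized to a single continuous-time Markov chain; the 3-state rate matrix and the crude resolution of simultaneous firings are illustrative choices, not details from the paper.

```python
import numpy as np

# Rate matrix of an arbitrary 3-state chain: Q[i, j] is the jump rate i -> j
# for i != j, and rows sum to zero.
rng = np.random.default_rng(0)
Q = np.array([[-1.0, 0.7, 0.3],
              [0.5, -1.2, 0.7],
              [0.2, 0.8, -1.0]])

def tau_leaping(x0, t_end, tau):
    """Freeze the rates over each window of length tau and fire Poisson jumps."""
    x = x0
    for _ in range(int(round(t_end / tau))):
        counts = rng.poisson(np.maximum(Q[x], 0.0) * tau)  # firings per channel
        if counts.sum() > 0:               # crude first-order clash resolution
            x = rng.choice(len(counts), p=counts / counts.sum())
    return x

samples = [tau_leaping(0, 5.0, 0.05) for _ in range(5_000)]
print(np.bincount(samples, minlength=3) / len(samples))   # empirical law at t_end
```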
HyperDPO: Conditioned One-Shot Multi-Objective Fine-Tuning Framework
Ren, Yinuo, Xiao, Tesi, Shavlovsky, Michael, Ying, Lexing, Rahmanian, Holakou
In LLM alignment and many other ML applications, one often faces the Multi-Objective Fine-Tuning (MOFT) problem, i.e., fine-tuning an existing model with datasets labeled w.r.t. different objectives simultaneously. To address this challenge, we propose the HyperDPO framework, a conditioned one-shot fine-tuning approach that extends the Direct Preference Optimization (DPO) technique, originally developed for efficient LLM alignment with preference data, to the MOFT setting. By substituting the Bradley-Terry-Luce model in DPO with the Plackett-Luce model, our framework can handle a wide range of MOFT tasks involving listwise ranking datasets. Compared with previous approaches, HyperDPO enjoys an efficient one-shot training process for profiling the Pareto front of auxiliary objectives and offers post-training control over trade-offs. We also propose a novel Hyper Prompt Tuning design that conveys continuous importance weights across objectives to transformer-based models without altering their architecture, and we investigate the potential of temperature-conditioned networks for enhancing the flexibility of post-training control. We demonstrate the effectiveness and efficiency of HyperDPO through applications to various tasks, including Learning-to-Rank (LTR) and LLM alignment, highlighting its viability for large-scale ML deployments.
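The Plackett-Luce substitution can be sketched as follows; the listwise negative log-likelihood below assumes hypothetical score arrays with items sorted by the observed ranking (best first), and is only a schematic stand-in for the full HyperDPO objective.

```python
import numpy as np

# scores[b, k]: model score of the k-th item of list b, items sorted by the
# observed ranking (best first). The Plackett-Luce log-likelihood peels items
# off the top: log P = sum_k [ s_k - log sum_{j >= k} exp(s_j) ].
def plackett_luce_nll(scores):
    ll = np.zeros(scores.shape[:-1])
    for k in range(scores.shape[-1]):
        tail = scores[..., k:]
        m = tail.max(axis=-1)              # stabilize the log-sum-exp
        ll += scores[..., k] - m - np.log(np.exp(tail - m[..., None]).sum(axis=-1))
    return -ll.mean()

rng = np.random.default_rng(0)
print(plackett_luce_nll(rng.standard_normal((4, 5))))   # batch of 4 lists of 5
```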
How Discrete and Continuous Diffusion Meet: Comprehensive Analysis of Discrete Diffusion Models via a Stochastic Integral Framework
Ren, Yinuo, Chen, Haoxuan, Rotskoff, Grant M., Ying, Lexing
Discrete diffusion models have gained increasing attention for their ability to model complex distributions with tractable sampling and inference. However, the error analysis for discrete diffusion models remains less well-understood. In this work, we propose a comprehensive framework for the error analysis of discrete diffusion models based on L\'evy-type stochastic integrals. By generalizing the Poisson random measure to one with a time-inhomogeneous and state-dependent intensity, we rigorously establish a stochastic integral formulation of discrete diffusion models and provide the corresponding change-of-measure theorems that are intriguingly analogous to It\^o integrals and Girsanov's theorem for their continuous counterparts. Our framework unifies and strengthens the current theoretical results on discrete diffusion models and obtains the first error bound for the $\tau$-leaping scheme in KL divergence. With error sources clearly identified, our analysis gives new insight into the mathematical properties of discrete diffusion models and offers guidance for the design of efficient and accurate algorithms for real-world discrete diffusion model applications.
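A concrete instance of the objects in this framework, under toy assumptions: a jump process with state-dependent intensity can be simulated exactly by thinning a dominating Poisson random measure. The 3-state rate matrix below is arbitrary.

```python
import numpy as np

# Exact simulation by thinning: dominate the state-dependent jump intensity by
# a constant lam_max, propose candidate jump times from the dominating Poisson
# process, and accept each with probability (total rate at x) / lam_max.
rng = np.random.default_rng(1)
Q = np.array([[-1.0, 0.6, 0.4],
              [0.3, -0.8, 0.5],
              [0.5, 0.5, -1.0]])
lam_max = np.abs(np.diag(Q)).max()

def simulate(x0, t_end):
    x, t = x0, 0.0
    while True:
        t += rng.exponential(1.0 / lam_max)    # next candidate jump time
        if t > t_end:
            return x
        rates = np.maximum(Q[x], 0.0)          # off-diagonal rates out of x
        if rng.random() < rates.sum() / lam_max:
            x = rng.choice(len(rates), p=rates / rates.sum())

hist = np.bincount([simulate(0, 10.0) for _ in range(5_000)], minlength=3)
print(hist / hist.sum())                       # close to the stationary law
```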
Tangent differential privacy
Ying, Lexing
Differential privacy is a framework for protecting the identity of individual data points in the decision-making process. In this note, we propose a new form of differential privacy called tangent differential privacy. Compared with the usual differential privacy that is defined uniformly across data distributions, tangent differential privacy is tailored towards a specific data distribution of interest. It also allows for general distribution distances such as total variation distance and Wasserstein distance. In the case of risk minimization, we show that entropic regularization guarantees tangent differential privacy under rather general conditions on the risk function.
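A small numerical sketch of the risk-minimization result, with made-up numbers: over a finite decision set, the entropy-regularized minimizer of the expected risk is a Gibbs distribution, and its sensitivity to perturbations of the risk is controlled by the regularization strength.

```python
import numpy as np

# Entropy-regularized risk minimization over a finite decision set:
# argmin_p <r, p> - eps * H(p) is the Gibbs distribution softmax(-r / eps).
def gibbs(r, eps):
    z = -r / eps
    z -= z.max()                           # stabilize the exponentials
    p = np.exp(z)
    return p / p.sum()

r = np.array([1.0, 1.2, 0.9, 2.0])         # risks of four candidate decisions
p = gibbs(r, eps=0.5)
p_shift = gibbs(r + np.array([0.01, -0.01, 0.0, 0.0]), eps=0.5)
print(np.abs(p - p_shift).sum())           # l1 change of order (perturbation / eps)
```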
Accelerating Diffusion Models with Parallel Sampling: Inference at Sub-Linear Time Complexity
Chen, Haoxuan, Ren, Yinuo, Ying, Lexing, Rotskoff, Grant M.
Diffusion models have become a leading method for generative modeling of both image and scientific data. As these models are costly to train and evaluate, reducing the inference cost for diffusion models remains a major goal. Inspired by the recent empirical success in accelerating diffusion models via the parallel sampling technique [1], we propose to divide the sampling process into $O(1)$ blocks with parallelizable Picard iterations within each block. Rigorous theoretical analysis reveals that our algorithm achieves $\widetilde{O}(\operatorname{poly}\log d)$ overall time complexity, marking the first implementation with provable sub-linear complexity w.r.t. the data dimension $d$. Our analysis is based on a generalized version of Girsanov's theorem and is compatible with both the SDE and probability flow ODE implementations. Our results shed light on the potential of fast and efficient sampling of high-dimensional data on fast-evolving modern large-memory GPU clusters.
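The block-wise Picard idea can be sketched in a few lines, with a placeholder velocity field standing in for the learned probability-flow drift; the point is that each sweep evaluates the drift at all grid points simultaneously, which is the parallelizable step.

```python
import numpy as np

# Picard iteration x_{k+1}(t) = x(0) + int_0^t f(x_k(s), s) ds on one block.
# The drift evaluations across the grid (the expensive network calls in a
# diffusion sampler) are independent within each sweep, hence parallelizable.
def f(x, t):
    return -x                               # placeholder for the learned drift

t = np.linspace(0.0, 1.0, 65)               # time grid on a single block
dt = np.diff(t)
x = np.full_like(t, 2.0)                     # initialize the whole path at x(0)
for sweep in range(20):
    fx = f(x, t)                             # parallel evaluation across the grid
    inc = 0.5 * (fx[:-1] + fx[1:]) * dt      # trapezoidal quadrature increments
    x = np.concatenate(([x[0]], x[0] + np.cumsum(inc)))
print(x[-1], 2.0 * np.exp(-1.0))             # matches the exact flow of x' = -x
```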
A note on continuous-time online learning
Ying, Lexing
In online learning, the data is provided in a sequential order, and the goal of the learner is to make online decisions that minimize the overall regret. This note is concerned with continuous-time models and algorithms for several online learning problems: online linear optimization, adversarial bandits, and adversarial linear bandits. For each problem, we extend the discrete-time algorithm to the continuous-time setting and provide a concise proof of the optimal regret bound.
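As a sketch of the continuous-time viewpoint (with synthetic losses and a grid-based simulation, not the note's exact construction): continuous-time exponential weights for online linear optimization plays $p_t \propto \exp(-\eta L_t)$ against the integrated loss $L_t$.

```python
import numpy as np

# Continuous-time exponential weights, simulated on a fine grid: the learner
# plays p_t proportional to exp(-eta * L_t), where L_t integrates the
# adversary's per-arm loss rates up to time t.
rng = np.random.default_rng(0)
n_arms, T, n_grid, eta = 5, 1.0, 1000, 10.0
dt = T / n_grid
L = np.zeros(n_arms)                        # integrated per-arm losses
learner = 0.0                               # learner's integrated loss
for _ in range(n_grid):
    loss = rng.random(n_arms)               # instantaneous adversarial loss rates
    p = np.exp(-eta * (L - L.min())); p /= p.sum()
    learner += (p @ loss) * dt
    L += loss * dt
print(learner - L.min())                    # regret against the best fixed arm
```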
Orthogonal Bootstrap: Efficient Simulation of Input Uncertainty
Liu, Kaizhao, Blanchet, Jose, Ying, Lexing, Lu, Yiping
Bootstrap is a popular methodology for simulating input uncertainty. However, it can be computationally expensive when the number of samples is large. We propose a new approach called \textbf{Orthogonal Bootstrap} that reduces the number of required Monte Carlo replications. We decompose the target being simulated into two parts: the \textit{non-orthogonal part}, which has a closed-form result known as the Infinitesimal Jackknife, and the \textit{orthogonal part}, which is easier to simulate. We show theoretically and numerically that Orthogonal Bootstrap significantly reduces the computational cost of Bootstrap while improving empirical accuracy and maintaining the same width of the constructed interval.
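A toy sketch of the decomposition for the smooth functional T(P) = exp(E_P[X]) (an illustrative choice, not from the paper): each bootstrap replicate splits into a linear Infinitesimal Jackknife part with a known closed-form distribution and a smaller orthogonal remainder, which is the only piece left to simulate.

```python
import numpy as np

# T(P) = exp(mean) is a stand-in smooth functional; infl holds its influence
# function at the empirical measure. Orthogonality is what justifies adding
# the closed-form linear variance to the simulated remainder variance.
rng = np.random.default_rng(0)
x = rng.normal(1.0, 1.0, size=500)
n, mu = len(x), x.mean()
T_hat = np.exp(mu)
infl = np.exp(mu) * (x - mu)               # influence function values

B = 200                                    # few replications for the remainder
remainders = np.empty(B)
for b in range(B):
    idx = rng.integers(0, n, size=n)       # multinomial bootstrap resample
    lin = infl[idx].mean()                 # Infinitesimal Jackknife (linear) part
    remainders[b] = np.exp(x[idx].mean()) - T_hat - lin
var_est = infl.var() / n + remainders.var()   # linear part in closed form
print(T_hat, np.sqrt(var_est))             # point estimate and std. error
```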
A Sinkhorn-type Algorithm for Constrained Optimal Transport
Tang, Xun, Rahmanian, Holakou, Shavlovsky, Michael, Thekumparampil, Kiran Koshy, Xiao, Tesi, Ying, Lexing
Entropic optimal transport (OT) and the Sinkhorn algorithm have made it practical for machine learning practitioners to perform the fundamental task of calculating transport distances between statistical distributions. In this work, we focus on a general class of OT problems under a combination of equality and inequality constraints. We derive the corresponding entropy regularization formulation and introduce a Sinkhorn-type algorithm for such constrained OT problems, supported by theoretical guarantees. We first bound the approximation error incurred by solving the problem through entropic regularization, which decays exponentially as the regularization parameter increases. Furthermore, we prove a sublinear first-order convergence rate of the proposed Sinkhorn-type algorithm in the dual space by characterizing the optimization procedure with a Lyapunov function. To achieve fast, higher-order convergence under weak entropy regularization, we augment the Sinkhorn-type algorithm with dynamic regularization scheduling and second-order acceleration. Overall, this work systematically extends recent theoretical and numerical advances in entropic optimal transport to the constrained setting, allowing practitioners to derive approximate transport plans in complex scenarios.
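For orientation, the classical unconstrained Sinkhorn iteration that the proposed algorithm generalizes looks as follows (synthetic cost matrix; the constrained variant adds multipliers for the extra equality and inequality constraints):

```python
import numpy as np

# Classical Sinkhorn for entropic OT between two histograms mu and nu with a
# synthetic cost matrix C; the alternating scaling steps are the dual updates.
rng = np.random.default_rng(0)
n = 50
mu = np.full(n, 1.0 / n)
nu = np.full(n, 1.0 / n)
C = rng.random((n, n))
eta = 0.05                                 # entropic regularization strength
K = np.exp(-C / eta)                       # Gibbs kernel

u = np.ones(n)
for _ in range(500):
    v = nu / (K.T @ u)                     # match column marginals
    u = mu / (K @ v)                       # match row marginals
P = u[:, None] * K * v[None, :]            # approximate transport plan
print((P * C).sum(), np.abs(P.sum(0) - nu).max(), np.abs(P.sum(1) - mu).max())
```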
Multidimensional unstructured sparse recovery via eigenmatrix
Ying, Lexing
This note considers multidimensional unstructured sparse recovery problems. Examples include Fourier inversion and sparse deconvolution. The eigenmatrix is a data-driven construction with desired approximate eigenvalues and eigenvectors, previously proposed for one-dimensional problems. This note extends the eigenmatrix approach to multidimensional problems. Numerical results are provided to demonstrate the performance of the proposed method.
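A sketch of the one-dimensional eigenmatrix construction that this note extends (hypothetical Fourier-type kernel and grid sizes): build $M$ so that $M b(x) \approx x b(x)$ for the sampling vectors $b(x)$, then read off spike locations from a small Prony-type eigenvalue problem on the moments $c^\top M^p u$.

```python
import numpy as np

# Toy 1-d eigenmatrix sketch with kernel exp(2*pi*i*s*x): M is built by least
# squares so that M b(x) ~ x b(x) on a candidate grid, after which the spike
# locations appear (up to the approximation error of M) as eigenvalues of a
# small matrix-pencil problem on the moments c^T M^p u.
rng = np.random.default_rng(0)
s = rng.uniform(0, 40, size=200)           # unstructured sample locations
x_true = np.array([0.23, 0.51, 0.77])      # spike locations in [0, 1]
w_true = np.array([1.0, 0.8, 1.3])         # spike weights
b = lambda x: np.exp(2j * np.pi * np.outer(s, x))
u = b(x_true) @ w_true                     # observed unstructured data

grid = np.linspace(0, 1, 128)              # candidate locations
B = b(grid)
M = (B * grid) @ np.linalg.pinv(B)         # M B = B diag(grid) in least squares

P = 3                                      # number of spikes, assumed known
c = rng.standard_normal(len(s))
mom, v = [c @ u], u.copy()
for _ in range(2 * P - 1):
    v = M @ v
    mom.append(c @ v)                      # mom[p] ~ sum_k w~_k * x_k^p
H0 = np.array([[mom[i + j] for j in range(P)] for i in range(P)])
H1 = np.array([[mom[i + j + 1] for j in range(P)] for i in range(P)])
print(np.sort(np.linalg.eigvals(np.linalg.solve(H0, H1)).real))  # ~ x_true
```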