Collaborating Authors

 Li, Zehao


A New Stochastic Approximation Method for Gradient-based Simulated Parameter Estimation

arXiv.org Machine Learning

This paper tackles the challenge of parameter calibration in stochastic models, particularly in scenarios where the likelihood function is unavailable in analytical form. We introduce a gradient-based simulated parameter estimation (GSPE) framework, which employs a multi-time scale stochastic approximation algorithm. This approach effectively addresses the ratio bias that arises in both maximum likelihood estimation and posterior density estimation problems. The proposed algorithm enhances estimation accuracy and significantly reduces computational costs, as demonstrated through extensive numerical experiments. Our work extends the GSPE framework to handle complex models such as hidden Markov models and variational inference-based problems, offering a robust solution for parameter estimation in challenging stochastic environments.
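
As a rough illustration of the multi-time scale idea, the sketch below runs a toy maximum likelihood problem in which the score is a ratio of two simulated quantities: a fast iterate tracks the denominator while the parameter moves on a slower time scale. The model, estimators, and step sizes are hypothetical stand-ins, not the paper's GSPE algorithm.

```python
import numpy as np

# Toy setting: the likelihood and its gradient are only available through
# simulation, so the score is a ratio E[A(theta)] / E[B(theta)].
# All estimators below are illustrative assumptions.
rng = np.random.default_rng(0)
y_obs = 1.5

def simulate_AB(theta, n):
    x = rng.normal(theta, 1.0, size=n)        # simulated latent outputs
    kernel = np.exp(-0.5 * (y_obs - x) ** 2)  # unnormalized density weights
    A = kernel * (y_obs - x)                  # pathwise derivative samples (numerator)
    B = kernel                                # likelihood samples (denominator)
    return A.mean(), B.mean()

theta, z = 0.0, 1.0   # z tracks the denominator on the faster time scale
for k in range(1, 5001):
    fast = 1.0 / k ** 0.6   # step size for the denominator tracker
    slow = 1.0 / k          # step size for the parameter
    A_hat, B_hat = simulate_AB(theta, n=10)
    z += fast * (B_hat - z)                # fast scale: running denominator estimate
    theta += slow * A_hat / max(z, 1e-8)   # slow scale: ascent with the tracked ratio
print(theta)  # drifts toward y_obs in this toy example
```

Using the tracked denominator z in place of a fresh per-iteration sample-mean ratio is what sidesteps the plug-in ratio bias in this kind of update.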


Zeroth-order Informed Fine-Tuning for Diffusion Model: A Recursive Likelihood Ratio Optimizer

arXiv.org Machine Learning

The probabilistic diffusion model (DM), which generates content through a recursive chain of inference steps, has emerged as a powerful framework for visual generation. After pre-training on enormous amounts of unlabeled data, the model must be properly aligned to meet the requirements of downstream applications, and efficiently aligning the foundation DM is a crucial task. Contemporary methods are based on either Reinforcement Learning (RL) or truncated Backpropagation (BP). However, RL and truncated BP suffer from low sample efficiency and biased gradient estimation respectively, resulting in limited improvement or, even worse, complete training failure. To overcome these challenges, we propose the Recursive Likelihood Ratio (RLR) optimizer, a zeroth-order informed fine-tuning paradigm for DMs. The zeroth-order gradient estimator enables rearrangement of the computation graph within the recursive diffusion chain, making the RLR gradient estimator unbiased with lower variance than other methods. We provide theoretical guarantees for the performance of the RLR. Extensive experiments on image and video generation tasks validate the superiority of the RLR. Furthermore, we propose a novel prompting technique that naturally complements the RLR and achieves a synergistic effect.
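
The snippet below is a minimal sketch of the generic zeroth-order (Gaussian-smoothing) idea on a toy recursive chain: the chain is treated as a black box and perturbed, so no backpropagation through the recursion is needed. The dynamics, reward, and hyperparameters are invented for illustration; the RLR optimizer additionally combines per-step likelihood-ratio terms, which is not reproduced here.

```python
import torch

torch.manual_seed(0)

def rollout(w, x0, steps=10):
    # Hypothetical recursive chain: x_{t+1} = 0.9 * x_t + w + small noise.
    x = x0
    for _ in range(steps):
        x = 0.9 * x + w + 0.01 * torch.randn_like(x)
    return x

def reward(x):
    return -(x - 1.0).pow(2).mean()   # toy alignment objective: final state near 1

w = torch.tensor(0.0)                 # scalar parameter being fine-tuned
x0 = torch.zeros(128)
sigma, n_dirs, lr = 0.05, 32, 0.01

for _ in range(200):
    grad = torch.tensor(0.0)
    for _ in range(n_dirs):
        u = torch.randn(())                            # random perturbation direction
        r_plus = reward(rollout(w + sigma * u, x0))
        r_minus = reward(rollout(w - sigma * u, x0))
        grad += (r_plus - r_minus) / (2 * sigma) * u   # antithetic zeroth-order estimate
    w = w + lr * grad / n_dirs                         # ascent step on the smoothed reward
print(float(w))  # approaches the reward-maximizing value for this toy chain
```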


Context-DPO: Aligning Language Models for Context-Faithfulness

arXiv.org Artificial Intelligence

Reliable responses from large language models (LLMs) require adherence to user instructions and retrieved information. While alignment techniques help LLMs align with human intentions and values, improving context-faithfulness through alignment remains underexplored. To address this, we propose Context-DPO, the first alignment method specifically designed to enhance LLMs' context-faithfulness. We introduce ConFiQA, a benchmark that simulates Retrieval-Augmented Generation (RAG) scenarios with knowledge conflicts to evaluate context-faithfulness. By contrasting faithful and stubborn responses to questions with provided context from ConFiQA, Context-DPO aligns LLMs through direct preference optimization. Extensive experiments demonstrate that Context-DPO significantly improves context-faithfulness, achieving 35% to 280% improvements on popular open-source models. Further analysis demonstrates that Context-DPO preserves LLMs' generative capabilities while providing interpretable insights into context utilization. Our code and data are released at https://github.com/byronBBL/Context-DPO
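
For reference, the core objective is the standard DPO loss applied to context-faithfulness preference pairs (faithful response as "chosen", stubborn response as "rejected"). The sketch below assumes per-sequence log-probabilities have already been computed for the policy and a frozen reference model; the names and toy usage are illustrative, not the released Context-DPO code.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    # "chosen" = context-faithful response, "rejected" = stubborn response.
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_ratio - rejected_ratio)
    return -F.logsigmoid(logits).mean()

# Toy usage with random log-probabilities for a batch of 4 preference pairs.
torch.manual_seed(0)
pc, pr, rc, rr = (torch.randn(4) for _ in range(4))
print(dpo_loss(pc, pr, rc, rr))
```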


Eliminating Ratio Bias for Gradient-based Simulated Parameter Estimation

arXiv.org Machine Learning

This article addresses the challenge of parameter calibration in stochastic models where the likelihood function is not analytically available. We propose a gradient-based simulated parameter estimation framework, leveraging a multi-time scale algorithm that tackles the issue of ratio bias in both maximum likelihood estimation and posterior density estimation problems. Additionally, we introduce a nested simulation optimization structure and provide theoretical analyses, including strong convergence, asymptotic normality, convergence rates, and budget allocation strategies for the proposed algorithm. The framework is further extended to neural network training, offering a novel perspective on stochastic approximation in machine learning. Numerical experiments show that our algorithm improves estimation accuracy and reduces computational cost.
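
The ratio bias the paper targets can be seen in a few lines: for independent simulated quantities X and Y, the plug-in estimator mean(X)/mean(Y) of E[X]/E[Y] is biased at small simulation budgets. The distributions below are arbitrary choices used only to make the bias visible, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
TRUE_RATIO = 2.0 / 1.0            # E[X] = 2, E[Y] = 1

def plug_in_estimate(n):
    x = rng.exponential(2.0, size=n)   # E[X] = 2
    y = rng.exponential(1.0, size=n)   # E[Y] = 1
    return x.mean() / y.mean()

for n in (5, 20, 100, 1000):
    avg = np.mean([plug_in_estimate(n) for _ in range(20000)])
    print(f"n = {n:4d}   mean plug-in ratio = {avg:.3f}   (true value {TRUE_RATIO:.1f})")
```

The bias shrinks as the per-iteration budget n grows, which is exactly the accuracy-versus-simulation-cost trade-off that the multi-time scale algorithm and its budget allocation analysis are designed to resolve.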


Dual-Agent Deep Reinforcement Learning for Dynamic Pricing and Replenishment

arXiv.org Artificial Intelligence

We study the dynamic pricing and replenishment problems under inconsistent decision frequencies. Different from the traditional demand assumption, the discreteness of demand and the parameter of the Poisson distribution as a function of price introduce complexity into analyzing the problem's properties. We demonstrate the concavity of the single-period profit function with respect to product price and inventory within their respective domains. The demand model is enhanced by integrating a decision tree-based machine learning approach, trained on comprehensive market data. Employing a two-timescale stochastic approximation scheme, we address the discrepancies in decision frequencies between pricing and replenishment, ensuring convergence to a local optimum. We further refine our methodology by incorporating ...
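
A toy version of the two-timescale idea is sketched below: replenishment is updated with a larger (fast) step size and price with a smaller (slow) one, using noisy finite-difference gradients of a simulated single-period profit under Poisson demand with a linear price-dependent rate. All parameters and estimators are illustrative assumptions, not the paper's model or algorithm.

```python
import numpy as np

rng = np.random.default_rng(2)
a, b, cost, salvage = 20.0, 1.5, 2.0, 0.5   # hypothetical demand and cost parameters

def profit(price, qty):
    demand = rng.poisson(max(a - b * price, 0.0))   # Poisson demand, rate linear in price
    sales = min(demand, qty)
    return price * sales - cost * qty + salvage * (qty - sales)

price, qty = 5.0, 10.0
for k in range(1, 20001):
    fast = 1.0 / k ** 0.6     # replenishment moves on the fast time scale
    slow = 1.0 / k            # pricing moves on the slow time scale
    dq = rng.choice([-1.0, 1.0])
    g_q = (profit(price, qty + dq) - profit(price, qty - dq)) / (2 * dq)
    qty = float(np.clip(qty + fast * g_q, 0.0, 50.0))
    dp = rng.choice([-0.5, 0.5])
    g_p = (profit(price + dp, qty) - profit(price - dp, qty)) / (2 * dp)
    price = float(np.clip(price + slow * g_p, 0.1, a / b))
print(price, qty)   # settles near a locally optimal price/order-quantity pair
```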