AITopics | Liu, Yufeng

Collaborating Authors

Liu, Yufeng

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Low-Rank Contextual Reinforcement Learning from Heterogeneous Human Feedback

Lee, Seong Jin, Sun, Will Wei, Liu, Yufeng

arXiv.org Machine LearningDec-26-2024

Reinforcement learning from human feedback (RLHF) has become a cornerstone for aligning large language models with human preferences. However, the heterogeneity of human feedback, driven by diverse individual contexts and preferences, poses significant challenges for reward learning. To address this, we propose a Low-rank Contextual RLHF (LoCo-RLHF) framework that integrates contextual information to better model heterogeneous feedback while maintaining computational efficiency. Our approach builds on a contextual preference model, leveraging the intrinsic low-rank structure of the interaction between user contexts and query-answer pairs to mitigate the high dimensionality of feature representations. Furthermore, we address the challenge of distributional shifts in feedback through our Pessimism in Reduced Subspace (PRS) policy, inspired by pessimistic offline reinforcement learning techniques. We theoretically demonstrate that our policy achieves a tighter sub-optimality gap compared to existing methods. Extensive experiments validate the effectiveness of LoCo-RLHF, showcasing its superior performance in personalized RLHF settings and its robustness to distribution shifts.

artificial intelligence, arxiv preprint arxiv, machine learning, (15 more...)

arXiv.org Machine Learning

2412.19436

Country:

North America > United States > New York (0.14)
North America > United States > North Carolina (0.14)

Genre: Research Report (0.82)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Minimax Regret Learning for Data with Heterogeneous Subgroups

Mo, Weibin, Tang, Weijing, Xue, Songkai, Liu, Yufeng, Zhu, Ji

arXiv.org Machine LearningMay-2-2024

Modern complex datasets often consist of various sub-populations. To develop robust and generalizable methods in the presence of sub-population heterogeneity, it is important to guarantee a uniform learning performance instead of an average one. In many applications, prior information is often available on which sub-population or group the data points belong to. Given the observed groups of data, we develop a min-max-regret (MMR) learning framework for general supervised learning, which targets to minimize the worst-group regret. Motivated from the regret-based decision theoretic framework, the proposed MMR is distinguished from the value-based or risk-based robust learning methods in the existing literature. The regret criterion features several robustness and invariance properties simultaneously. In terms of generalizability, we develop the theoretical guarantee for the worst-case regret over a super-population of the meta data, which incorporates the observed sub-populations, their mixtures, as well as other unseen sub-populations that could be approximated by the observed ones. We demonstrate the effectiveness of our method through extensive simulation studies and an application to kidney transplantation data from hundreds of transplant centers.

artificial intelligence, estimator, machine learning, (16 more...)

arXiv.org Machine Learning

2405.01709

Country: North America > United States > North Carolina (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Nephrology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Low-Rank Online Dynamic Assortment with Dual Contextual Information

Lee, Seong Jin, Sun, Will Wei, Liu, Yufeng

arXiv.org Machine LearningApr-19-2024

As e-commerce expands, delivering real-time personalized recommendations from vast catalogs poses a critical challenge for retail platforms. Maximizing revenue requires careful consideration of both individual customer characteristics and available item features to optimize assortments over time. In this paper, we consider the dynamic assortment problem with dual contexts -- user and item features. In high-dimensional scenarios, the quadratic growth of dimensions complicates computation and estimation. To tackle this challenge, we introduce a new low-rank dynamic assortment model to transform this problem into a manageable scale. Then we propose an efficient algorithm that estimates the intrinsic subspaces and utilizes the upper confidence bound approach to address the exploration-exploitation trade-off in online decision making. Theoretically, we establish a regret bound of $\tilde{O}((d_1+d_2)r\sqrt{T})$, where $d_1, d_2$ represent the dimensions of the user and item features respectively, $r$ is the rank of the parameter matrix, and $T$ denotes the time horizon. This bound represents a substantial improvement over prior literature, made possible by leveraging the low-rank structure. Extensive simulations and an application to the Expedia hotel recommendation dataset further demonstrate the advantages of our proposed method.

artificial intelligence, data mining, machine learning, (20 more...)

arXiv.org Machine Learning

2404.17592

Country: North America > United States > North Carolina (0.14)

Genre: Research Report (0.82)

Industry:

Health & Medicine (0.52)
Information Technology > Services (0.34)
Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Data Science > Data Mining (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Add feedback

Consistency of Lloyd's Algorithm Under Perturbations

Patel, Dhruv, Shen, Hui, Bhamidi, Shankar, Liu, Yufeng, Pipiras, Vladas

arXiv.org Artificial IntelligenceSep-1-2023

In the context of unsupervised learning, Lloyd's algorithm is one of the most widely used clustering algorithms. It has inspired a plethora of work investigating the correctness of the algorithm under various settings with ground truth clusters. In particular, in 2016, Lu and Zhou have shown that the mis-clustering rate of Lloyd's algorithm on $n$ independent samples from a sub-Gaussian mixture is exponentially bounded after $O(\log(n))$ iterations, assuming proper initialization of the algorithm. However, in many applications, the true samples are unobserved and need to be learned from the data via pre-processing pipelines such as spectral methods on appropriate data matrices. We show that the mis-clustering rate of Lloyd's algorithm on perturbed samples from a sub-Gaussian mixture is also exponentially bounded after $O(\log(n))$ iterations under the assumptions of proper initialization and that the perturbation is small relative to the sub-Gaussian noise. In canonical settings with ground truth clusters, we derive bounds for algorithms such as $k$-means$++$ to find good initializations and thus leading to the correctness of clustering via the main result. We show the implications of the results for pipelines measuring the statistical significance of derived clusters from data such as SigClust. We use these general results to derive implications in providing theoretical guarantees on the misclustering rate for Lloyd's algorithm in a host of applications, including high-dimensional time series, multi-dimensional scaling, and community detection for sparse networks via spectral clustering.

artificial intelligence, data mining, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2309.00578

Country:

North America > United States > North Carolina (0.14)
North America > Canada > Quebec > Montreal (0.14)

Genre: Research Report > Experimental Study (0.65)

Industry:

Government > Regional Government > North America Government > United States Government (0.46)
Health & Medicine (0.45)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.66)

Add feedback

Asymptotic Inference for Multi-Stage Stationary Treatment Policy with High Dimensional Features

Gao, Daiqi, Liu, Yufeng, Zeng, Donglin

arXiv.org Artificial IntelligenceMay-22-2023

Dynamic treatment rules or policies are a sequence of decision functions over multiple stages that are tailored to individual features. One important class of treatment policies for practice, namely multi-stage stationary treatment policies, prescribe treatment assignment probabilities using the same decision function over stages, where the decision is based on the same set of features consisting of both baseline variables (e.g., demographics) and time-evolving variables (e.g., routinely collected disease biomarkers). Although there has been extensive literature to construct valid inference for the value function associated with the dynamic treatment policies, little work has been done for the policies themselves, especially in the presence of high dimensional feature variables. We aim to fill in the gap in this work. Specifically, we first estimate the multistage stationary treatment policy based on an augmented inverse probability weighted estimator for the value function to increase the asymptotic efficiency, and further apply a penalty to select important feature variables. We then construct one-step improvement of the policy parameter estimators. Theoretically, we show that the improved estimators are asymptotically normal, even if nuisance parameters are estimated at a slow convergence rate and the dimension of the feature variables increases with the sample size. Our numerical studies demonstrate that the proposed method has satisfactory performance in small samples, and that the performance can be improved with a choice of the augmentation term that approximates the rewards or minimizes the variance of the value function.

artificial intelligence, estimator, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2301.12553

Country: North America > United States > Massachusetts (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Strength High (0.93)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Rejoinder: Learning Optimal Distributionally Robust Individualized Treatment Rules

Mo, Weibin, Qi, Zhengling, Liu, Yufeng

arXiv.org Machine LearningOct-17-2021

We thank the opportunity offered by editors for this discussion and the discussants for their insightful comments and thoughtful contributions. We also want to congratulate Kallus (2020) for his inspiring work in improving the efficiency of policy learning by retargeting. Motivated from the discussion in Dukes and Vansteelandt (2020), we first point out interesting connections and distinctions between our work and Kallus (2020) in Section 1. In particular, the assumptions and sources of variation for consideration in these two papers lead to different research problems with different scopes and focuses. In Section 2, following the discussions in Li et al. (2020); Liang and Zhao (2020), we also consider the efficient policy evaluation problem when we have some data from the testing distribution available at the training stage. We show that under the assumption that the sample sizes from training and testing are growing in the same order, efficient value function estimates can deliver competitive performance. We further show some connections of these estimates with existing literature. However, when the growth of testing sample size available for training is in a slower order, efficient value function estimates may not perform well anymore. In contrast, the requirement of the testing sample size for DRITR is not as strong as that of efficient policy evaluation using the combined data. Finally, we highlight the general applicability and usefulness of DRITR in Section 3.

artificial intelligence, health & medicine, machine learning, (17 more...)

arXiv.org Machine Learning

doi: 10.1080/01621459.2020.1866581

2110.08936

Country: North America > United States > North Carolina (0.14)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Efficient Learning of Optimal Individualized Treatment Rules for Heteroscedastic or Misspecified Treatment-Free Effect Models

Mo, Weibin, Liu, Yufeng

arXiv.org Machine LearningSep-6-2021

Recent development in data-driven decision science has seen great advances in individualized decision making. Given data with individual covariates, treatment assignments and outcomes, researchers can search for the optimal individualized treatment rule (ITR) that maximizes the expected outcome. Existing methods typically require initial estimation of some nuisance models. The double robustness property that can protect from misspecification of either the treatment-free effect or the propensity score has been widely advocated. However, when model misspecification exists, a doubly robust estimate can be consistent but may suffer from downgraded efficiency. Other than potential misspecified nuisance models, most existing methods do not account for the potential problem when the variance of outcome is heterogeneous among covariates and treatment. We observe that such heteroscedasticity can greatly affect the estimation efficiency of the optimal ITR. In this paper, we demonstrate that the consequences of misspecified treatment-free effect and heteroscedasticity can be unified as a covariate-treatment dependent variance of residuals. To improve efficiency of the estimated ITR, we propose an Efficient Learning (E-Learning) framework for finding an optimal ITR in the multi-armed treatment setting. We show that the proposed E-Learning is optimal among a regular class of semiparametric estimates that can allow treatment-free effect misspecification. In our simulation study, E-Learning demonstrates its effectiveness if one of or both misspecified treatment-free effect and heteroscedasticity exist. Our analysis of a Type 2 Diabetes Mellitus (T2DM) observational study also suggests the improved efficiency of E-Learning.

diabetes, e-learning, educational setting, (17 more...)

arXiv.org Machine Learning

2109.0257

Country: North America > United States > North Carolina (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)
Education > Educational Setting (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.45)

Add feedback

Learning Optimal Distributionally Robust Individualized Treatment Rules

Mo, Weibin, Qi, Zhengling, Liu, Yufeng

arXiv.org Machine LearningJun-26-2020

Recent development in the data-driven decision science has seen great advances in individualized decision making. Given data with individual covariates, treatment assignments and outcomes, policy makers best individualized treatment rule (ITR) that maximizes the expected outcome, known as the value function. Many existing methods assume that the training and testing distributions are the same. However, the estimated optimal ITR may have poor generalizability when the training and testing distributions are not identical. In this paper, we consider the problem of finding an optimal ITR from a restricted ITR class where there is some unknown covariate changes between the training and testing distributions. We propose a novel distributionally robust ITR (DR-ITR) framework that maximizes the worst-case value function across the values under a set of underlying distributions that are "close" to the training distribution. The resulting DR-ITR can guarantee the performance among all such distributions reasonably well. We further propose a calibrating procedure that tunes the DR-ITR adaptively to a small amount of calibration data from a target population. In this way, the calibrated DR-ITR can be shown to enjoy better generalizability than the standard ITR based on our numerical studies.

immunology, internal medicine, testing distribution, (20 more...)

arXiv.org Machine Learning

2006.15121

Country: North America > United States > North Carolina (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Strength High (0.93)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.67)
Health & Medicine > Therapeutic Area > Immunology > HIV (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science (0.66)

Add feedback

Adaptive Wasserstein Hourglass for Weakly Supervised Hand Pose Estimation from Monocular RGB

Zhang, Yumeng, Chen, Li, Liu, Yufeng, Yong, Junhai, Zheng, Wen

arXiv.org Artificial IntelligenceSep-10-2019

Insufficient labeled training datasets is one of the bottlenecks of 3D hand pose estimation from monocular RGB images. Synthetic datasets have a large number of images with precise annotations, but the obvious difference with real-world datasets impacts the generalization. Little work has been done to bridge the gap between two domains over their wide difference. In this paper, we propose a domain adaptation method called Adaptive Wasserstein Hourglass (AW Hourglass) for weakly-supervised 3D hand pose estimation, which aims to distinguish the difference and explore the common characteristics (e.g. hand structure) of synthetic and real-world datasets. Learning the common characteristics helps the network focus on pose-related information. The similarity of the characteristics makes it easier to enforce domain-invariant constraints. During training, based on the relation between these common characteristics and 3D pose learned from fully-annotated synthetic datasets, it is beneficial for the network to restore the 3D pose of weakly labeled real-world datasets with the aid of 2D annotations and depth images. While in testing, the network predicts the 3D pose with the input of RGB.

dataset, neural network, video understanding, (17 more...)

arXiv.org Artificial Intelligence

1909.05666

Country:

Asia > China (0.14)
Oceania > Australia (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Vision > Video Understanding (0.85)

Add feedback

Estimation of Individualized Decision Rules Based on an Optimized Covariate-Dependent Equivalent of Random Outcomes

Qi, Zhengling, Cui, Ying, Liu, Yufeng, Pang, Jong-Shi

arXiv.org Machine LearningAug-27-2019

Recent exploration of optimal individualized decision rul es (IDRs) for patients in precision medicine has attracted a lot of attention due to th e heterogeneous responses of patients to different treatments. In the existing literature of precisi on medicine, an optimal IDR is defined as a decision function mapping from the patients' covariate spa ce into the treatment space that maximizes the expected outcome of each individual. Motivated by the co ncept of Optimized Certainty Equivalent (OCE) introduced originally in [2] that includes the p opular conditional-value-of risk (CVaR) [21], we propose a decision-rule based optimized covariate s dependent equivalent (CDE) for individualized decision making problems. Our proposed IDR-CDE bro adens the existing expected-mean outcome framework in precision medicine and enriches the pr evious concept of the OCE. Under a functional margin description of the decision rule modeled by an indicator function as in the literature of large-margin classifiers, we study the mathematical problem of estimating an optimal IDRs in two cases: in one case, an optimal solution can be obtained "explicitly" that involves the implicit evaluation of an OCE; the other case requires the numerical s olution of an empirical minimization problem obtained by sampling the underlying distributions of the random variables involved.

null, oncology, optimization problem, (17 more...)

arXiv.org Machine Learning

1908.10742

Country:

North America > United States > North Carolina > Orange County > Chapel Hill (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Strength High (0.67)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.67)
Government > Regional Government > North America Government > United States Government (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Add feedback