 itr


Deep Optimal Individualized Treatment Rules for Bivariate Survival Outcomes via Adaptive Prediction-Powered Learning

arXiv.org Machine Learning

In randomized trials involving multiple treatments, bivariate survival outcomes present significant analytical challenges for decision making. This paper addresses the problem of deriving optimal individualized treatment rules that maximize the joint survival probability beyond fixed time points $(t_1, t_2)$ through deep neural networks, while accounting for right censoring. We propose a novel approach that models treatment rules via stochastic policies, coupling marginal accelerated failure time models via a link function to capture bivariate dependence. To enhance the robustness and effectiveness of decision making, we introduce an adaptive prediction-powered method that leverages auxiliary predictions from machine learning models.
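
As a reading aid, the objective described in this abstract can be written out as follows. This is a sketch in assumed notation, not the paper's own formulation: $X$ denotes covariates, $a$ a treatment arm, $(T_1, T_2)$ the two survival times, and $\pi(a \mid x)$ a stochastic policy; none of these symbols appear in the abstract itself.

$$
V(\pi) = \mathbb{E}_X\Big[\sum_{a} \pi(a \mid X)\, \Pr\big(T_1 > t_1,\ T_2 > t_2 \mid X,\ A = a\big)\Big],
\qquad
\pi^{\mathrm{opt}} = \arg\max_{\pi} V(\pi).
$$

Under this reading, the treatment rule is the policy that places the most mass on the arm with the highest joint survival probability at $(t_1, t_2)$ for each covariate profile; how right censoring enters the estimator is not specified in the abstract and is left out here.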



An effective framework for estimating individualized treatment rules

Neural Information Processing Systems

Estimating individualized treatment rules (ITRs) is fundamental in causal inference, particularly for precision medicine applications. Traditional ITR estimation methods rely on inverse probability weighting (IPW) to address confounding factors and $L_1$-penalization for simplicity and interpretability. However, IPW can introduce statistical bias without precise propensity score modeling, while $L_1$-penalization makes the objective non-smooth, leading to computational bias and requiring subgradient methods. In this paper, we propose a unified ITR estimation framework formulated as a constrained, weighted, and smooth convex optimization problem. The optimal ITR can be robustly and effectively computed by projected gradient descent. Our comprehensive theoretical analysis reveals that weights that balance the spectrum of a "weighted design matrix" improve both the optimization and likelihood landscapes, yielding improved computational and statistical estimation guarantees. In particular, this is achieved by distributional covariate balancing weights, which are model-free alternatives to IPW. Extensive simulations and applications demonstrate that our framework achieves significant gains in both robustness and effectiveness for ITR learning against existing methods.
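
To make the phrase "constrained, weighted, and smooth convex optimization problem ... computed by projected gradient descent" concrete, here is a minimal sketch. It is not the paper's estimator: the weighted least-squares loss, the $L_2$-ball constraint, and the names `project_l2_ball` and `projected_gradient_descent` are assumptions chosen only for illustration, and the weights `w` are taken as given rather than computed by covariate balancing.

```python
import numpy as np


def project_l2_ball(beta, radius):
    """Euclidean projection of beta onto the ball {b : ||b||_2 <= radius}."""
    norm = np.linalg.norm(beta)
    if norm <= radius:
        return beta
    return beta * (radius / norm)


def projected_gradient_descent(X, y, w, radius, n_iters=500):
    """Minimize 0.5 * sum_i w_i * (y_i - x_i^T beta)^2 s.t. ||beta||_2 <= radius.

    Assumed (hypothetical) problem form: the loss and constraint set are
    illustrative stand-ins, not taken from the paper.
    """
    n, p = X.shape
    Xw = X * w[:, None]                      # rows of X scaled by their weights
    # The gradient is Lipschitz with constant lambda_max(X^T W X),
    # the top eigenvalue of the weighted design (Gram) matrix.
    lipschitz = np.linalg.eigvalsh(X.T @ Xw)[-1]
    step = 1.0 / lipschitz
    beta = np.zeros(p)
    for _ in range(n_iters):
        grad = Xw.T @ (X @ beta - y)         # X^T W (X beta - y)
        beta = project_l2_ball(beta - step * grad, radius)
    return beta
```

In this toy form, the step size depends on the spectrum of $X^\top W X$, which gives some intuition for why weights that balance that spectrum could help optimization; the paper's actual guarantees are stated in the paper, not derived here.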








Iteratively Learn Diverse Strategies with State Distance Information

Neural Information Processing Systems

In complex reinforcement learning (RL) problems, policies with similar rewards may have substantially different behaviors. It remains a fundamental challenge to optimize rewards while also discovering as many strategies as possible, which can be crucial in many practical applications. Our study examines two design choices for tackling this challenge. First, we find that with existing diversity measures, visually indistinguishable policies can still yield high diversity scores. To accurately capture the behavioral difference, we propose to incorporate state-space distance information into the diversity measure.
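
The idea of a diversity measure built on state-space distance can be illustrated with a toy sketch. The function below is hypothetical and is not the measure proposed in the paper: it assumes two trajectories of equal length, stored as arrays of state vectors, and scores diversity as the mean Euclidean distance between time-aligned states.

```python
import numpy as np


def state_distance_diversity(states_a, states_b):
    """Mean Euclidean distance between time-aligned states of two trajectories.

    Hypothetical illustration only: both inputs are (T, d) arrays of states
    visited by two policies over the same T time steps.
    """
    a = np.asarray(states_a, dtype=float)
    b = np.asarray(states_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("trajectories must have the same shape (T, d)")
    return float(np.linalg.norm(a - b, axis=1).mean())
```

Under this toy measure, two policies that visit the same states score zero regardless of how their action distributions differ, which is the kind of behavior the abstract asks of a diversity measure.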