 expit


A Censored Transformed Model for Proportional Outcomes with Boundary Mass and an Application to Loss Given Default Modeling

arXiv.org Machine Learning

We introduce the zero-one censored transformed normal (ZOC-TN) model for proportional responses with potential probability mass at the boundaries 0 and 1. The model combines a censored Gaussian variable with a two-parameter affine-logit transformation on the interior (0,1). We characterize the transformation parameters, establish large-sample properties, and relate the affine-logit specification to broader classes of interior distributions. Theoretical and experimental results demonstrate that the proposed model can capture a wider range of qualitative density shapes than several benchmark models while remaining parsimonious, computationally efficient, and numerically stable. Furthermore, the ZOC-TN model can be extended (i) to account for nonlinearities and interactions in a tree-boosting machine learning framework and (ii) to explicitly model residual spatio-temporal variability. We apply the ZOC-TN model to loss given default (LGD) modeling for a large dataset of U.S. residential mortgages and compare it to multiple benchmark models. We find that a tree-boosted ZOC-TN model with a spatio-temporal frailty Gaussian process delivers the strongest out-of-sample performance, indicating that mortgage losses are shaped by nonlinear covariate effects and by unaccounted-for space-time variation.
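The abstract does not spell out the model equations, so the following is only an illustrative sketch of the general idea of a censored Gaussian combined with a logit-type transformation on the interior. It assumes a latent Gaussian variable censored at 0 and 1, with interior values passed through a hypothetical map `expit(a + b * logit(z))`. The function name `sample_censored_transformed`, the parameters `a` and `b`, and the censoring thresholds are all assumptions made for illustration, not the paper's specification.

```python
import math
import random


def expit(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def logit(p):
    """Inverse of expit, for p strictly inside (0, 1)."""
    return math.log(p) - math.log1p(-p)


def sample_censored_transformed(mu, sigma, a, b, rng):
    """Draw one response in [0, 1] (hypothetical sketch, not the paper's model).

    A latent z ~ N(mu, sigma^2) is censored at 0 and 1, which produces point
    masses at the boundaries. Interior values are reshaped by the assumed
    two-parameter map expit(a + b * logit(z)), which sends (0, 1) to (0, 1).
    """
    z = rng.gauss(mu, sigma)
    if z <= 0.0:
        return 0.0
    if z >= 1.0:
        return 1.0
    return expit(a + b * logit(z))


rng = random.Random(0)
draws = [sample_censored_transformed(0.5, 1.0, 0.3, 1.5, rng) for _ in range(1000)]
share_zero = sum(y == 0.0 for y in draws) / len(draws)
share_one = sum(y == 1.0 for y in draws) / len(draws)
```

Under these assumptions, setting `a = 0` and `b = 1` makes the interior map the identity, so the sketch reduces to an ordinary two-sided censored normal.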


Network Causal Effect Estimation In Graphical Models Of Contagion And Latent Confounding

arXiv.org Machine Learning

A key question in many network studies is whether the observed correlations between units are primarily due to contagion or latent confounding. Here, we study this question using a segregated graph (Shpitser, 2015) representation of these mechanisms, and examine how uncertainty about the true underlying mechanism impacts downstream computation of network causal effects, particularly under full interference -- settings where we only have a single realization of a network and each unit may depend on any other unit in the network. Under certain assumptions about asymptotic growth of the network, we derive likelihood ratio tests that can be used to identify whether different sets of variables -- confounders, treatments, and outcomes -- across units exhibit dependence due to contagion or latent confounding. We then propose network causal effect estimation strategies that provide unbiased and consistent estimates if the dependence mechanisms are either known or correctly inferred using our proposed tests. Together, the proposed methods allow network effect estimation in a wider range of full interference scenarios that have not been considered in prior work. We evaluate the effectiveness of our methods with synthetic data and the validity of our assumptions using real-world networks.


Parametrization Cookbook: A set of Bijective Parametrizations for using Machine Learning methods in Statistical Inference

arXiv.org Machine Learning

We present in this paper a way to transform a constrained statistical inference problem into an unconstrained one so that modern computational methods can be used, such as those based on automatic differentiation, GPU computing, and stochastic gradients with mini-batches. Unlike the parametrizations classically used in Machine Learning, the parametrizations introduced here are all bijective and are even diffeomorphisms, and therefore preserve properties that matter for statistical inference, identifiability first among them. This cookbook presents a set of recipes for transforming a constrained problem into an unconstrained one. For ease of use, this paper is both a cookbook and a Python package that supports parametrizations with numpy, JAX, and PyTorch, and that provides a high-level, expressive interface for describing a parametrization, so that a difficult statistical inference problem can be turned into an easier one addressable with modern optimization tools.
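As a minimal sketch of what a bijective parametrization looks like in practice, the following standard-library Python maps an unconstrained vector in R^3 to a mean, a positive scale, and a probability, and back again. It does not use the package described in the abstract: the function names `to_constrained` and `to_unconstrained`, and the choice of softplus and expit as the bijections, are assumptions made for illustration.

```python
import math


def softplus(x):
    """Stable log(1 + exp(x)); maps R onto (0, inf)."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def softplus_inv(y):
    """Inverse of softplus for y > 0: log(exp(y) - 1) = y + log(1 - exp(-y))."""
    return y + math.log(-math.expm1(-y))


def expit(x):
    """Stable logistic function; maps R onto (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def logit(p):
    """Inverse of expit for p in (0, 1)."""
    return math.log(p) - math.log1p(-p)


def to_constrained(theta):
    """Map theta in R^3 to (mu in R, sigma > 0, p in (0, 1))."""
    t_mu, t_sigma, t_p = theta
    return (t_mu, softplus(t_sigma), expit(t_p))


def to_unconstrained(params):
    """Inverse map: (mu, sigma, p) back to a point in R^3."""
    mu, sigma, p = params
    return (mu, softplus_inv(sigma), logit(p))


theta = (0.3, -2.0, 1.5)
params = to_constrained(theta)
theta_back = to_unconstrained(params)
```

Because each component map is smooth and invertible, an optimizer can work freely on `theta` while the model only ever sees valid constrained values, and distinct `theta` always give distinct parameters.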


Estimating Treatment Effect under Additive Hazards Models with High-dimensional Covariates

arXiv.org Machine Learning

Estimating causal effects for survival outcomes in the high-dimensional setting is an extremely important topic for many biomedical applications as well as areas of social sciences. We propose a new orthogonal score method for treatment effect estimation and inference that results in asymptotically valid confidence intervals assuming only good estimation properties of the hazard outcome model and the conditional probability of treatment. This guarantee allows us to provide valid inference for the conditional treatment effect under the high-dimensional additive hazards model under considerably more generality than existing approaches. In addition, we develop a new Hazards Difference (HDi) estimator. We showcase that our approach has double-robustness properties in high dimensions: with cross-fitting, the HDi estimate is consistent under a wide variety of treatment assignment models; the HDi estimate is also consistent when the hazards model is misspecified and instead the true data generating mechanism follows a partially linear additive hazards model. We further develop a novel sparsity doubly robust result, where either the outcome or the treatment model can be a fully dense high-dimensional model. We apply our methods to study the treatment effect of radical prostatectomy versus conservative management for prostate cancer patients using the SEER-Medicare Linked Data.


Estimation and Optimization of Composite Outcomes

arXiv.org Machine Learning

There is tremendous interest in precision medicine as a means to improve patient outcomes by tailoring treatment to individual characteristics. An individualized treatment rule formalizes precision medicine as a map from patient information to a recommended treatment. A treatment rule is defined to be optimal if it maximizes the mean of a scalar outcome in a population of interest, e.g., symptom reduction. However, clinical and intervention scientists often must balance multiple and possibly competing outcomes, e.g., symptom reduction and the risk of an adverse event. One approach to precision medicine in this setting is to elicit a composite outcome which balances all competing outcomes; unfortunately, eliciting a composite outcome directly from patients is difficult without a high-quality instrument, and an expert-derived composite outcome may not account for heterogeneity in patient preferences. We propose a new paradigm for the study of precision medicine using observational data that relies solely on the assumption that clinicians are approximately (i.e., imperfectly) making decisions to maximize individual patient utility. Estimated composite outcomes are subsequently used to construct an estimator of an individualized treatment rule which maximizes the mean of patient-specific composite outcomes. The estimated composite outcomes and estimated optimal individualized treatment rule provide new insights into patient preference heterogeneity, clinician behavior, and the value of precision medicine in a given domain. We derive inference procedures for the proposed estimators under mild conditions and demonstrate their finite sample performance through a suite of simulation experiments and an illustrative application to data from a study of bipolar depression.


Weighted Orthogonal Components Regression Analysis

arXiv.org Machine Learning

In the multiple linear regression setting, we propose a general framework, termed weighted orthogonal components regression (WOCR), which encompasses many known methods as special cases, including ridge regression and principal components regression. WOCR makes use of the monotonicity inherent in orthogonal components to parameterize the weight function. The formulation allows for efficient determination of tuning parameters and hence is computationally advantageous. Moreover, WOCR offers insights for deriving new and better variants. Specifically, we advocate weighting components based on their correlations with the response, which leads to enhanced predictive performance. Both simulation studies and real data examples are provided to assess and illustrate the advantages of the proposed methods.
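The claim that ridge regression and principal components regression are special cases of a weighted-components estimator can be checked numerically. The sketch below uses the thin SVD X = U diag(d) V^T and writes the coefficient vector as a weighted sum over components. The function names and the simulated data are hypothetical, and the abstract does not describe the paper's own parameterization of the weight function, so only the two well-known special cases are shown.

```python
import numpy as np


def weighted_components_fit(X, y, weights):
    """beta = sum_j w_j * (u_j' y / d_j) * v_j, from the thin SVD of X."""
    U, d, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt.T @ (weights * (U.T @ y) / d)


def ridge_weights(d, lam):
    """Ridge shrinks component j by d_j^2 / (d_j^2 + lam)."""
    return d**2 / (d**2 + lam)


def pcr_weights(d, k):
    """Principal components regression keeps the first k components."""
    w = np.zeros_like(d)
    w[:k] = 1.0
    return w


# Simulated, centered data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
X -= X.mean(axis=0)
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=50)
y -= y.mean()

d = np.linalg.svd(X, compute_uv=False)
lam = 3.0

# Ridge via component weights versus the closed-form ridge solution.
beta_weighted_ridge = weighted_components_fit(X, y, ridge_weights(d, lam))
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(4), X.T @ y)

# PCR with every component retained versus ordinary least squares.
beta_weighted_full = weighted_components_fit(X, y, pcr_weights(d, 4))
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
```

Under this view, ridge and PCR differ only in the weight vector: smooth shrinkage in one case, a hard 0/1 cutoff in the other.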