 ecv


Saddle Networks: Structure-Preserving Architectures for Convex-Concave Functions

arXiv.org Machine Learning

Saddle-point models arise throughout optimization, optimal transport, robust learning, and control. In many applications, the relevant function f(x,y) is convex in x and concave in y, and preserving this geometry is essential for obtaining tractable min-max formulations and reliable certificates. We introduce a structured separable decomposition that preserves the convex-concave geometry and prove a complete one-dimensional approximation theorem under a mixed Monge-type convexity condition. We then describe practical saddle network architectures that preserve convexity in x and concavity in y by construction. The proposed architectures require only convexity-preserving neural networks, together with simple output transformations enforcing sign and concavity constraints. Finally, we report numerical benchmarks in dimensions 1 and 5, showing that the proposed saddle networks achieve high accuracy on smooth, nonsmooth, and high-rank convex-concave test functions.
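To make "convex in x and concave in y by construction" concrete, the sketch below builds the simplest such function, f(x, y) = g(x) - h(y), where g and h are small input-convex networks (nonnegative weights on hidden-to-hidden paths, ReLU activations). This additive form is an illustrative assumption and is not the architecture of the paper, whose decomposition is richer; the names `init_icnn`, `icnn`, and `saddle` are hypothetical.

```python
import numpy as np

def init_icnn(rng, dim, hidden):
    """Random parameters for a two-layer input-convex network (hypothetical helper)."""
    return {
        "W0": rng.normal(size=(hidden, dim)),
        "b0": rng.normal(size=hidden),
        # Weights acting on hidden activations must be nonnegative for convexity.
        "Wz": np.abs(rng.normal(size=(hidden, hidden))),
        "Wx": rng.normal(size=(hidden, dim)),
        "b1": rng.normal(size=hidden),
        "wz": np.abs(rng.normal(size=hidden)),
        "wx": rng.normal(size=dim),
    }

def icnn(params, x):
    """Scalar output that is convex in x: ReLU is convex and nondecreasing,
    and nonnegative combinations of convex functions stay convex."""
    z1 = np.maximum(params["W0"] @ x + params["b0"], 0.0)
    z2 = np.maximum(params["Wz"] @ z1 + params["Wx"] @ x + params["b1"], 0.0)
    return float(params["wz"] @ z2 + params["wx"] @ x)

def saddle(g_params, h_params, x, y):
    """f(x, y) = g(x) - h(y): convex in x, concave in y (additive special case)."""
    return icnn(g_params, x) - icnn(h_params, y)

rng = np.random.default_rng(0)
g_params = init_icnn(rng, dim=5, hidden=8)
h_params = init_icnn(rng, dim=5, hidden=8)
value = saddle(g_params, h_params, rng.normal(size=5), rng.normal(size=5))
```

Because the two arguments never interact here, this form cannot represent coupled saddle functions such as f(x, y) = x·y; it only illustrates how sign constraints on weights and a negated output enforce the geometry.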


Identifying and Improving Disability Bias in GAI-Based Resume Screening

arXiv.org Artificial Intelligence

As Generative AI rises in adoption, its use has expanded to include domains such as hiring and recruiting. However, without examining the potential for bias, this may negatively impact marginalized populations, including people with disabilities. To address this important concern, we present a resume audit study, in which we ask ChatGPT (specifically, GPT-4) to rank a resume against the same resume enhanced with an additional leadership award, scholarship, panel presentation, and membership that are disability related. We find that GPT-4 exhibits prejudice towards these enhanced CVs. Further, we show that this prejudice can be quantifiably reduced by training a custom GPT on principles of DEI and disability justice. Our study also includes a unique qualitative analysis of the types of direct and indirect ableism GPT-4 uses to justify its biased decisions and suggests directions for additional bias mitigation work. Additionally, since these justifications are presumably drawn from training data containing real-world biased statements made by humans, our analysis suggests additional avenues for understanding and addressing human bias.


Extrapolated cross-validation for randomized ensembles

arXiv.org Machine Learning

Ensemble methods such as bagging and random forests are ubiquitous in various fields, from finance to genomics. Despite their prevalence, the question of the efficient tuning of ensemble parameters has received relatively little attention. This paper introduces a cross-validation method, ECV (Extrapolated Cross-Validation), for tuning the ensemble and subsample sizes in randomized ensembles. Our method builds on two primary ingredients: initial estimators for small ensemble sizes using out-of-bag errors and a novel risk extrapolation technique that leverages the structure of prediction risk decomposition. By establishing uniform consistency of our risk extrapolation technique over ensemble and subsample sizes, we show that ECV yields $\delta$-optimal (with respect to the oracle-tuned risk) ensembles for squared prediction risk. Our theory accommodates general ensemble predictors, only requires mild moment assumptions, and allows for high-dimensional regimes where the feature dimension grows with the sample size. As a practical case study, we employ ECV to predict surface protein abundances from gene expressions in single-cell multiomics using random forests. In comparison to sample-split cross-validation and $K$-fold cross-validation, ECV achieves higher accuracy by avoiding sample splitting. At the same time, its computational cost is considerably lower owing to the use of the risk extrapolation technique. Additional numerical results validate the finite-sample accuracy of ECV for several common ensemble predictors under a computational constraint on the maximum ensemble size.
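The extrapolation step can be sketched under one stated assumption: that the squared prediction risk of an ensemble averaging M exchangeable randomized predictors has the form R_M = a + b/M. Under that assumption, risk estimates at sizes 1 and 2 (for example, from out-of-bag errors) determine the whole curve. The function names below are hypothetical, and the abstract does not spell out the exact estimator, so this is an illustration rather than the authors' implementation.

```python
def extrapolate_risk(r1, r2, m):
    """Estimate the squared risk at ensemble size m from estimates at sizes 1 and 2.

    Assumes R_M = a + b / M, which gives a = 2*r2 - r1 and b = 2*(r1 - r2).
    """
    if m < 1:
        raise ValueError("ensemble size must be at least 1")
    r_inf = 2.0 * r2 - r1          # limiting risk as m grows without bound
    return r_inf + 2.0 * (r1 - r2) / m

def smallest_delta_optimal_size(r1, r2, delta, m_max):
    """Smallest m <= m_max whose extrapolated risk is within delta of the limit.

    Falls back to m_max when no size meets the tolerance.
    """
    r_inf = 2.0 * r2 - r1
    for m in range(1, m_max + 1):
        if extrapolate_risk(r1, r2, m) <= r_inf + delta:
            return m
    return m_max

# Hypothetical out-of-bag risk estimates for ensembles of size 1 and 2.
r1_hat, r2_hat = 1.0, 0.8
risk_at_4 = extrapolate_risk(r1_hat, r2_hat, 4)
chosen_m = smallest_delta_optimal_size(r1_hat, r2_hat, delta=0.06, m_max=500)
```

Only two small ensembles need to be fitted and scored; every larger size is read off the fitted curve, which is where the computational saving over refitting at each candidate size comes from.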


A Deep Learning Segmentation Pipeline for Cardiac T1 Mapping Using MRI Relaxation–based Synthetic Contrast Augmentation

#artificialintelligence

Cardiac MRI relaxometry is clinically used to quantitatively characterize various cardiovascular conditions, such as myocardial infarction (1), myocarditis (2), amyloidosis (3), and cardiomyopathy (4). Single-slice and multislice short-axis myocardial T1-mapping protocols have enabled quantification of global and local tissue alterations, including edema and fibrosis, across these pathologic states (5). Furthermore, precontrast (native) T1 (T1native) and postcontrast T1 (T1post) mapping can be combined to provide an estimate of extracellular volume (ECV). Clinically, T1 and ECV can be used to identify cardiac abnormalities and potentially to grade disease severity and stratify risk (6,7). Currently, the measurement of segmental myocardial T1 and ECV requires manual delineation of the left ventricle (LV) myocardium, LV blood pool, and right ventricular (RV) insertion point (RVIP) in both native and postcontrast T1 maps.
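For orientation, the commonly used way to combine native and postcontrast T1 into an ECV estimate is ECV = (1 - hematocrit) × ΔR1(myocardium) / ΔR1(blood), with ΔR1 = 1/T1post - 1/T1native. The excerpt does not state this formula or mention hematocrit, so the sketch below is an assumption about the standard calculation, and the function name and input values are hypothetical.

```python
def extracellular_volume(t1_native_myo, t1_post_myo,
                         t1_native_blood, t1_post_blood, hematocrit):
    """ECV fraction from mean T1 values (same time unit throughout, e.g. ms).

    Uses the change in relaxation rate R1 = 1/T1 after contrast in the
    myocardium relative to the blood pool, scaled by (1 - hematocrit).
    """
    delta_r1_myo = 1.0 / t1_post_myo - 1.0 / t1_native_myo
    delta_r1_blood = 1.0 / t1_post_blood - 1.0 / t1_native_blood
    return (1.0 - hematocrit) * delta_r1_myo / delta_r1_blood

# Illustrative (not measured) values in milliseconds.
ecv = extracellular_volume(
    t1_native_myo=1000.0, t1_post_myo=500.0,
    t1_native_blood=1600.0, t1_post_blood=300.0,
    hematocrit=0.40,
)
```

The myocardial and blood-pool T1 inputs are region averages, which is why the delineation of the LV myocardium and LV blood pool in both maps matters for the result.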


Network cross-validation by edge sampling

arXiv.org Machine Learning

Statistical methods for network data have received a lot of attention because of the wide-ranging applications of network analysis. There is now a large body of work on methods and models for networks, including the stochastic block model (SBM) [Holland et al., 1983], the degree-corrected stochastic block model (DCSBM) [Karrer and Newman, 2011], and the latent space model [Hoff et al., 2002], to name a few. While this gives the practitioner plenty of choices, there is far less work on the crucial question of how to select the best model for the data, as well as how to choose tuning parameters for the selected model, which is often necessary in order to fit it. In some specific problems, progress has been made recently, for instance, in the much-studied problem of community detection. Community detection is the problem of clustering network nodes into groups, and most of the methods proposed over the last twenty years or so require the number of communities K as input.