
Collaborating Authors

Stoica, Petre


Certified Inventory Control of Critical Resources

arXiv.org Machine Learning

Inventory control using discrete-time models is a well-studied problem, where orders of items to hold in stock must anticipate future demand [1, 2]. By defining the costs of insufficient stock, it is possible to find cost-minimizing policies using dynamic programming [3, 4, 5]. In practice, however, maintaining a certain service level of an inventory control system is a greater priority than cost minimization [6, 7]. Under certain restrictive assumptions on the demand process - such as memoryless and identically distributed demand - there are explicit formulations of the duality between service levels and costs [8].
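To make the dynamic-programming approach concrete, here is a minimal finite-horizon sketch in Python. All parameters (stock cap, cost coefficients, the uniform demand distribution) are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Minimal finite-horizon inventory DP (illustrative; all parameters assumed).
# State: stock level s in {0..S_MAX}; action: order quantity a; i.i.d. demand.
S_MAX, A_MAX, T = 20, 10, 12          # stock cap, max order, horizon
HOLD, SHORT, ORDER = 1.0, 10.0, 2.0   # per-unit holding, shortage, order costs
demand_vals = np.arange(0, 9)
demand_pmf = np.ones_like(demand_vals) / len(demand_vals)  # uniform demand

V = np.zeros(S_MAX + 1)               # terminal value function
policy = np.zeros((T, S_MAX + 1), dtype=int)
for t in reversed(range(T)):
    V_new = np.full(S_MAX + 1, np.inf)
    for s in range(S_MAX + 1):
        for a in range(min(A_MAX, S_MAX - s) + 1):
            cost = ORDER * a
            for d, p in zip(demand_vals, demand_pmf):
                s_next = max(s + a - d, 0)
                stage = HOLD * s_next + SHORT * max(d - s - a, 0)
                cost += p * (stage + V[s_next])
            if cost < V_new[s]:
                V_new[s], policy[t, s] = cost, a
    V = V_new

print("Cost-minimizing first-period order by stock level:", policy[0])
```

A service-level constraint, as the paper emphasizes, would replace or augment the shortage-cost term with an explicit constraint on stockout probability.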


Diagnostic Tool for Out-of-Sample Model Evaluation

arXiv.org Machine Learning

Assessment of model fitness is a key part of machine learning. The standard paradigm of model evaluation is analysis of the average loss over future data. This is often explicit in model fitting, where we select models that minimize the average loss over training data as a surrogate, but it comes with limited theoretical guarantees. In this paper, we consider the problem of characterizing a batch of out-of-sample losses of a model using a calibration data set. We provide finite-sample limits on the out-of-sample losses that are statistically valid under quite general conditions and propose a diagnostic tool that is simple to compute and interpret. Several numerical experiments are presented to show how the proposed method quantifies the impact of distribution shifts, aids the analysis of regression, and enables model selection as well as hyperparameter tuning.
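The construction underlying such distribution-free, finite-sample limits is typically an order statistic of the calibration losses. The sketch below shows that standard conformal-style bound; it conveys the flavor of the approach but is not the paper's exact diagnostic.

```python
import numpy as np

def loss_limit(calib_losses, alpha=0.1):
    """Distribution-free upper limit: under exchangeability, a new loss falls
    below the ceil((n+1)(1-alpha))-th order statistic of the n calibration
    losses with probability at least 1 - alpha (standard conformal bound)."""
    losses = np.sort(np.asarray(calib_losses))
    n = len(losses)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        return np.inf  # too few calibration points for this alpha
    return losses[k - 1]

rng = np.random.default_rng(0)
calib = rng.exponential(scale=1.0, size=200)   # assumed calibration losses
print("90% out-of-sample loss limit:", loss_limit(calib, alpha=0.1))
```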


Off-Policy Evaluation with Out-of-Sample Guarantees

arXiv.org Artificial Intelligence

We consider the problem of evaluating the performance of a decision policy using past observational data. The outcome of a policy is measured in terms of a loss (a.k.a. disutility or negative reward) and the main problem is making valid inferences about its out-of-sample loss when the past data was observed under a different and possibly unknown policy. Using a sample-splitting method, we show that it is possible to draw such inferences with finite-sample coverage guarantees about the entire loss distribution, rather than just its mean. Importantly, the method takes into account model misspecifications of the past policy - including unmeasured confounding. The evaluation method can be used to certify the performance of a policy using observational data under a specified range of credible model assumptions.
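The classical starting point for evaluating a target policy from logged data is importance weighting. The sketch below shows only that baseline (self-normalized importance sampling with assumed known behavior propensities); it is not the paper's sample-splitting method with coverage guarantees, which further handles misspecified propensities and confounding.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
# Logged data: action a drawn from a behavior policy; loss observed only
# for the taken action. All distributions here are assumed for illustration.
x = rng.normal(size=n)                       # context
p_b = 1 / (1 + np.exp(-x))                   # behavior prob. of action 1
a = (rng.random(n) < p_b).astype(int)
loss = np.where(a == 1, rng.exponential(1.0, n), rng.exponential(2.0, n))

# Target policy: always take action 1. Importance weight w = pi_t / pi_b,
# which is zero whenever the logged action disagrees with the target.
w = np.where(a == 1, 1.0 / p_b, 0.0)

# Self-normalized importance-sampling estimate of the target's mean loss.
est = np.sum(w * loss) / np.sum(w)
print("Estimated mean loss under target policy:", est)
```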


Fair principal component analysis (PCA): minorization-maximization algorithms for Fair PCA, Fair Robust PCA and Fair Sparse PCA

arXiv.org Machine Learning

In this paper we propose a new iterative algorithm to solve the fair PCA (FPCA) problem. We start with the max-min fair PCA formulation originally proposed in [1] and derive a simple and efficient iterative algorithm which is based on the minorization-maximization (MM) approach. The proposed algorithm relies on the relaxation of a semi-orthogonality constraint which is proved to be tight at every iteration of the algorithm. The vanilla version of the proposed algorithm requires solving a semi-definite program (SDP) at every iteration, which can be further simplified to a quadratic program by formulating the dual of the surrogate maximization problem. We also propose two important reformulations of the fair PCA problem: a) fair robust PCA -- which can handle outliers in the data, and b) fair sparse PCA -- which can enforce sparsity on the estimated fair principal components. The proposed algorithms are computationally efficient and monotonically increase their respective design objectives at every iteration. An added feature of the proposed algorithms is that they do not require the selection of any hyperparameter (except for the fair sparse PCA case where a penalty parameter that controls the sparsity has to be chosen by the user). We numerically compare the performance of the proposed methods with two of the state-of-the-art approaches on synthetic data sets and a real-life data set.
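To illustrate the max-min objective that the MM algorithm maximizes, the sketch below evaluates the worst-off group's captured variance for a candidate projection, here the top principal direction of the pooled data. The synthetic groups and function names are our assumptions; solving the SDP/QP iterations of the paper is not shown.

```python
import numpy as np

def group_variance(X, U):
    """Variance of (centered) group data X captured by the orthonormal basis U."""
    Xc = X - X.mean(axis=0)
    return float(np.trace(U.T @ (Xc.T @ Xc / len(Xc)) @ U))

def fair_objective(groups, U):
    """Max-min fair PCA objective of [1]: the worst-off group's captured variance."""
    return min(group_variance(X, U) for X in groups)

rng = np.random.default_rng(2)
A = rng.normal(size=(100, 5)) @ np.diag([3.0, 1, 1, 1, 1])  # group A: strong dim 0
B = rng.normal(size=(80, 5)) @ np.diag([1.0, 3, 1, 1, 1])   # group B: strong dim 1

pooled = np.vstack([A - A.mean(0), B - B.mean(0)])
U_pca = np.linalg.svd(pooled, full_matrices=False)[2][:1].T  # top pooled direction
print("captured variance A/B:", group_variance(A, U_pca), group_variance(B, U_pca))
print("max-min fair objective:", fair_objective([A, B], U_pca))
```

Ordinary PCA can favor whichever group dominates the pooled covariance; the fair formulation instead seeks the projection maximizing the printed max-min objective.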


Pearson-Matthews correlation coefficients for binary and multinary classification and hypothesis testing

arXiv.org Machine Learning

The Pearson-Matthews correlation coefficient (usually abbreviated MCC) is considered to be one of the most useful metrics for the performance of a binary classification or hypothesis testing method (for the sake of conciseness we will use the classification terminology throughout, but the concepts and methods discussed in the paper apply verbatim to hypothesis testing as well). For multinary classification tasks (with more than two classes) the existing extension of MCC, commonly called the $\text{R}_{\text{K}}$ metric, has also been successfully used in many applications. The present paper begins with an introductory discussion on certain aspects of MCC. Then we go on to discuss the topic of multinary classification that is the main focus of this paper and which, despite its practical and theoretical importance, appears to be less developed than the topic of binary classification. Our discussion of the $\text{R}_{\text{K}}$ metric is followed by the introduction of two other metrics for multinary classification derived from the multivariate Pearson correlation (MPC) coefficients. We show that both $\text{R}_{\text{K}}$ and the MPC metrics suffer from the problem of not decisively indicating poor classification results when they should, and introduce three new enhanced metrics that do not suffer from this problem. We also present an additional new metric for multinary classification which can be viewed as a direct extension of MCC.
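For reference, here are minimal implementations of the binary MCC and the standard $\text{R}_{\text{K}}$ extension (Gorodkin's multiclass generalization). The function names are ours; the paper's new enhanced metrics are not reproduced here.

```python
import numpy as np

def mcc_binary(y_true, y_pred):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = np.sqrt(float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def rk_multiclass(y_true, y_pred, K):
    """R_K metric computed from the K x K confusion matrix C[t, p]."""
    C = np.zeros((K, K))
    for t, p in zip(y_true, y_pred):
        C[t, p] += 1
    n = C.sum()
    num = n * np.trace(C) - C.sum(1) @ C.sum(0)
    den = np.sqrt(n**2 - C.sum(1) @ C.sum(1)) * np.sqrt(n**2 - C.sum(0) @ C.sum(0))
    return 0.0 if den == 0 else num / den

print("MCC:", mcc_binary([0, 0, 1, 1, 1, 0], [0, 1, 1, 1, 0, 0]))  # -> 1/3
```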


Tuned Regularized Estimators for Linear Regression via Covariance Fitting

arXiv.org Machine Learning

We consider the problem of finding tuned regularized parameter estimators for linear models. We start by showing that three known optimal linear estimators belong to a wider class of estimators that can be formulated as a solution to a weighted and constrained minimization problem. The optimal weights, however, are typically unknown in many applications. This begs the question: how should we choose the weights using only the data? We propose using the covariance-fitting SPICE methodology to obtain data-adaptive weights and show that the resulting class of estimators yields tuned versions of known regularized estimators - such as ridge regression, LASSO, and regularized least absolute deviation. These theoretical results unify several important estimators under a common umbrella. The resulting tuned estimators are also shown to be practically relevant by means of a number of numerical examples.
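As one concrete member of this family, ridge regression solves a weighted regularized least-squares problem with a user-chosen weight. The sketch below shows only that fixed-weight baseline; the paper's contribution is precisely to derive such weights from the data via covariance fitting rather than fixing them by hand. The data here are assumed for illustration.

```python
import numpy as np

def ridge(X, y, lam):
    """Ridge regression: argmin_w ||y - X w||^2 + lam * ||w||^2 (closed form)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

rng = np.random.default_rng(3)
X = rng.normal(size=(50, 8))
w_true = np.zeros(8)
w_true[:2] = [2.0, -1.0]
y = X @ w_true + 0.5 * rng.normal(size=50)

# lam is a hand-picked hyperparameter here; the SPICE-based approach of the
# paper would instead obtain data-adaptive weights, removing this choice.
print("ridge estimate (first 3 coefficients):", ridge(X, y, lam=1.0)[:3])
```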


Learning Pareto-Efficient Decisions with Confidence

arXiv.org Machine Learning

The paper considers the problem of multi-objective decision support when outcomes are uncertain. We extend the concept of Pareto-efficient decisions to take into account the uncertainty of decision outcomes across varying contexts. This enables quantifying trade-offs between decisions in terms of tail outcomes that are relevant in safety-critical applications. We propose a method for learning efficient decisions with statistical confidence, building on results from the conformal prediction literature. The method adapts to weak or nonexistent context covariate overlap and its statistical guarantees are evaluated using both synthetic and real data.
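A rough illustration of the idea: bound the tail outcome of each candidate decision with a conformal order statistic, then keep the decisions whose bound vectors are not dominated. The data, decision set, and helper names below are assumptions for illustration, not the paper's procedure (which additionally handles contexts and weak covariate overlap).

```python
import numpy as np

def conformal_upper(samples, alpha=0.1):
    """(1 - alpha) upper bound on a new outcome via the conformal order statistic."""
    s = np.sort(np.asarray(samples))
    k = int(np.ceil((len(s) + 1) * (1 - alpha)))
    return np.inf if k > len(s) else s[k - 1]

def pareto_mask(points):
    """True for points not dominated (lower is better in every coordinate)."""
    pts = np.asarray(points)
    return np.array([not np.any(np.all(pts <= p, axis=1) &
                                np.any(pts < p, axis=1)) for p in pts])

rng = np.random.default_rng(4)
# Three candidate decisions, two uncertain loss objectives each (assumed data).
bounds = [(conformal_upper(rng.gamma(2, 1 + d, 200)),
           conformal_upper(rng.gamma(2, 3 - d, 200))) for d in range(3)]
print("tail-loss bounds:", np.round(bounds, 2))
print("Pareto-efficient:", pareto_mask(bounds))
```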


Distributionally Robust Learning in Heterogeneous Contexts

arXiv.org Machine Learning

We consider the problem of learning from training data obtained in different contexts, where the test data is subject to distributional shifts. We develop a distributionally robust method that focuses on excess risks and achieves a more appropriate trade-off between performance and robustness than the conventional and overly conservative minimax approach. The proposed method is computationally feasible and provides statistical guarantees. We demonstrate its performance using both real and synthetic data.
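A toy numerical example of why excess risk changes the answer: minimizing the worst-case *risk* can favor a model that is uniformly mediocre, while minimizing the worst-case risk *in excess of what is achievable in each context* favors a model that is near-optimal everywhere. The risk table below is invented purely for illustration.

```python
import numpy as np

# Per-context risks of three candidate models in four contexts (assumed values).
R = np.array([[1.0, 2.0, 0.5, 3.0],    # model 0
              [1.2, 1.1, 1.0, 3.1],    # model 1
              [2.0, 1.0, 2.0, 2.9]])   # model 2

best = R.min(axis=0)                          # best achievable risk per context
minimax = np.argmin(R.max(axis=1))            # conventional minimax choice
robust = np.argmin((R - best).max(axis=1))    # minimax *excess*-risk choice
print("minimax risk picks model", minimax)        # -> model 2
print("minimax excess-risk picks model", robust)  # -> model 1
```

Here context 4 is hard for every model, so plain minimax fixates on it and picks model 2, while the excess-risk criterion picks model 1, which stays close to the per-context optimum throughout.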


Prediction of Spatial Point Processes: Regularized Method with Out-of-Sample Guarantees

arXiv.org Machine Learning

Spatial point processes can be found in a range of applications from astronomy and biology to ecology and criminology. These processes can be characterized by a nonnegative intensity function $\lambda(x)$ which predicts the number of events that occur across space parameterized by $x \in \mathcal{X}$ [8, 4]. A standard approach to estimate the intensity function of a process is to use nonparametric kernel density-based methods [6, 7]. These smoothing techniques require, however, careful tuning of kernel bandwidth parameters and are, more importantly, subject to selection biases. That is, in regions where no events have been observed, the intensity is inferred to be zero and no measure is readily available for a user to assess the uncertainty of such predictions. More advanced methods infer the intensity by assuming a parameterized model of the data-generating process, such as inhomogeneous Poisson point process models.
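The kernel baseline that the text critiques is a sum of Gaussian bumps centered at the observed events, $\hat{\lambda}(x) = \sum_i K_h(x - x_i)$. The sketch below implements it on assumed 2-D data; note how the estimate collapses toward zero away from the events, which is exactly the selection-bias issue raised above.

```python
import numpy as np

def kernel_intensity(events, grid, h):
    """Gaussian-kernel intensity estimate lambda(x) = sum_i K_h(x - x_i) in 2-D."""
    diff = grid[:, None, :] - events[None, :, :]        # (n_grid, n_events, 2)
    sq = np.sum(diff**2, axis=-1)
    return np.sum(np.exp(-sq / (2 * h**2)), axis=1) / (2 * np.pi * h**2)

rng = np.random.default_rng(5)
events = rng.uniform(0, 1, size=(60, 2))                # assumed event locations
gx, gy = np.meshgrid(np.linspace(0, 1, 25), np.linspace(0, 1, 25))
grid = np.column_stack([gx.ravel(), gy.ravel()])
lam = kernel_intensity(events, grid, h=0.1)
# lam is near zero wherever no events fell, with no uncertainty measure attached.
print("min/max estimated intensity:", lam.min().round(2), lam.max().round(2))
```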


Effect Inference from Two-Group Data with Sampling Bias

arXiv.org Machine Learning

In many applications, different populations are compared using data that are sampled in a biased manner. Under sampling biases, standard methods that estimate the difference between the population means yield unreliable inferences. Here we develop an inference method that is resilient to sampling biases and, in contrast to the standard approach, is able to control false positive errors under moderate bias levels. We demonstrate the method using synthetic and real biomarker data.
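To see why the standard approach breaks, compare an ordinary difference-of-means confidence interval with one widened by an assumed bound on the mean shift that sampling bias could induce. The widening heuristic, the bound B, and the data below are our illustration only, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(6)
g1 = rng.normal(1.0, 1.0, 120)   # assumed biomarker values, group 1
g2 = rng.normal(0.7, 1.0, 100)   # assumed biomarker values, group 2

diff = g1.mean() - g2.mean()
se = np.sqrt(g1.var(ddof=1) / len(g1) + g2.var(ddof=1) / len(g2))
z = 1.96                          # standard normal 97.5% quantile

B = 0.2  # ASSUMED bound on the mean shift induced by sampling bias
print(f"standard 95% CI      : [{diff - z*se:.2f}, {diff + z*se:.2f}]")
print(f"bias-widened interval: [{diff - z*se - B:.2f}, {diff + z*se + B:.2f}]")
```

If the widened interval still excludes zero, a nonzero effect is supported even under the posited bias level; the standard interval alone cannot make that claim.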