 Regression


BAST: Bayesian Additive Regression Spanning Trees for Complex Constrained Domain

Neural Information Processing Systems

Nonparametric regression on complex domains has been a challenging task because most existing methods, such as ensemble models based on binary decision trees, are not designed to account for intrinsic geometries and domain boundaries. This article proposes a Bayesian additive regression spanning trees (BAST) model for nonparametric regression on manifolds, with an emphasis on complex constrained domains or irregularly shaped spaces embedded in Euclidean spaces. Our model uses a random spanning tree manifold partition model as each weak learner, which is capable of capturing any irregularly shaped spatially contiguous partitions while respecting intrinsic geometries and domain boundary constraints. Equipped with a soft prediction scheme, BAST is demonstrated to significantly outperform other competing methods in simulation experiments and in an application to the chlorophyll data in the Aral Sea, due to its strong local adaptivity to different levels of smoothness.


Reviews: High Dimensional Linear Regression using Lattice Basis Reduction

Neural Information Processing Systems

The paper presents a novel method of exactly recovering a vector of coefficients in high-dimensional linear regression, with high probability as the dimension goes to infinity. The method assumes that the correct coefficients come from a finite discrete set of bounded rational values, but it does not - as is commonplace - assume that the coefficient vector is sparse. To achieve this, the authors extend a classical algorithm for lattice basis reduction. Crucially, this approach does not require the sample size to grow with the dimension, thus in certain cases the algorithm is able to recover the exact coefficient vector from just a single sample (with the dimension sufficiently large). A novel connection between high-dimensional linear regression and lattice basis reduction is the main strength of the paper.


Reviews: Efficient Sublinear-Regret Algorithms for Online Sparse Linear Regression with Limited Observation

Neural Information Processing Systems

The paper considers the online sparse regression problem introduced by Kale (COLT'14), in which the online algorithm can only observe a subset of k features of each data point and has to sequentially predict a label based only on this limited observation (it can thus only use a sparse predictor for each prediction). Without further assumptions, this problem has recently been shown to be computationally hard by Foster et al. (ALT'16). To circumvent this hardness, the authors assume a stochastic i.i.d. setting. The results are not particularly exciting, but they do give a nice counter to the recent computational impossibility result of Foster et al., in a setting where the data is i.i.d. and well-specified by a k-sparse vector. One of the main things I was missing in the paper is a proper discussion relating its setup, assumptions and results to the literature on sparse recovery / compressed sensing / sparse linear regression.


Reviews: Horizon-Independent Minimax Linear Regression

Neural Information Processing Systems

The problem of online linear regression is considered from an individual sequence perspective, where the aim is to control the square loss predictive regret with respect to the best linear predictor $\theta^\top x_t$ simultaneously for every sequence of covariate vectors $x_t \in \mathbb{R}^d$ and outcomes $y_t \in \mathbb{R}$ in some constraint set. This is naturally formulated as a sequential game between the forecaster and an adversarial environment. In previous work [1], this problem was addressed in the "fixed-design" case, where the horizon $T$ and the sequence of covariate vectors $x_1, \dots, x_T$ are known in advance. The exact minimax strategy (MMS) was introduced and shown to be minimax optimal under natural constraint sets on the label sequence (such as ellipse-constrained labels). The MMS strategy consists in some form of least squares, but where the inverse cumulative covariance matrix $\Pi_t^{-1}$ is replaced by a shrunk version $P_t$ that takes future instances into account.


Systematic Feature Design for Cycle Life Prediction of Lithium-Ion Batteries During Formation

arXiv.org Artificial Intelligence

Accurate lifetime prediction of lithium-ion batteries accelerates battery optimization and improves safety [1-4]. Although this task is challenging due to complicated and convolved degradation mechanisms, various studies have demonstrated the potential of data-driven approaches [5-13], physics-based approaches [14-18], and hybrid approaches [19-26]. For accurate battery health monitoring, diagnostic techniques such as Differential Voltage Fitting (DVF) [27-30], Incremental Capacity Analysis (ICA) [31, 32], Electrochemical Impedance Spectroscopy (EIS) [10, 33-35], and Hybrid Pulse Power Characterization (HPPC) [36, 37] were developed for physics-based feature extraction during battery operation. Further optimization of these diagnostic techniques includes novel State of Health (SoH) feature development [38-41] and diagnostic time reduction [42, 43]. Compared to the extensive research on lifetime prediction during operation, there have been few studies on lifetime prediction during the manufacturing process (i.e., extreme early cycle life prediction) because of the limited availability of public manufacturing data. In fact, the cycle life can vary greatly based on the protocol used during formation, in which a passivation layer of Solid Electrolyte Interphase (SEI) is rapidly formed on the anode to limit further degradation during use. For example, Weng et al. [44] showed that the Nickel Manganese Cobalt (NMC)/graphite pouch cells with the fast formation protocol proposed by Wood et al. [45, 46] had on average 25% longer cycle lives than the pouch cells with a baseline formation protocol when aging the cells at both room temperature and high temperature (45


Distributionally Robust Clustered Federated Learning: A Case Study in Healthcare

arXiv.org Artificial Intelligence

In this paper, we address the challenge of heterogeneous data distributions in cross-silo federated learning by introducing a novel algorithm, which we term Cross-silo Robust Clustered Federated Learning (CS-RCFL). Our approach leverages the Wasserstein distance to construct ambiguity sets around each client's empirical distribution that capture possible distribution shifts in the local data, enabling evaluation of worst-case model performance. We then propose a model-agnostic integer fractional program to determine the optimal distributionally robust clustering of clients into coalitions so that possible biases in the local models caused by statistically heterogeneous client datasets are avoided, and analyze our method for linear and logistic regression models. Finally, we discuss a federated learning protocol that ensures the privacy of client distributions, a critical consideration, for instance, when clients are healthcare institutions. We evaluate our algorithm on synthetic and real-world healthcare data.


Shap-Select: Lightweight Feature Selection Using SHAP Values and Regression

arXiv.org Artificial Intelligence

Feature selection is an essential process in machine learning, especially when dealing with high-dimensional datasets. It helps reduce the complexity of machine learning models, improve performance, mitigate overfitting, and decrease computation time. This paper presents a novel feature selection framework, shap-select. The framework conducts a linear or logistic regression of the target on the Shapley values of the features, on the validation set, and uses the signs and significance levels of the regression coefficients to implement an efficient heuristic for feature selection in tabular regression and classification tasks. We evaluate shap-select on the Kaggle credit card fraud dataset, demonstrating its effectiveness compared to established methods such as Recursive Feature Elimination (RFE), HISEL (a mutual information-based feature selection method), Boruta and a simpler Shapley value-based method. Our findings show that shap-select combines interpretability, computational efficiency, and performance, offering a robust solution for feature selection.


Efficient inference for time-varying behavior during learning

Neural Information Processing Systems

The process of learning new behaviors over time is a problem of great interest in both neuroscience and artificial intelligence. However, most standard analyses of animal training data either treat behavior as fixed or track only coarse performance statistics (e.g., accuracy, bias), providing limited insight into the evolution of the policies governing behavior. To overcome these limitations, we propose a dynamic psychophysical model that efficiently tracks trial-to-trial changes in behavior over the course of training. Our model consists of a dynamic logistic regression model, parametrized by a set of time-varying weights that express dependence on sensory stimuli as well as task-irrelevant covariates, such as stimulus, choice, and answer history. To illustrate performance, we apply our method to psychophysical data from both rats and human subjects learning a delayed sensory discrimination task.
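To make the model structure concrete, here is a minimal generative sketch of a dynamic logistic regression. It is an illustration under stated assumptions, not the authors' inference method: the weights are assumed to drift as a Gaussian random walk with step size `sigma`, the covariates are reduced to the current stimulus and the previous choice, and the names `simulate_session` and `sigmoid` are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Logistic function mapping a real number to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def simulate_session(n_trials, sigma, seed=0):
    """Simulate binary choices from a logistic model with drifting weights.

    Assumption: each weight follows an independent Gaussian random walk
    with standard deviation `sigma` per trial.
    Returns a list of (covariates, weights, p_right, choice) per trial.
    """
    rng = random.Random(seed)
    w = [0.0, 0.0]        # weights on [stimulus, previous choice]
    prev_choice = 0.0     # no choice history before the first trial
    trials = []
    for _ in range(n_trials):
        stimulus = rng.choice([-1.0, 1.0])
        x = [stimulus, prev_choice]
        # Choice probability depends on the weights at this trial only.
        p_right = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        choice = 1 if rng.random() < p_right else 0
        trials.append((x, list(w), p_right, choice))
        # Encode the choice as +/-1 so it can serve as a history covariate.
        prev_choice = 1.0 if choice == 1 else -1.0
        # Weights drift between trials (assumed random-walk dynamics).
        w = [wi + rng.gauss(0.0, sigma) for wi in w]
    return trials
```

Fitting such a model means recovering the weight trajectories from the observed covariates and choices; the sketch only shows the forward (generative) direction.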


Doubly Robust Bayesian Inference for Non-Stationary Streaming Data with $\beta$-Divergences

Neural Information Processing Systems

We present the very first robust Bayesian Online Changepoint Detection algorithm through General Bayesian Inference (GBI) with $\beta$-divergences. The resulting inference procedure is doubly robust for both the predictive and the changepoint (CP) posterior, with linear time and constant space complexity. We provide a construction for exponential models and demonstrate it on the Bayesian Linear Regression model. In so doing, we make two additional contributions: Firstly, we make GBI scalable using Structural Variational approximations that are exact as $\beta \to 0$. Secondly, we give a principled way of choosing the divergence parameter $\beta$ by minimizing expected predictive loss online.


Horizon-Independent Minimax Linear Regression

Neural Information Processing Systems

We consider online linear regression: at each round, an adversary reveals a covariate vector, the learner predicts a real value, the adversary reveals a label, and the learner suffers the squared prediction error. The aim is to minimize the difference between the cumulative loss and that of the linear predictor that is best in hindsight. Previous work demonstrated that the minimax optimal strategy is easy to compute recursively from the end of the game; this requires the entire sequence of covariate vectors in advance. We show that, once provided with a measure of the scale of the problem, we can invert the recursion and play the minimax strategy without knowing the future covariates. Further, we show that this forward recursion remains optimal even against adaptively chosen labels and covariates, provided that the adversary adheres to a set of constraints that prevent misrepresentation of the scale of the problem.
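The game protocol and the regret it defines can be sketched in a few lines. This is a minimal illustration and not the minimax strategy studied in the paper: the learner below is plain online ridge regression with scalar covariates, swapped in only to show the round structure. The names `online_ridge_1d` and `regret` and the regularizer `lam` are hypothetical.

```python
def online_ridge_1d(xs, ys, lam=1.0):
    """Play the online regression game with scalar covariates.

    In each round the covariate x is revealed, the learner predicts,
    and only then is the label y revealed. Returns the predictions.
    """
    sxx, sxy = lam, 0.0   # regularized running sums of x*x and x*y
    preds = []
    for x, y in zip(xs, ys):
        theta = sxy / sxx          # ridge estimate from past rounds only
        preds.append(theta * x)    # prediction is made before seeing y
        sxx += x * x               # label revealed: update statistics
        sxy += x * y
    return preds


def regret(xs, ys, preds):
    """Cumulative squared loss minus that of the best linear predictor in hindsight."""
    learner_loss = sum((p - y) ** 2 for p, y in zip(preds, ys))
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    theta_star = sxy / sxx if sxx > 0 else 0.0   # least-squares fit on all rounds
    best_loss = sum((theta_star * x - y) ** 2 for x, y in zip(xs, ys))
    return learner_loss - best_loss


xs = [1.0, 1.0]
ys = [1.0, 1.0]
preds = online_ridge_1d(xs, ys)
print(preds, regret(xs, ys, preds))   # prints [0.0, 0.5] 1.25
```

In this two-round example the best predictor in hindsight incurs zero loss, so the regret equals the learner's own cumulative loss.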