The Powers of Precision: Structure-Informed Detection in Complex Systems -- From Customer Churn to Seizure Onset

Santos, Augusto, Santos, Teresa, Rodrigues, Catarina, Moura, José M. F.

arXiv.org Machine Learning

Emergent phenomena -- onset of epileptic seizures, sudden customer churn, or pandemic outbreaks -- often arise from hidden causal interactions in complex systems. We propose a machine learning method for their early detection that addresses a core challenge: unveiling and harnessing a system's latent causal structure despite the data-generating process being unknown and partially observed. The method learns an optimal feature representation from a one-parameter family of estimators -- powers of the empirical covariance or precision matrix -- offering a principled way to tune in to the underlying structure driving the emergence of critical events. A supervised learning module then classifies the learned representation. We prove structural consistency of the family and demonstrate the empirical soundness of our approach on seizure detection and churn prediction, attaining competitive results in both. Beyond prediction, and toward explainability, we ascertain that the optimal covariance power exhibits evidence of good identifiability while capturing structural signatures, thus reconciling predictive performance with interpretable statistical structure.
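The one-parameter family of estimators described in the abstract can be sketched in a few lines. Below is a minimal, illustrative implementation (not the authors' code): a hypothetical `covariance_power_features` helper raises the empirical covariance matrix to a chosen power `alpha` via eigendecomposition, with negative `alpha` yielding powers of the precision matrix, and vectorizes the result as a feature representation for a downstream classifier.

```python
import numpy as np

def covariance_power_features(X, alpha):
    """Feature representation from the alpha-th power of the empirical
    covariance matrix (negative alpha yields precision-matrix powers).

    X     : (n_samples, n_channels) window of observations
    alpha : exponent of the one-parameter family
    """
    # Empirical covariance of the window (channels x channels).
    C = np.cov(X, rowvar=False)
    # Fractional matrix power via eigendecomposition; C is symmetric PSD,
    # so clip tiny negative eigenvalues caused by numerical noise.
    w, V = np.linalg.eigh(C)
    w = np.clip(w, 1e-12, None)
    C_alpha = (V * w**alpha) @ V.T
    # Vectorize the upper triangle as the learned representation.
    return C_alpha[np.triu_indices_from(C_alpha)]

# Toy usage: a 200-sample window from a 5-channel system.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
phi = covariance_power_features(X, alpha=-1.0)  # precision-matrix features
print(phi.shape)  # 5*(5+1)/2 = 15 features
```

In the paper's framework, `alpha` would be tuned by the supervised module; here it is simply a free parameter of the sketch.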


How Well Do LLMs Predict Human Behavior? A Measure of their Pretrained Knowledge

Gao, Wayne, Han, Sukjin, Liang, Annie

arXiv.org Machine Learning

Large language models (LLMs) are increasingly used in economics as predictive tools--both to generate synthetic responses in place of human subjects (Horton, 2023; Anthis et al., 2025), and to forecast economic outcomes directly (Hewitt et al., 2024a; Faria-e Castro and Leibovici, 2024; Chan-Lau et al., 2025). Their appeal in these roles is obvious: A pretrained LLM embeds a vast amount of information and can be deployed at negligible cost, often in settings where collecting new, domain-specific human data would be expensive or infeasible. What remains unclear is how to assess the quality of these predictions. This paper proposes a measure that quantifies the domain-specific value of LLMs in an interpretable unit: the amount of human data they substitute for. Specifically, we ask how much human data would be required for a conventional model trained on that data to match the predictive performance of the pretrained LLM in that domain.
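The proposed measure admits a simple operational sketch: train a conventional model on increasing amounts of human data and report the smallest sample size at which it matches the LLM's predictive performance. The helper below is an illustration under assumed choices (the function name `human_data_equivalent`, the logistic-regression baseline, and the grid of sample sizes are all assumptions, not the paper's estimator).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def human_data_equivalent(llm_accuracy, X_pool, y_pool, X_test, y_test,
                          sizes=(25, 50, 100, 200, 400)):
    """Smallest training-set size at which a conventional model matches the
    pretrained LLM's test accuracy in this domain; None if no size suffices."""
    for n in sizes:
        model = LogisticRegression(max_iter=1000).fit(X_pool[:n], y_pool[:n])
        if accuracy_score(y_test, model.predict(X_test)) >= llm_accuracy:
            return n
    return None

# Toy domain: a trivially learnable binary outcome stands in for human data.
rng = np.random.default_rng(1)
X = rng.standard_normal((1000, 2))
y = (X[:, 0] > 0).astype(int)
n_equiv = human_data_equivalent(0.9, X[:600], y[:600], X[600:], y[600:])
print(n_equiv)
```

A finer grid of sizes (or interpolation between grid points) would sharpen the estimate; the interpretable unit is the returned sample count.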


A brief note on learning problem with global perspectives

Befekadu, Getachew K.

arXiv.org Machine Learning

In this brief note, we consider the problem of learning in a dynamic-optimizing principal-agent setting, in which the agents are allowed to have global perspectives about the learning process, i.e., the ability to view things according to their relative importance or in their true relations, based on aggregated information shared by the principal. The principal, in turn, which exerts an influence on the learning process of the agents through the aggregation, is primarily tasked with solving a high-level optimization problem posed as an empirical-likelihood estimator under a conditional moment restrictions model that also accounts for information about the agents' out-of-sample predictive performances as well as a set of private datasets available only to the principal (e.g., see [1], [2], [3], [4] and [5] for further discussions on empirical likelihood methods with moment restrictions). Here, we provide a coherent mathematical argument for characterizing the learning process behind this abstract dynamic-optimizing principal-agent learning framework. Note that, owing to the inherent feedback behavior among the agents, the proposed learning framework offers notable advantages in terms of stability and consistency, even though neither the principal nor the agents need any knowledge of the sample distributions or the quality of each other's datasets. Finally, it is worth remarking that such a learning framework can provide new insights into collaborative learning problems with global perspectives that exploit the principal-agent setting (e.g., see [6], [7], [8] or [9] for related discussions), although we acknowledge that a number of conceptual and theoretical problems, such as small-sample properties, still need to be addressed.


Multi-task Modeling for Engineering Applications with Sparse Data

Comlek, Yigitcan, Krishnan, R. Murali, Ravi, Sandipp Krishnan, Moghaddas, Amin, Giorjao, Rafael, Eff, Michael, Samaddar, Anirban, Ramachandra, Nesar S., Madireddy, Sandeep, Wang, Liping

arXiv.org Machine Learning

Modern engineering and scientific workflows frequently require simultaneous prediction across related tasks and fidelity levels [1-6]. In such contexts, some outputs are scarce and expensive to obtain, while others are cheaper and more abundant. Multi-task Gaussian processes (MTGPs), also known as multi-output Gaussian processes, offer a principled Bayesian framework to exploit inter-task correlations, enabling knowledge sharing that improves predictive accuracy and reduces the demand for large high-fidelity datasets [7-9]. Over decades of development, MTGPs have been applied across diverse domains, including time series forecasting, multitask optimization, and multifidelity classification, demonstrating their broad utility wherever data cost asymmetries and cross-task dependencies are present [10-16]. The central motivation for MTGPs is to leverage dependencies among related tasks to enhance predictive quality when high-fidelity information is limited [17]. For example, predicting an airfoil's lift coefficient from limited, expensive high-fidelity computational fluid dynamics (CFD) simulations can benefit from correlating with sufficient low-fidelity simulations [3]. Recent work in joint multi-objective and multifidelity optimization has also utilized MTGPs to balance exploration and exploitation across tasks, improving predictive performance and decision-making by explicitly modeling relationships among outputs and fidelities [12].
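One common MTGP construction, the intrinsic coregionalization model (ICM), can be sketched in plain NumPy to make the knowledge-sharing mechanism concrete: the joint kernel factors as a task-covariance matrix times an input kernel, so dense low-fidelity data informs predictions for a sparsely observed high-fidelity task. The names `icm_predict`, `Bm`, and the toy two-fidelity dataset below are illustrative assumptions, not the paper's model.

```python
import numpy as np

def rbf(A, B, ls=1.0):
    """Squared-exponential kernel between row-vector inputs."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def icm_predict(X, t, y, Xs, ts, Bm, noise=1e-2):
    """ICM posterior mean: K((x,t),(x',t')) = Bm[t,t'] * k(x,x').

    X, t, y : training inputs, task indices, targets
    Xs, ts  : test inputs and task indices
    Bm      : (n_tasks, n_tasks) task-covariance matrix
    """
    K = Bm[np.ix_(t, t)] * rbf(X, X)
    Ks = Bm[np.ix_(ts, t)] * rbf(Xs, X)
    alpha = np.linalg.solve(K + noise * np.eye(len(y)), y)
    return Ks @ alpha

# Task 0: abundant low-fidelity data; task 1: three expensive high-fidelity points.
X0 = np.linspace(0.0, 6.0, 30)[:, None]
X1 = np.array([[0.5], [3.0], [5.5]])
X = np.vstack([X0, X1])
t = np.array([0] * 30 + [1] * 3)
y = np.concatenate([np.sin(X0).ravel(), np.sin(X1).ravel() + 0.1])
Bm = np.array([[1.0, 0.95], [0.95, 1.0]])   # strong inter-task correlation
Xs = np.array([[1.0], [2.0], [4.0]])
pred = icm_predict(X, t, y, Xs, np.array([1, 1, 1]), Bm)  # high-fidelity task
print(pred)
```

In practice `Bm`, the length scale, and the noise would be learned by marginal-likelihood maximization rather than fixed by hand; the sketch only shows how the task-covariance term routes information between fidelities.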


ROOFS: RObust biOmarker Feature Selection

Bakhmach, Anastasiia, Dufossé, Paul, Vaglio, Andrea, Monville, Florence, Greillier, Laurent, Barlési, Fabrice, Benzekry, Sébastien

arXiv.org Machine Learning

Feature selection (FS) is essential for biomarker discovery and for the analysis of biomedical datasets. However, challenges such as high-dimensional feature space, low sample size, multicollinearity, and missing values make FS non-trivial. Moreover, FS performance varies across datasets and predictive tasks. We propose roofs, a Python package available at https://gitlab.inria.fr/compo/roofs, designed to help researchers in the choice of FS method adapted to their problem. Roofs benchmarks multiple FS methods on the user's data and generates reports that summarize a comprehensive set of evaluation metrics, including downstream predictive performance estimated using optimism correction, stability, reliability of individual features, and true positive and false positive rates assessed on semi-synthetic data with a simulated outcome. We demonstrate the utility of roofs on data from the PIONeeR clinical trial, aimed at identifying predictors of resistance to anti-PD-(L)1 immunotherapy in lung cancer. The PIONeeR dataset contained 374 multi-source blood and tumor biomarkers from 435 patients. A reduced subset of 214 features was obtained through iterative variance inflation factor pre-filtering. Of the 34 FS methods gathered in roofs, we evaluated 23 in combination with 11 classifiers (253 models in total) and identified a filter based on the union of Benjamini-Hochberg false discovery rate-adjusted p-values from t-test and logistic regression as the optimal approach, outperforming other methods including the widely used LASSO. We conclude that comprehensive benchmarking with roofs has the potential to improve the robustness and reproducibility of FS discoveries and increase the translational value of clinical models.