 xgboost


Assessing the Operational Viability of Foundation Models for Time Series Forecasting

arXiv.org Machine Learning

Time series forecasting drives operational decisions in areas like finance, transportation, and energy. While supervised learning approaches achieve strong performance, they require domain-specific training, feature engineering, and ongoing maintenance. Large-scale foundation models have recently emerged as a zero-shot alternative, avoiding task-specific training much like LLMs. In this work, we evaluate foundation models against standard supervised approaches. Rather than focusing solely on aggregate accuracy, we analyze performance across four operational regimes: periodic human-centric systems, physically constrained processes, stochastic financial markets, and heterogeneous demand forecasting. Our results characterize optimal deployment areas. Foundation models perform well in domains with transferable periodic structures and are efficient for cold-start or long-tail scenarios. Conversely, supervised specialists maintain higher precision in systems governed by strict physical constraints. In financial domains, newer foundation models are rapidly closing the performance gap with supervised specialists. We further quantify trade-offs in inference latency, data drift adaptability, and deployment constraints. Finally, we propose a Complexity Router that assigns each series to the optimal model class using empirical features. We demonstrate that this selective routing achieves higher accuracy and significantly lower inference costs compared to deploying a universal foundation model, providing a practical framework for balancing generalization and efficiency.


Proxy-Based Approximation of Shapley and Banzhaf Interactions

arXiv.org Machine Learning

Shapley and Banzhaf interactions capture the complex dynamics inherent in modern machine learning applications. However, current estimators for these higher-order interactions trade off between speed and accuracy. To overcome this limitation, we introduce ProxySHAP. ProxySHAP reconciles the high sample efficiency of tree-based proxy models with a principled path to consistency via residual correction. On a theoretical level, we derive a polynomial-time generalization of interventional TreeSHAP to compute exact interaction indices for tree ensembles, successfully bypassing exponential tree-depth dependencies in prior methods. Furthermore, we formally analyze the residual adjustment strategy, characterizing the specific conditions under which Maximum Sample Reuse (MSR) corrects proxy bias without its variance scaling exponentially with interaction size. Extensive benchmarking demonstrates that ProxySHAP sets a new state-of-the-art standard for approximation quality, including in large-scale applications with thousands of features. By achieving the lowest error in both small- and large-budget regimes, ProxySHAP significantly outperforms the prior best estimators ProxySPEX and KernelSHAP-IQ, while also delivering superior performance on downstream explainability tasks.
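For orientation, the pairwise interaction indices that such estimators approximate can be computed exactly by brute force on a very small game. The sketch below is not ProxySHAP; it enumerates every coalition, so its cost grows exponentially with the number of players. The function name `pairwise_interaction` and the toy value function are hypothetical, chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def pairwise_interaction(value, n, i, j, index="shapley"):
    """Exact pairwise interaction of players i and j in an n-player game.

    `value` maps a frozenset of player indices to a number.
    Brute force over all subsets of the remaining players (illustrative only).
    """
    others = [k for k in range(n) if k not in (i, j)]
    total = 0.0
    for size in range(len(others) + 1):
        if index == "shapley":
            # Shapley interaction weight: |S|! (n - |S| - 2)! / (n - 1)!
            weight = factorial(size) * factorial(n - size - 2) / factorial(n - 1)
        elif index == "banzhaf":
            # Banzhaf interaction weight: uniform over the 2^(n-2) subsets
            weight = 1.0 / 2 ** (n - 2)
        else:
            raise ValueError("index must be 'shapley' or 'banzhaf'")
        for subset in combinations(others, size):
            s = frozenset(subset)
            # Second-order discrete derivative of the value function at S
            delta = value(s | {i, j}) - value(s | {i}) - value(s | {j}) + value(s)
            total += weight * delta
    return total


def toy_game(coalition):
    """Hypothetical game: pays 1 only when players 0 and 1 are both present."""
    return 1.0 if {0, 1} <= coalition else 0.0


shapley_01 = pairwise_interaction(toy_game, 3, 0, 1, index="shapley")
banzhaf_01 = pairwise_interaction(toy_game, 3, 0, 1, index="banzhaf")
shapley_02 = pairwise_interaction(toy_game, 3, 0, 2, index="shapley")
```

In the toy game only players 0 and 1 interact, so the pair (0, 2) receives zero interaction under either index.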


Forecasting Oncology Demand Trends with Boosting-Based Bayesian Conjugate Models

arXiv.org Machine Learning

Accurate trend forecasting in healthcare time series is essential for planning and resource allocation. This paper proposes a Bayesian framework for predicting oncology demand trends, modeling weekly appointments as a Poisson process with a Gamma prior on the demand rate. To enhance adaptability and capture persistent directional patterns, we incorporate a residual-based boosting mechanism grounded in a Gamma-Log-Normal conjugate structure. This boosting approach allows the model to track both short- and long-term trend shifts while maintaining the analytical tractability of conjugate Bayesian updating. The methodology was evaluated on real oncology service data from Cariri, Ceará, Brazil, and compared against established baselines, including linear regression, ARIMA, naive forecasting, LSTM neural networks, and XGBoost. Results showed that the proposed model outperforms competing methods in trend detection accuracy, with gains in the percentage of correctly predicted directions of up to 38.25% relative to the second-best approach in some cases.
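The Poisson model with a Gamma prior mentioned above has a closed-form posterior, which is what keeps the updating analytically tractable. A minimal sketch of that standard conjugate update follows, assuming a shape/rate parameterization of the Gamma distribution. The prior values and weekly counts are made up for illustration, and the paper's residual-based boosting step is not reproduced here.

```python
def gamma_poisson_update(alpha, beta, counts):
    """Conjugate update for a Poisson rate with a Gamma(alpha, beta) prior.

    alpha is the shape and beta the rate of the Gamma distribution.
    Observing counts y_1..y_n gives the posterior
    Gamma(alpha + sum(y), beta + n).
    """
    return alpha + sum(counts), beta + len(counts)


def posterior_mean(alpha, beta):
    """Mean of a Gamma(shape=alpha, rate=beta) distribution."""
    return alpha / beta


# Hypothetical prior and weekly appointment counts (illustrative values only)
prior_alpha, prior_beta = 2.0, 1.0
weekly_counts = [3, 5, 4]

post_alpha, post_beta = gamma_poisson_update(prior_alpha, prior_beta, weekly_counts)
expected_rate = posterior_mean(post_alpha, post_beta)
```

Because the posterior is again a Gamma distribution, each new week of data can be folded in by calling the update again with the previous posterior as the prior.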


Robust Batch-Level Query Routing for Large Language Models under Cost and Capacity Constraints

arXiv.org Machine Learning

We study the problem of routing queries to large language models (LLMs) under cost, GPU-resource, and concurrency constraints. Prior per-query routing methods often fail to control batch-level cost, especially under non-uniform or adversarial batching. To address this, we propose a batch-level, resource-aware routing framework that jointly optimizes model assignment for each batch while respecting cost and model capacity limits. We further introduce a robust variant that accounts for uncertainty in predicted LLM performance, along with an offline instance allocation procedure that balances quality and throughput across multiple models. Experiments on two multi-task LLM benchmarks show that robustness improves accuracy by 1-14% over non-robust counterparts (depending on the performance estimator), batch-level routing outperforms per-query methods by up to 24% under adversarial batching, and optimized instance allocation yields additional gains of up to 3% compared to a non-optimized allocation, all while strictly satisfying the cost and GPU resource constraints.
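To make the batch-level formulation concrete, the sketch below assigns every query in a small batch to a model so that total predicted quality is maximized subject to a batch budget and per-model capacity limits. It uses exhaustive search, which is a stand-in chosen for clarity and not the optimization method of the paper; it is only feasible for tiny batches. The function name `route_batch` and all numbers are hypothetical.

```python
from itertools import product


def route_batch(quality, cost, capacity, budget):
    """Pick the best joint assignment of queries to models for one batch.

    quality[q][m]: predicted quality of model m on query q (assumed given)
    cost[m]:       cost of sending one query to model m
    capacity[m]:   maximum number of queries model m can take in this batch
    budget:        maximum total cost allowed for the batch
    Returns (assignment tuple, total quality), or (None, -inf) if infeasible.
    """
    n_queries, n_models = len(quality), len(cost)
    best_assignment, best_quality = None, float("-inf")
    # Enumerate every joint assignment; exponential in the batch size
    for assignment in product(range(n_models), repeat=n_queries):
        if sum(cost[m] for m in assignment) > budget:
            continue  # violates the batch-level cost constraint
        if any(assignment.count(m) > capacity[m] for m in range(n_models)):
            continue  # violates a per-model capacity constraint
        total = sum(quality[q][m] for q, m in enumerate(assignment))
        if total > best_quality:
            best_assignment, best_quality = assignment, total
    return best_assignment, best_quality


# Hypothetical batch: model 0 is strong but expensive, model 1 is cheap
quality = [[0.9, 0.5], [0.8, 0.6], [0.7, 0.65]]
cost = [3, 1]
capacity = [2, 3]
assignment, total_quality = route_batch(quality, cost, capacity, budget=5)
```

With this budget only one query can go to the expensive model, so the joint search sends it the query that benefits most, which a per-query rule without batch awareness would not guarantee.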


RFX-Fuse: Breiman and Cutler's Unified ML Engine + Native Explainable Similarity

arXiv.org Machine Learning

Breiman and Cutler's original Random Forest was designed as a unified ML engine -- not merely an ensemble predictor. Their implementation included classification, regression, unsupervised learning, proximity-based similarity, outlier detection, missing value imputation, and visualization -- capabilities that modern libraries like scikit-learn never implemented. RFX-Fuse (Random Forests X [X=compression] -- Forest Unified Learning and Similarity Engine) delivers Breiman and Cutler's complete vision with native GPU/CPU support. Modern ML pipelines require 5+ separate tools -- XGBoost for prediction, FAISS for similarity, SHAP for explanations, Isolation Forest for outliers, custom code for importance. RFX-Fuse provides a 1 to 2 model object alternative -- a single set of trees grown once. Novel Contributions: (1) Proximity Importance -- native explainable similarity: proximity measures that samples are similar; proximity importance explains why. (2) Dataset-specific imputation validation for general tabular data -- ranking imputation methods by how real the imputed data looks, without ground truth labels.
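The proximity-based similarity referred to above is classically defined as the fraction of trees in which two samples fall into the same leaf. The sketch below computes that matrix from a table of leaf indices and does not use or represent the RFX-Fuse API. The leaf indices are invented for illustration; in scikit-learn, for instance, a comparable table can be obtained from a fitted forest's `apply` method.

```python
def proximity_matrix(leaf_ids):
    """Random forest proximity from per-tree leaf assignments.

    leaf_ids[i][t] is the index of the leaf that sample i reaches in tree t.
    proximity[i][j] is the fraction of trees where samples i and j share a leaf.
    """
    n_samples = len(leaf_ids)
    n_trees = len(leaf_ids[0])
    proximity = [[0.0] * n_samples for _ in range(n_samples)]
    for i in range(n_samples):
        for j in range(i, n_samples):
            shared = sum(
                1 for t in range(n_trees) if leaf_ids[i][t] == leaf_ids[j][t]
            )
            # The matrix is symmetric, so fill both triangles at once
            proximity[i][j] = proximity[j][i] = shared / n_trees
    return proximity


# Hypothetical leaf indices for 3 samples across 3 trees
example_leaf_ids = [
    [1, 4, 2],
    [1, 4, 3],
    [2, 5, 3],
]
example_proximity = proximity_matrix(example_leaf_ids)
```

Samples 0 and 1 share a leaf in two of the three trees, so they come out as the most similar pair in this toy table.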



Supplementary Document

Neural Information Processing Systems

The pseudo-code of plugging our method into the vanilla BO is summarised in Algorithm 1. Therefore, our method is applicable to any other variants of BO in a plug-in manner. In this section, we present the proofs associated with the theoretical assertions from Section 2. Lemma 1. Assume the GP employs a stationary kernel ... Lemma 2. Given Lemma 1, determining ... Proposition 2. Leveraging Lemma 2, suppose ... Lemma 3. As per Srinivas et al., the optimization process in BO can be conceptualized as a sampling ... $\Pr\left[\, |f(x) - \mu(x)| \le \omega\sigma(x) \,\right] > \delta$ (24), where $\delta > 0$ signifies the confidence level adhered to by the UCB. This lemma is directly from Srinivas et al. The proof can be found therein. Theorem 1. Leveraging Corollary 1, when employing the termination method proposed in this paper, ... As discussed in Remark 2 of Section 2.2 in the main manuscript, we suggest initializing L-BFGS ... Different subplots are (a) our proposed method, (b) Naïve method, (c) Nguyen's method, (d) Lorenz's ...



Well-tuned Simple Nets Excel on Tabular Datasets

Neural Information Processing Systems

We empirically assess the impact of these regularization cocktails for MLPs in a large-scale empirical study comprising 40 tabular datasets and demonstrate that (i) well-regularized plain MLPs significantly outperform recent state-of-the-art specialized neural network architectures, and (ii) they even outperform strong traditional ML methods, such as XGBoost.