Regression
Interpretable, multi-dimensional Evaluation Framework for Causal Discovery from observational i.i.d. Data
Velev, Georg, Lessmann, Stefan
Nonlinear causal discovery from observational data imposes strict identifiability assumptions on the formulation of structural equations utilized in the data generating process. The evaluation of structure learning methods under assumption violations requires a rigorous and interpretable approach, which quantifies both the structural similarity of the estimation with the ground truth and the capacity of the discovered graphs to be used for causal inference. Motivated by the lack of a unified performance assessment framework, we introduce an interpretable, six-dimensional evaluation metric, i.e., distance to optimal solution (DOS), which is specifically tailored to the field of causal discovery. Furthermore, this is the first research to assess the performance of structure learning algorithms from seven different families on an increasing percentage of non-identifiable, nonlinear causal patterns, inspired by real-world processes. Our large-scale simulation study, which incorporates seven experimental factors, shows that besides causal order-based methods, amortized causal discovery delivers results with comparatively high proximity to the optimal solution.
Kernel Stochastic Configuration Networks for Nonlinear Regression
Stochastic configuration networks (SCNs), a class of randomized learner models, are characterized by the way they assign random parameters under a supervisory mechanism, which yields the universal approximation property at the algorithmic level. This paper presents a kernel version of SCNs, termed KSCNs, aiming to enhance the model's representation learning capability and performance stability. The random bases of a built SCN model are used to span a reproducing kernel Hilbert space (RKHS), on which our proposed algorithm for constructing KSCNs is based. It is shown that the data distribution in the reconstructive space is favorable for solving regression problems and that the proposed KSCN learner models hold the universal approximation property. Three benchmark datasets, including two industrial datasets, are used in this study for performance evaluation. Experimental comparisons against existing solutions demonstrate that the proposed KSCN remarkably outperforms the original SCNs and some typical kernel methods on nonlinear regression problems in terms of learning performance, model stability, and robustness with respect to the kernel parameter settings.
Evidential time-to-event prediction with calibrated uncertainty quantification
Huang, Ling, Xing, Yucheng, Mishra, Swapnil, Denoeux, Thierry, Feng, Mengling
Time-to-event analysis provides insights into clinical prognosis and treatment recommendations. However, this task is more challenging than standard regression problems due to the presence of censored observations. Additionally, the lack of confidence assessment, model robustness, and prediction calibration raises concerns about the reliability of predictions. To address these challenges, we propose an evidential regression model specifically designed for time-to-event prediction. The proposed model quantifies both epistemic and aleatory uncertainties using Gaussian Random Fuzzy Numbers and belief functions, providing clinicians with uncertainty-aware survival time predictions. The model is trained by minimizing a generalized negative log-likelihood function accounting for data censoring. Experimental evaluations using simulated datasets with different data distributions and censoring conditions, as well as real-world datasets across diverse clinical applications, demonstrate that our model delivers both accurate and reliable performance, outperforming state-of-the-art methods. These results highlight the potential of our approach for enhancing clinical decision-making in survival analysis.
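To make the role of censoring in the training loss concrete, here is a minimal sketch of a standard right-censored negative log-likelihood. It assumes a plain Gaussian model of the event time with parameters `mu` and `sigma`, which is a simplification chosen for illustration and not the Gaussian Random Fuzzy Number formulation of the paper; all function names are hypothetical.

```python
import math

def normal_logpdf(t, mu, sigma):
    """Log density of Normal(mu, sigma) at t."""
    z = (t - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def normal_logsf(t, mu, sigma):
    """Log survival function log P(T > t) for Normal(mu, sigma).

    Numerically naive: for t far above mu the survival probability
    underflows to zero and the logarithm fails.
    """
    z = (t - mu) / sigma
    return math.log(0.5 * math.erfc(z / math.sqrt(2.0)))

def censored_nll(times, events, mu, sigma):
    """Average negative log-likelihood with right censoring.

    events[i] == 1: the event was observed at times[i] (use the density).
    events[i] == 0: the subject was censored at times[i], so we only know
                    that the event happens later (use the survival function).
    """
    total = 0.0
    for t, observed in zip(times, events):
        if observed:
            total -= normal_logpdf(t, mu, sigma)
        else:
            total -= normal_logsf(t, mu, sigma)
    return total / len(times)
```

A censored observation contributes only the probability of surviving past the censoring time, which is why it carries less information than an observed event.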
Finite Sample Analysis of Tensor Decomposition for Learning Mixtures of Linear Systems
We study the problem of learning mixtures of linear dynamical systems (MLDS) from input-output data. This mixture setting allows us to leverage observations from related dynamical systems to improve the estimation of individual models. Building on spectral methods for mixtures of linear regressions, we propose a moment-based estimator that uses tensor decomposition to estimate the impulse response of component models of the mixture. The estimator improves upon existing tensor decomposition approaches for MLDS by utilizing the entire length of the observed trajectories. We provide sample complexity bounds for estimating MLDS in the presence of noise, in terms of both $N$ (number of trajectories) and $T$ (trajectory length), and demonstrate the performance of our estimator through simulations.
Data-Driven Transfer Learning Framework for Estimating Turning Movement Counts
Ma, Xiaobo, Noh, Hyunsoo, Hatch, Ryan, Tokishi, James, Wang, Zepu
Urban transportation networks are vital for the efficient movement of people and goods, necessitating effective traffic management and planning. An integral part of traffic management is understanding the turning movement counts (TMCs) at intersections. Accurate TMCs are crucial for traffic signal control, congestion mitigation, and road safety. In general, TMCs are obtained using physical sensors installed at intersections, but this approach can be cost-prohibitive and technically challenging, especially for cities with extensive road networks. Recent advancements in machine learning and data-driven approaches have offered promising alternatives for estimating TMCs. However, traffic patterns can vary significantly across different intersections due to factors such as road geometry, traffic signal settings, and local driver behaviors. This domain discrepancy limits the generalizability and accuracy of machine learning models when applied to new or unseen intersections. In response to these limitations, this research proposes a novel framework leveraging transfer learning (TL) to estimate TMCs at intersections by using traffic controller event-based data, road infrastructure data, and point-of-interest (POI) data. Evaluated on 30 intersections in Tucson, Arizona, the proposed TL model was compared with eight state-of-the-art regression models and achieved the lowest Mean Absolute Error and Root Mean Square Error.
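The two error measures named in the abstract can be computed as in the following minimal sketch. The function names and the turning-count values in the usage note are made up for illustration and do not come from the study.

```python
import math

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and estimated counts."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def root_mean_square_error(y_true, y_pred):
    """Square root of the average squared difference; penalizes large misses more."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)
```

For hypothetical observed counts `[10, 20, 30]` and estimates `[12, 18, 33]`, the errors are 2, -2 and 3, so the MAE is 7/3 and the RMSE is the square root of 17/3.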
A Novel Methodology in Credit Spread Prediction Based on Ensemble Learning and Feature Selection
Shao, Yu, Bai, Jiawen, Hou, Yingze, Zhou, Xia'an, Pan, Zhanhao
The credit spread is a key indicator in bond investments, offering valuable insights for fixed-income investors to devise effective trading strategies. This study proposes a novel credit spread forecasting model leveraging ensemble learning techniques. To enhance predictive accuracy, a feature selection method based on mutual information is incorporated. Empirical results demonstrate that the proposed methodology delivers superior accuracy in credit spread predictions. Additionally, we present a forecast of future credit spread trends using current data, providing actionable insights for investment decision-making. Credit spread has long been a critical focus for investors, particularly in the context of investment-grade corporate bonds, which have garnered even greater attention.
Experimental Machine Learning with Classical and Quantum Data via NMR Quantum Kernels
Kernel methods map data into high-dimensional spaces, enabling linear algorithms to learn nonlinear functions without explicitly storing the feature vectors. Quantum kernel methods promise efficient learning by encoding feature maps into exponentially large Hilbert spaces inherent in quantum systems. In this work we implement quantum kernels on a 10-qubit star-topology register in a nuclear magnetic resonance (NMR) platform. We experimentally encode classical data in the evolution of multiple quantum coherence orders using data-dependent unitary transformations and then demonstrate one-dimensional regression and two-dimensional classification tasks. By extending the register to a double-layered star configuration, we propose an extended quantum kernel to handle non-parametrized operator inputs. By numerically simulating the extended quantum kernel, we show classification of entangling and nonentangling unitaries. These results confirm that quantum kernels exhibit strong capabilities in classical as well as quantum machine learning tasks.
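The first sentence of the abstract, that a linear algorithm can learn a nonlinear function through a kernel without storing feature vectors, can be illustrated with classical kernel ridge regression on a one-dimensional task. This is a sketch under stated assumptions: a classical Gaussian (RBF) kernel stands in for the quantum kernel measured on the NMR register, and the names `rbf_kernel`, `fit_kernel_ridge`, `predict`, `gamma` and `lam` are illustrative choices, not taken from the paper.

```python
import math

def rbf_kernel(x, z, gamma=1.0):
    """Classical Gaussian kernel; a stand-in for an experimentally measured kernel."""
    return math.exp(-gamma * (x - z) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_kernel_ridge(xs, ys, gamma=1.0, lam=1e-3):
    """Return dual coefficients alpha solving (K + lam * I) alpha = y."""
    n = len(xs)
    K = [[rbf_kernel(xs[i], xs[j], gamma) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(K, ys)

def predict(xs_train, alpha, x, gamma=1.0):
    """Prediction is a kernel-weighted sum over the training points."""
    return sum(a * rbf_kernel(xt, x, gamma) for a, xt in zip(alpha, xs_train))
```

Only kernel values between pairs of inputs enter the fit and the prediction, so swapping `rbf_kernel` for kernel values obtained from another source leaves the rest of the procedure unchanged.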
Unlocking FedNL: Self-Contained Compute-Optimized Implementation
Burlachenko, Konstantin, Richtárik, Peter
Federated Learning (FL) is an emerging paradigm that enables intelligent agents to collaboratively train Machine Learning (ML) models in a distributed manner, eliminating the need for sharing their local data. The recent work (arXiv:2106.02969) introduces a family of Federated Newton Learn (FedNL) algorithms, marking a significant step towards applying second-order methods to FL and large-scale optimization. However, the reference FedNL prototype exhibits three serious practical drawbacks: (i) it requires 4.8 hours to launch a single experiment on a server-grade workstation; (ii) the prototype only simulates a multi-node setting; (iii) integrating the prototype into resource-constrained applications is challenging. To bridge the gap between theory and practice, we present a self-contained implementation of FedNL, FedNL-LS, and FedNL-PP for single-node and multi-node settings. Our work resolves the aforementioned issues and reduces the wall clock time by a factor of 1000. With this, FedNL outperforms alternatives for training logistic regression in the single-node setting (CVXPY, arXiv:1603.00943) and in the multi-node setting (Apache Spark, arXiv:1505.06807; Ray/Scikit-Learn, arXiv:1712.05889). Finally, we propose two practice-oriented compressors for FedNL, adaptive TopLEK and cache-aware RandSeqK, which are compatible with the theory of FedNL.
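The abstract does not define TopLEK or RandSeqK. As background only, the sketch below shows the generic Top-K sparsifier that compressor names of this kind usually build on; treating it as related to TopLEK is an assumption based on the name, and the function `top_k` is hypothetical rather than part of the FedNL implementation.

```python
def top_k(vector, k):
    """Keep the k largest-magnitude entries of a vector and zero out the rest.

    Generic Top-K sparsification; NOT the adaptive TopLEK compressor
    proposed in the paper, whose definition is not given in the abstract.
    """
    order = sorted(range(len(vector)), key=lambda i: abs(vector[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(vector)]
```

For example, `top_k([0.1, -3.0, 2.0, 0.5], 2)` retains only `-3.0` and `2.0`, so only two values and their positions would need to be communicated.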
TorchSISSO: A PyTorch-Based Implementation of the Sure Independence Screening and Sparsifying Operator for Efficient and Interpretable Model Discovery
Muthyala, Madhav, Sorourifar, Farshud, Paulson, Joel A.
First principles models, derived from fundamental physical laws, have been instrumental in the development of scientific theories and technological systems. For example, the Navier-Stokes equation offers a comprehensive description of fluid flow, enabling predictions of complex behaviors in everything from blood flow [1] to weather patterns [2]. Traditionally, this pursuit has relied on the extensive expertise of domain specialists, requiring trial and error to identify features and model structures that fit the observations. In recent years, the landscape of scientific inquiry has been transformed by the availability of machine learning frameworks, such as neural networks, support vector machines, and Gaussian processes, which offer a powerful alternative for deriving predictive models [3]. These data-driven regression methods are often complex, do not typically generalize outside of the training set, and provide limited insights into the underlying physics. For instance, while these models may be trained to accurately predict the Reynolds number, they cannot capture the competitive nature between inertial and viscous forces in fluid flow. The only data-driven modeling framework that can provide insights comparable to first principles models, to the best of our knowledge, is symbolic regression (SR) [4, 5, 6].
Derivative-Based MIR Spectroscopy for Blood Glucose Estimation Using PCA-Driven Regression Models
Mansourlakouraj, Saeed, Barati, Hadi, Fardmanesh, Mehdi
In this study, we present two methods, Threshold-Based Derivative (TBD) and Adaptive Derivative Peak Detection (ADPD), that enhance the accuracy of learning models for blood glucose estimation using Mid-Infrared (MIR) spectroscopy. Both methods improve model accuracy by integrating absorbance data and its derivative with critical points. Blood samples were characterized with Fourier Transform Infrared (FTIR) spectroscopy and advanced preprocessing steps. The learning models were Ridge Regression and Support Vector Regression (SVR), evaluated using Leave-One-Out Cross-Validation. The results show that TBD and ADPD significantly outperform the baseline methods. For SVR, TBD increased the R² score by around 27% and ADPD increased it by around 10%. For Ridge Regression, the corresponding values were between 36% and 24%. In addition, TBD and ADPD achieve lower error rates and improved clinical accuracy compared with conventional methods, as validated through Clarke and Parkes Error Grid Analysis.
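The evaluation protocol named in the abstract, Ridge Regression scored with Leave-One-Out Cross-Validation and the R² score, can be sketched as follows. The sketch assumes a single input feature with an unpenalized intercept to stay short; the study itself works with spectral features, and every function name here is hypothetical.

```python
def fit_ridge_1d(xs, ys, lam):
    """One-feature ridge regression with an unpenalized intercept."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / (sxx + lam)  # lam > 0 shrinks the slope toward zero
    return slope, y_mean - slope * x_mean

def loocv_predictions(xs, ys, lam):
    """Predict each sample from a model fitted on all the other samples."""
    preds = []
    for i in range(len(xs)):
        x_train = xs[:i] + xs[i + 1:]
        y_train = ys[:i] + ys[i + 1:]
        slope, intercept = fit_ridge_1d(x_train, y_train, lam)
        preds.append(slope * xs[i] + intercept)
    return preds

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

Because every prediction is made for a sample the model never saw during fitting, the resulting R² reflects out-of-sample accuracy, which matters when only a small number of samples is available.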