 Regression


Local minima of the empirical risk in high dimension: General theorems and convex examples

arXiv.org Machine Learning

We consider a general model for high-dimensional empirical risk minimization whereby the data $\mathbf{x}_i$ are $d$-dimensional isotropic Gaussian vectors, the model is parametrized by $\mathbf{\Theta}\in\mathbb{R}^{d\times k}$, and the loss depends on the data via the projection $\mathbf{\Theta}^\mathsf{T}\mathbf{x}_i$. This setting covers as special cases classical statistical methods (e.g. multinomial regression and other generalized linear models), but also two-layer fully connected neural networks with $k$ hidden neurons. We use the Kac-Rice formula from Gaussian process theory to derive a bound on the expected number of local minima of this empirical risk, under the proportional asymptotics in which $n,d\to\infty$, with $n\asymp d$. Via Markov's inequality, this bound allows us to determine the positions of these minimizers (with exponential deviation bounds) and hence derive sharp asymptotics on the estimation and prediction error. In this paper, we apply our characterization to convex losses, where high-dimensional asymptotics were not (in general) rigorously established for $k\ge 2$. We show that our approach is tight and allows us to prove previously conjectured results. In addition, we characterize the spectrum of the Hessian at the minimizer. A companion paper applies our general result to non-convex examples.
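
The Markov step mentioned in the abstract can be spelled out as follows. This is only a sketch of the standard argument: the notation $Z_n(A)$ (the number of local minima falling in a region $A$ of parameter space) and the constant $c>0$ are assumptions introduced here, and the exponential bound stands in for whatever the Kac-Rice computation yields in the paper.

```latex
% Z_n(A): number of local minima of the empirical risk inside a region A
% (notation assumed for illustration, not taken from the abstract).
% First inequality: Markov's inequality for a nonnegative integer-valued count.
% Second inequality: a hypothetical Kac-Rice bound with some constant c > 0.
\mathbb{P}\bigl(Z_n(A) \ge 1\bigr) \;\le\; \mathbb{E}\bigl[Z_n(A)\bigr] \;\le\; e^{-c\,n}
\quad\Longrightarrow\quad
\mathbb{P}\bigl(\text{no local minimum lies in } A\bigr) \;\ge\; 1 - e^{-c\,n}.
```

Regions where the expected count is exponentially small are thus excluded with high probability, which is how a bound on an expectation localizes the minimizers.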


A Unified Comparative Study with Generalized Conformity Scores for Multi-Output Conformal Regression

arXiv.org Machine Learning

Conformal prediction provides a powerful framework for constructing distribution-free prediction regions with finite-sample coverage guarantees. While extensively studied in univariate settings, its extension to multi-output problems presents additional challenges, including complex output dependencies and high computational costs, and remains relatively underexplored. In this work, we present a unified comparative study of nine conformal methods with different multivariate base models for constructing multivariate prediction regions within the same framework. This study highlights their key properties while also exploring the connections between them. Additionally, we introduce two novel classes of conformity scores for multi-output regression that generalize their univariate counterparts. These scores ensure asymptotic conditional coverage while maintaining exact finite-sample marginal coverage. One class is compatible with any generative model, offering broad applicability, while the other is computationally efficient, leveraging the properties of invertible generative models. Finally, we conduct a comprehensive empirical evaluation across 13 tabular datasets, comparing all the multi-output conformal methods explored in this work. To ensure a fair and consistent comparison, all methods are implemented within a unified code base.
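
For readers unfamiliar with the mechanics, the following is a minimal sketch of split conformal prediction for a two-output regression problem. It is not one of the nine methods or the two new score classes from the paper: the Euclidean-distance score `l2_score`, the toy calibration data, and all function names are assumptions chosen for illustration. Larger scores here mean a worse fit.

```python
import math

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of the calibration scores."""
    n = len(scores)
    # Use the ceil((n + 1) * (1 - alpha))-th smallest score.
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: the region is unbounded.
        return math.inf
    return sorted(scores)[rank - 1]

def l2_score(y, y_hat):
    """Hypothetical score: Euclidean distance between true and predicted outputs."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)))

def in_region(y, y_hat, radius):
    """The prediction region is a ball of the given radius around y_hat."""
    return l2_score(y, y_hat) <= radius

# Toy calibration set of (true output, model prediction) pairs; values are made up.
calibration = [
    ((1.0, 2.0), (1.1, 1.8)),
    ((0.5, 0.5), (0.4, 0.7)),
    ((2.0, 1.0), (2.3, 1.1)),
    ((1.5, 1.5), (1.5, 1.2)),
    ((0.0, 1.0), (0.2, 0.9)),
    ((3.0, 2.0), (2.6, 2.1)),
    ((1.0, 0.0), (1.0, 0.1)),
]
scores = [l2_score(y, y_hat) for y, y_hat in calibration]
radius = conformal_quantile(scores, alpha=0.25)
print("radius of the 75% prediction ball:", radius)
```

Under exchangeability of calibration and test points, a region built this way covers the true output with probability at least `1 - alpha`, marginally. The shape of the region (a ball here) is entirely determined by the choice of score, which is the design axis the study compares.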


Supervised Similarity for High-Yield Corporate Bonds with Quantum Cognition Machine Learning

arXiv.org Machine Learning

We investigate the application of quantum cognition machine learning (QCML), a novel paradigm for both supervised and unsupervised learning tasks rooted in the mathematical formalism of quantum theory, to distance metric learning in corporate bond markets. Compared to equities, corporate bonds are relatively illiquid and both trade and quote data in these securities are relatively sparse. Thus, a measure of distance/similarity among corporate bonds is particularly useful for a variety of practical applications in the trading of illiquid bonds, including the identification of similar tradable alternatives, pricing securities with relatively few recent quotes or trades, and explaining the predictions and performance of ML models based on their training data. Previous research has explored supervised similarity learning based on classical tree-based models in this context; here, we explore the application of the QCML paradigm for supervised distance metric learning in the same context, showing that it outperforms classical tree-based models in high-yield (HY) markets, while giving comparable or better performance (depending on the evaluation metric) in investment grade (IG) markets.


Carelessness Detection using Performance Factor Analysis: A New Operationalization with Unexpectedly Different Relationship to Learning

arXiv.org Artificial Intelligence

Detection of carelessness in digital learning platforms has relied on the contextual slip model, which leverages conditional probability and Bayesian Knowledge Tracing (BKT) to identify careless errors, where students make mistakes despite having the knowledge. However, this model cannot effectively assess carelessness in questions tagged with multiple skills due to the use of conditional probability. This limitation narrows the scope within which the model can be applied. Thus, we propose a novel model, the Beyond-Knowledge Feature Carelessness (BKFC) model. The model detects careless errors using performance factor analysis (PFA) and behavioral features distilled from log data, controlling for knowledge when detecting carelessness. We applied the BKFC to detect carelessness in data from middle school students playing a learning game on decimal numbers and operations. We conducted analyses comparing the careless errors detected using contextual slip to the BKFC model. Unexpectedly, careless errors identified by these two approaches did not align. We found students' post-test performance was (consistent with past results) positively associated with the carelessness detected using the contextual slip model, while negatively associated with the carelessness detected using the BKFC model. These results highlight the complexity of carelessness and underline a broader challenge in operationalizing carelessness and careless errors.

Academic discussions of carelessness in classrooms date back to the 1950s [1]. Often viewed as the result of ineffective self-regulation, carelessness is thought to occur when students commit hurried or impulsive behaviors that result in mistakes on problems that could have been answered correctly. By distinguishing mistakes made due to carelessness from those caused by other factors, such as lack of knowledge, adaptive instruction can be provided to engage or reengage students in the effective use of self-regulation during the process of problem-solving. In the last several decades, two streams of work have run in parallel to investigate carelessness and detect careless behaviors.
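
To illustrate why PFA copes with items tagged with several skills, here is a minimal sketch of the standard PFA logistic model, in which each tagged skill contributes additively to the log-odds of a correct answer. It does not reproduce the BKFC model, whose behavioral features and detection rule are not given in the excerpt above. The skill names and parameter values are hypothetical.

```python
import math

def pfa_probability(skills, params, successes, failures):
    """Probability of a correct response under a PFA-style logistic model.

    skills    -- skills tagged on the item (an item may carry several)
    params    -- skill -> (beta, gamma, rho): easiness, success weight, failure weight
    successes -- skill -> number of the student's prior correct attempts
    failures  -- skill -> number of the student's prior incorrect attempts
    """
    logit = 0.0
    for skill in skills:
        beta, gamma, rho = params[skill]
        # Each skill adds its own term, so multi-skill items need no special case.
        logit += beta + gamma * successes.get(skill, 0) + rho * failures.get(skill, 0)
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical parameters; in practice they would be fitted to log data.
params = {
    "place_value": (-0.5, 0.30, 0.05),
    "decimal_addition": (-0.2, 0.25, 0.10),
}
p = pfa_probability(
    ["place_value", "decimal_addition"],
    params,
    successes={"place_value": 4, "decimal_addition": 2},
    failures={"place_value": 1},
)
print(f"P(correct) = {p:.3f}")
```

An error on an item where this probability is high is a natural candidate for a careless error, though how the BKFC model combines such an estimate with behavioral features is left to the paper.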


Neurosymbolic AI for Travel Demand Prediction: Integrating Decision Tree Rules into Neural Networks

arXiv.org Artificial Intelligence

Travel demand prediction is crucial for optimizing transportation planning, resource allocation, and infrastructure development, ensuring efficient mobility and economic sustainability. This study introduces a Neurosymbolic Artificial Intelligence (Neurosymbolic AI) framework that integrates decision tree (DT)-based symbolic rules with neural networks (NNs) to predict travel demand, leveraging the interpretability of symbolic reasoning and the predictive power of neural learning. The framework utilizes data from diverse sources, including geospatial, economic, and mobility datasets, to build a comprehensive feature set. DTs are employed to extract interpretable if-then rules that capture key patterns, which are then incorporated as additional features into a NN to enhance its predictive capabilities. Experimental results show that the combined dataset, enriched with symbolic rules, consistently outperforms standalone datasets across multiple evaluation metrics, including Mean Absolute Error (MAE), \(R^2\), and Common Part of Commuters (CPC). Rules selected at finer variance thresholds (e.g., 0.0001) demonstrate superior effectiveness in capturing nuanced relationships, reducing prediction errors, and aligning with observed commuter patterns. By merging symbolic and neural learning paradigms, this Neurosymbolic approach achieves both interpretability and accuracy.
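
The two ingredients named above, rule-derived features and the CPC metric, can be sketched in a few lines. This is an illustration under assumptions, not the paper's pipeline: the rules below are hand-written stand-ins for rules extracted from a fitted decision tree, and the field names (`population`, `distance_km`, `median_income`) are hypothetical. The CPC formula is the usual one, twice the shared flow divided by the total observed plus predicted flow.

```python
def rule_features(row, rules):
    """Encode each if-then rule as a binary feature (1.0 if the rule fires)."""
    return [1.0 if rule(row) else 0.0 for rule in rules]

def cpc(observed, predicted):
    """Common Part of Commuters: 1.0 for identical flows, 0.0 for disjoint flows."""
    total = sum(observed) + sum(predicted)
    if total == 0:
        return 0.0
    shared = sum(min(o, p) for o, p in zip(observed, predicted))
    return 2.0 * shared / total

# Hypothetical rules standing in for paths extracted from a decision tree.
rules = [
    lambda r: r["population"] > 50_000 and r["distance_km"] <= 20.0,
    lambda r: r["median_income"] <= 40_000,
]

row = {"population": 80_000, "distance_km": 12.5, "median_income": 55_000}
base = [row["population"], row["distance_km"], row["median_income"]]
# The neural network would be trained on this augmented vector.
augmented = base + rule_features(row, rules)
print(augmented)
print(cpc([10, 20], [5, 25]))
```

Because the rule features are binary and human-readable, a large learned weight on one of them can be traced back to a specific if-then condition.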


What should an AI assessor optimise for?

arXiv.org Artificial Intelligence

An AI assessor is an external, ideally independent system that predicts an indicator, e.g., a loss value, of another AI system. Assessors can leverage information from the test results of many other AI systems and have the flexibility of being trained on any loss function or scoring rule: from squared error to toxicity metrics. Here we address the question: is it always optimal to train the assessor for the target metric? Or could it be better to train for a different metric and then map predictions back to the target metric? Using twenty regression and classification problems with tabular data, we experimentally explore this question for, respectively, regression losses and classification scores with monotonic and non-monotonic mappings and find that, contrary to intuition, optimising for more informative metrics is not generally better. Surprisingly, some monotonic transformations are promising. For example, the logistic loss is useful for minimising absolute or quadratic errors in regression, and the logarithmic score helps maximise quadratic or spherical scores in classification.


Machine Learning Models for Reinforced Concrete Pipes Condition Prediction: The State-of-the-Art Using Artificial Neural Networks and Multiple Linear Regression in a Wisconsin Case Study

arXiv.org Artificial Intelligence

The aging sewer infrastructure in the U.S., covering 2.1 million kilometers, encounters increasing structural issues, resulting in around 75,000 yearly sanitary sewer overflows that present serious economic, environmental, and public health hazards. Conventional inspection techniques and deterministic models do not account for the unpredictable nature of sewer decline, whereas probabilistic methods depend on extensive historical data, which is frequently lacking or incomplete. This research intends to enhance predictive accuracy for the condition of sewer pipelines through two machine learning models, artificial neural networks (ANNs) and multiple linear regression (MLR), by integrating factors such as pipe age, material, diameter, environmental influences, and PACP ratings. ANNs utilized ReLU activation functions and Adam optimization, whereas MLR applied regularization to address multicollinearity, with both models assessed through metrics like RMSE, MAE, and $R^2$. The findings indicated that ANNs surpassed MLR, attaining an $R^2$ of 0.9066 compared to MLR's 0.8474, successfully modeling nonlinear relationships while preserving generalization. MLR, on the other hand, offered enhanced interpretability by pinpointing significant predictors such as residual buildup. The analysis identified pipe length, age, and pipe diameter as the key predictors of pipeline degradation, while depth, soil type, and segment showed minimal influence. Future studies ought to prioritize hybrid models that merge the accuracy of ANNs with the interpretability of MLR, incorporating advanced methods such as SHAP analysis and transfer learning to improve scalability in managing infrastructure and promoting environmental sustainability.
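
The three evaluation metrics named above have standard definitions, sketched here for reference. The function name and the toy values are illustrative only and unrelated to the Wisconsin data.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, and R^2 for paired true and predicted values.

    R^2 is undefined (division by zero) when every true value is identical.
    """
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"rmse": rmse, "mae": mae, "r2": r2}

# Toy condition ratings (made up) and model predictions.
metrics = regression_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
print(metrics)
```

RMSE penalizes large errors more heavily than MAE, while $R^2$ reports the share of variance in the target that the model explains, which is why the three are usually read together.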


Enhance Learning Efficiency of Oblique Decision Tree via Feature Concatenation

arXiv.org Machine Learning

Oblique Decision Tree (ODT) separates the feature space by linear projections, as opposed to the conventional Decision Tree (DT) that forces axis-parallel splits. ODT has been proven to have a stronger representation ability than DT, as it provides a way to create shallower tree structures while still approximating complex decision boundaries. However, its learning efficiency is still insufficient, since the linear projections cannot be transmitted to the child nodes, resulting in a waste of model parameters. In this work, we propose an enhanced ODT method with Feature Concatenation (\texttt{FC-ODT}), which enables in-model feature transformation to transmit the projections along the decision paths. Theoretically, we prove that our method enjoys a faster consistency rate w.r.t. the tree depth, indicating that our method possesses a significant advantage in generalization performance, especially for shallow trees. Experiments show that \texttt{FC-ODT} can outperform the other state-of-the-art decision trees with a limited tree depth.
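
The difference between the two split types is easy to see in code. The sketch below contrasts an axis-parallel test with an oblique one on three toy points that no single-feature threshold can separate. The last helper, `concat_projection`, is only a loose reading of "transmitting the projection along the decision path" (appending the parent's projection as an extra feature for its children) and should not be taken as the FC-ODT algorithm. All names and values are assumptions.

```python
def dot(weights, x):
    return sum(w * xi for w, xi in zip(weights, x))

def axis_parallel_split(x, feature, threshold):
    """Conventional DT test: compare one feature with a threshold."""
    return x[feature] <= threshold

def oblique_split(x, weights, threshold):
    """ODT test: compare a linear projection of all features with a threshold."""
    return dot(weights, x) <= threshold

def concat_projection(x, weights):
    """Illustrative only: pass the parent's projection down as an extra feature."""
    return list(x) + [dot(weights, x)]

# Two points of class A and one of class B (toy data).
class_a = [(0.1, 0.8), (0.8, 0.1)]
class_b = [(0.6, 0.6)]

# One oblique split, x0 + x1 <= 1, separates the classes...
weights, threshold = (1.0, 1.0), 1.0
print([oblique_split(x, weights, threshold) for x in class_a + class_b])

# ...whereas a threshold on x0 alone sends the class-B point to the same side
# as one of the class-A points.
print([axis_parallel_split(x, 0, 0.7) for x in class_a + class_b])

# A child node would then see the augmented vector.
print(concat_projection((0.5, 0.25), weights))
```

An axis-parallel tree needs at least two levels to isolate the class-B point here, which is the representational advantage of shallow oblique trees that the abstract refers to.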


Dynamics of Transient Structure in In-Context Linear Regression Transformers

arXiv.org Artificial Intelligence

Modern deep neural networks display striking examples of rich internal computational structure. Uncovering principles governing the development of such structure is a priority for the science of deep learning. In this paper, we explore the transient ridge phenomenon: when transformers are trained on in-context linear regression tasks with intermediate task diversity, they initially behave like ridge regression before specializing to the tasks in their training distribution. This transition from a general solution to a specialized solution is revealed by joint trajectory principal component analysis. Further, we draw on the theory of Bayesian internal model selection to suggest a general explanation for the phenomenon of transient structure in transformers, based on an evolving tradeoff between loss and complexity. We empirically validate this explanation by measuring the model complexity of our transformers as defined by the local learning coefficient.
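
For reference, "behaving like ridge regression" in an in-context setting can be read as follows. The notation is assumed here rather than taken from the paper: $X\in\mathbb{R}^{m\times d}$ stacks the $m$ context inputs, $y\in\mathbb{R}^m$ their targets, and $\lambda>0$ is a regularization strength whose value the abstract does not specify.

```latex
% Ridge estimate computed from the in-context examples (X, y), followed by
% the prediction for a new query input x_query. Symbols are assumptions.
\hat{w}_{\lambda} = \bigl(X^{\mathsf{T}} X + \lambda I_d\bigr)^{-1} X^{\mathsf{T}} y,
\qquad
\hat{y}_{\text{query}} = x_{\text{query}}^{\mathsf{T}}\, \hat{w}_{\lambda}.
```

This estimator depends only on the context, not on which tasks were seen in training, which is what makes it the "general" solution in contrast to the specialized one.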


In Pursuit of Predictive Models of Human Preferences Toward AI Teammates

arXiv.org Artificial Intelligence

We seek measurable properties of AI agents that make them better or worse teammates from the subjective perspective of human collaborators. Our experiments use the cooperative card game Hanabi -- a common benchmark for AI-teaming research. We first evaluate AI agents on a set of objective metrics based on task performance, information theory, and game theory, which are measurable without human interaction. Next, we evaluate subjective human preferences toward AI teammates in a large-scale (N=241) human-AI teaming experiment. Finally, we correlate the AI-only objective metrics with the human subjective preferences. Our results refute common assumptions from prior literature on reinforcement learning, revealing new correlations between AI behaviors and human preferences. We find that the final game score a human-AI team achieves is less predictive of human preferences than esoteric measures of AI action diversity, strategic dominance, and ability to team with other AI. In the future, these correlations may help shape reward functions for training human-collaborative AI.