px 0
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- Asia > Middle East > Jordan (0.04)
Stability and Accuracy Trade-offs in Statistical Estimation
Chakraborty, Abhinav, Luo, Yuetian, Barber, Rina Foygel
Algorithmic stability is a central concept in statistics and learning theory that measures how sensitive an algorithm's output is to small changes in the training data. Stability plays a crucial role in understanding generalization, robustness, and replicability, and a variety of stability notions have been proposed in different learning settings. However, while stability entails desirable properties, it is typically not sufficient on its own for statistical learning -- and indeed, it may be at odds with accuracy, since an algorithm that always outputs a constant function is perfectly stable but statistically meaningless. Thus, it is essential to understand the potential statistical cost of stability. In this work, we address this question by adopting a statistical decision-theoretic perspective, treating stability as a constraint in estimation. Focusing on two representative notions-worst-case stability and average-case stability-we first establish general lower bounds on the achievable estimation accuracy under each type of stability constraint. We then develop optimal stable estimators for four canonical estimation problems, including several mean estimation and regression settings. Together, these results characterize the optimal trade-offs between stability and accuracy across these tasks. Our findings formalize the intuition that average-case stability imposes a qualitatively weaker restriction than worst-case stability, and they further reveal that the gap between these two can vary substantially across different estimation problems.
- Asia > Middle East > Jordan (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- Asia > Middle East > Jordan (0.04)
Leveraging Equivariances and Symmetries in the Control Barrier Function Synthesis
Wiltz, Adrian, Dimarogonas, Dimos V.
The synthesis of Control Barrier Functions (CBFs) often involves demanding computations or a meticulous construction. However, structural properties of the system dynamics and constraints have the potential to mitigate these challenges. In this paper, we explore how equivariances in the dynamics, loosely speaking a form of symmetry, can be leveraged in the CBF synthesis. Although CBFs are generally not inherently symmetric, we show how equivariances in the dynamics and symmetries in the constraints induce symmetries in CBFs derived through reachability analysis. This insight allows us to infer their CBF values across the entire domain from their values on a subset, leading to significant computational savings. Interestingly, equivariances can be even leveraged to the CBF synthesis for non-symmetric constraints. Specifically, we show how a partially known CBF can be leveraged together with equivariances to construct a CBF for various new constraints. Throughout the paper, we provide examples illustrating the theoretical findings. Furthermore, a numerical study investigates the computational gains from invoking equivariances into the CBF synthesis.
Learning non-equilibrium diffusions with Schrödinger bridges: from exactly solvable to simulation-free
Zhang, Stephen Y., Stumpf, Michael P H
We consider the Schrödinger bridge problem which, given ensemble measurements of the initial and final configurations of a stochastic dynamical system and some prior knowledge on the dynamics, aims to reconstruct the "most likely" evolution of the system compatible with the data. Most existing literature assume Brownian reference dynamics and are implicitly limited to potential-driven dynamics. We depart from this regime and consider reference processes described by a multivariate Ornstein-Uhlenbeck process with generic drift matrix $\mathbf{A} \in \mathbb{R}^{d \times d}$. When $\mathbf{A}$ is asymmetric, this corresponds to a non-equilibrium system with non-conservative forces at play: this is important for applications to biological systems, which are naturally exist out-of-equilibrium. In the case of Gaussian marginals, we derive explicit expressions that characterise the solution of both the static and dynamic Schrödinger bridge. For general marginals, we propose mvOU-OTFM, a simulation-free algorithm based on flow and score matching for learning the Schrödinger bridge. In application to a range of problems based on synthetic and real single cell data, we demonstrate that mvOU-OTFM achieves higher accuracy compared to competing methods, whilst being significantly faster to train.
Revisiting Optimism and Model Complexity in the Wake of Overparameterized Machine Learning
Patil, Pratik, Du, Jin-Hong, Tibshirani, Ryan J.
Common practice in modern machine learning involves fitting a large number of parameters relative to the number of observations. These overparameterized models can exhibit surprising generalization behavior, e.g., ``double descent'' in the prediction error curve when plotted against the raw number of model parameters, or another simplistic notion of complexity. In this paper, we revisit model complexity from first principles, by first reinterpreting and then extending the classical statistical concept of (effective) degrees of freedom. Whereas the classical definition is connected to fixed-X prediction error (in which prediction error is defined by averaging over the same, nonrandom covariate points as those used during training), our extension of degrees of freedom is connected to random-X prediction error (in which prediction error is averaged over a new, random sample from the covariate distribution). The random-X setting more naturally embodies modern machine learning problems, where highly complex models, even those complex enough to interpolate the training data, can still lead to desirable generalization performance under appropriate conditions. We demonstrate the utility of our proposed complexity measures through a mix of conceptual arguments, theory, and experiments, and illustrate how they can be used to interpret and compare arbitrary prediction models.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- Research Report (0.50)
- Overview (0.45)
A Low Rank Neural Representation of Entropy Solutions
We construct a new representation of entropy solutions to nonlinear scalar conservation laws with a smooth convex flux function in a single spatial dimension. The representation is a generalization of the method of characteristics and posseses a compositional form. While it is a nonlinear representation, the embedded dynamics of the solution in the time variable is linear. This representation is then discretized as a manifold of implicit neural representations where the feedforward neural network architecture has a low rank structure. Finally, we show that the low rank neural representation with a fixed number of layers and a small number of coefficients can approximate any entropy solution regardless of the complexity of the shock topology, while retaining the linearity of the embedded dynamics.
Mamba State-Space Models Can Be Strong Downstream Learners
Halloran, John T., Gulati, Manbir, Roysdon, Paul F.
Mamba state-space models (SSMs) have recently outperformed state-of-the-art (SOTA) Transformer large language models (LLMs) in various tasks and been widely adapted. However, Mamba's downstream learning capabilities remain either unexplored$\unicode{x2013}$e.g., mixed-precision (MPFT) and parameter-efficient fine-tuning (PEFT)--or under-evaluated$\unicode{x2013}$e.g., in-context learning (ICL). For the latter, recent works reported Mamba's ICL rivals SOTA Transformer LLMs using non-standard benchmarks. In contrast, we show that on standard benchmarks, pretrained Mamba models achieve only 38% of the ICL performance improvements (over zero-shot) of comparable Transformers. Enabling MPFT and PEFT in Mamba architectures is challenging due to recurrent dynamics and highly customized CUDA kernels, respectively. However, we prove that Mamba's recurrent dynamics are robust to small input changes using dynamical systems theory. Empirically, we show that performance changes in Mamba's inference and fine-tuning due to mixed-precision align with Transformer LLMs. Furthermore, we show that targeting key memory buffers in Mamba's customized CUDA kernels for low-rank adaptation regularizes SSM parameters, thus achieving parameter efficiency while retaining speedups. We show that combining MPFT and PEFT enables up to 2.15 times more tokens-per-second and 65.5% reduced per-token-memory compared to full Mamba fine-tuning, while achieving up to 81.5% of the ICL performance improvements (over zero-shot) of comparably fine-tuned Transformers.
Local Observability of VINS and LINS
Under the assumption that there exist two features observed by the camera without occlusion, the unobservable directions of VINS are uniformly globally translation and global rotations about the gravity vector. The unobservable directions of LINS are same as VINS, while only one feature need to be observed. Also, a constraint in Observability-Constrained VINS (OC-VINS) is proved.
- Asia > China > Beijing > Beijing (0.04)
- North America > United States > Minnesota > Ramsey County > Saint Paul (0.04)
Statistical Inference of Optimal Allocations I: Regularities and their Implications
In this paper, we develop a functional differentiability approach for solving statistical optimal allocation problems. We first derive Hadamard differentiability of the value function through a detailed analysis of the general properties of the sorting operator. Central to our framework are the concept of Hausdorff measure and the area and coarea integration formulas from geometric measure theory. Building on our Hadamard differentiability results, we demonstrate how the functional delta method can be used to directly derive the asymptotic properties of the value function process for binary constrained optimal allocation problems, as well as the two-step ROC curve estimator. Moreover, leveraging profound insights from geometric functional analysis on convex and local Lipschitz functionals, we obtain additional generic Fr\'echet differentiability results for the value functions of optimal allocation problems. These compelling findings motivate us to study carefully the first order approximation of the optimal social welfare. In this paper, we then present a double / debiased estimator for the value functions. Importantly, the conditions outlined in the Hadamard differentiability section validate the margin assumption from the statistical classification literature employing plug-in methods that justifies a faster convergence rate.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > New York (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.66)