 spline


KANs need curvature: penalties for compositional smoothness

arXiv.org Machine Learning

However, the activations of well-fitting KANs tend to exhibit pathologically high-curvature oscillations, making them difficult to interpret, and standard regularization penalties do not prevent this. Here we derive a basis-agnostic curvature penalty and show that penalized models can maintain accuracy while achieving substantially smoother activations. Accounting for how function composition shapes curvature, we prove an upper bound on the full model's curvature relative to the curvature penalty, and use this to motivate richer forms of penalties. Scientific machine learning is increasingly bottlenecked by the trade-off between accuracy and interpretability. Results such as ours that improve interpretability without sacrificing accuracy will further strengthen KANs as a practical tool for both prediction and insight.
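The snippet above does not give the paper's penalty in closed form. As a rough illustration of what a curvature penalty on a single univariate activation measures, the following minimal sketch approximates the integral of the squared second derivative with central finite differences. It works on any callable, so it makes no assumption about the basis. The function name `curvature_penalty`, the grid size, and the two test activations are illustrative assumptions, not the authors' implementation.

```python
import math


def curvature_penalty(phi, lo, hi, n=201):
    """Approximate the integral of phi''(x)**2 over [lo, hi].

    Illustrative sketch only: central second differences on a uniform
    grid, summed with a Riemann rule over the interior grid points.
    """
    h = (hi - lo) / (n - 1)
    xs = [lo + i * h for i in range(n)]
    ys = [phi(x) for x in xs]
    total = 0.0
    for i in range(1, n - 1):
        second = (ys[i - 1] - 2.0 * ys[i] + ys[i + 1]) / (h * h)
        total += second * second * h
    return total


# Hypothetical activations: a smooth parabola, and the same parabola
# with a small high-frequency oscillation added on top.
smooth = curvature_penalty(lambda x: x * x, -1.0, 1.0)
wiggly = curvature_penalty(
    lambda x: x * x + 0.05 * math.sin(40.0 * x), -1.0, 1.0
)
```

The two activations differ by at most 0.05 in value, yet the oscillating one has a far larger penalty, which is the kind of behavior a curvature penalty is meant to discourage.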




where $N_{s,k}(t) = \frac{k}{\tau_{s+k} - \tau_s} N_{s,k-1}(t)$

Neural Information Processing Systems

We prove this by induction. Suppose that the formula holds for k up to n; we show that it also holds for k = n+1. By the definition in Eq. 4 and the chain rule, we get: $N_{s,n+1}(t) = t - \tau_s \dots$

A.2 Spline representation

In this section, we give error bounds for the spline representation. For simplicity, we consider the 1D scenario and assume that the target function $u: [0,1] \to \mathbb{R}$ is periodic and defined on the unit interval $\Omega = [0,1]$.
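The snippet is truncated before the full recursion for the basis functions appears. For orientation, here is a minimal sketch of the standard Cox-de Boor recursion for B-spline basis functions N_{s,k} on a knot vector τ. It is an assumption that Eq. 4 in the paper refers to this recursion. The function name `bspline_basis` and the example knots are illustrative.

```python
def bspline_basis(s, k, t, knots):
    """Value at t of the degree-k B-spline basis function N_{s,k}.

    Standard Cox-de Boor recursion; terms with a zero denominator
    (repeated knots) are treated as zero by convention.
    """
    if k == 0:
        return 1.0 if knots[s] <= t < knots[s + 1] else 0.0
    left_den = knots[s + k] - knots[s]
    right_den = knots[s + k + 1] - knots[s + 1]
    left = 0.0
    right = 0.0
    if left_den != 0:
        left = (t - knots[s]) / left_den * bspline_basis(s, k - 1, t, knots)
    if right_den != 0:
        right = (
            (knots[s + k + 1] - t) / right_den
            * bspline_basis(s + 1, k - 1, t, knots)
        )
    return left + right


# Example: quadratic (k = 2) basis on uniform integer knots.
knots = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
degree = 2
n_basis = len(knots) - degree - 1
values = [bspline_basis(s, degree, 2.5, knots) for s in range(n_basis)]
```

Inside the interval where all basis functions of the given degree are defined (here [2, 4]), the values sum to one, which is a quick sanity check on any implementation.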



Learning Nonlinear Regime Transitions via Semi-Parametric State-Space Models

arXiv.org Machine Learning

We develop a semi-parametric state-space model for time-series data with latent regime transitions. Classical Markov-switching models use fixed parametric transition functions, such as logistic or probit links, which restrict flexibility when transitions depend on nonlinear and context-dependent effects. We replace this assumption with learned functions $f_0, f_1 \in \mathcal{H}$, where $\mathcal{H}$ is either a reproducing kernel Hilbert space or a spline approximation space, and define transition probabilities as $p_{jk,t} = \mathrm{sigmoid}(f(\mathbf{x}_{t-1}))$. The transition functions are estimated jointly with emission parameters using a generalized Expectation-Maximization algorithm. The E-step uses the standard forward-backward recursion, while the M-step reduces to a penalized regression problem with weights from smoothed occupation measures. We establish identifiability conditions and provide a consistency argument for the resulting estimators. Experiments on synthetic data show improved recovery of nonlinear transition dynamics compared to parametric baselines. An empirical study on financial time series demonstrates improved regime classification and earlier detection of transition events.
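To make the covariate-dependent transitions concrete, the following minimal sketch builds a two-regime transition matrix from two functions and runs the forward half of the forward-backward recursion. The abstract does not say how f_0 and f_1 map to the entries p_{jk,t}; the sketch assumes that f_j gives the log-odds of leaving regime j. The functions, covariates, and emission likelihoods are all made up for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def transition_matrix(f0, f1, x_prev):
    """2x2 matrix; row j is the next-regime distribution given regime j."""
    p01 = sigmoid(f0(x_prev))  # assumption: probability of leaving regime 0
    p10 = sigmoid(f1(x_prev))  # assumption: probability of leaving regime 1
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]


def forward_filter(init, emission_lik, xs, f0, f1):
    """Normalized forward pass.

    emission_lik[t][k] is the likelihood of observation t under regime k,
    and xs[t] is the covariate observed at time t.
    """
    alpha = [init[k] * emission_lik[0][k] for k in range(2)]
    norm = sum(alpha)
    alpha = [a / norm for a in alpha]
    out = [alpha]
    for t in range(1, len(emission_lik)):
        P = transition_matrix(f0, f1, xs[t - 1])
        pred = [sum(alpha[j] * P[j][k] for j in range(2)) for k in range(2)]
        alpha = [pred[k] * emission_lik[t][k] for k in range(2)]
        norm = sum(alpha)
        alpha = [a / norm for a in alpha]
        out.append(alpha)
    return out


# Hypothetical transition functions and toy inputs.
f0 = lambda x: 2.0 * x - 1.0
f1 = lambda x: -x
lik = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
filtered = forward_filter([0.5, 0.5], lik, [0.0, 0.5, 1.0], f0, f1)
```

A full E-step would add the backward pass to obtain smoothed regime and transition probabilities, which would then serve as weights in the penalized regression of the M-step.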



Neural Spline Flows

Neural Information Processing Systems

Explicit density evaluation is required in many statistical procedures, while synthesis of novel examples can enable agents to imagine and plan in an environment prior to choosing an action.