


A joint optimization approach to identifying sparse dynamics using least squares kernel collocation

Hsu, Alexander W., Salas, Ike W. Griss, Stevens-Haas, Jacob M., Kutz, J. Nathan, Aravkin, Aleksandr, Hosseini, Bamdad

arXiv.org Machine Learning

The identification of ordinary differential equations (ODEs) and dynamical systems is a fundamental problem in control [32, 59, 60], data assimilation [42, 84], and more recently in scientific machine learning (ML) [11, 72, 74]. While algorithms such as Sparse Identification of Nonlinear Dynamics (SINDy) and its variants [46] are widely used by practitioners, they often fail in scenarios where observations of the state of the system are scarce, indirect, and noisy. In such scenarios, SINDy-type methods must be modified to enforce additional constraints on the recovered equations so that they remain consistent with the observational data. Put simply, traditional SINDy-type methods work in two steps: (1) the data is used to filter the state of the system and estimate the derivatives, and (2) the filtered state is used to learn the underlying dynamics. In the regime of scarce, noisy, and incomplete data, step 1 is inaccurate, and this inaccuracy propagates to poor results in step 2. In this paper, we propose an all-at-once approach to filtering and equation learning based on collocation in a reproducing kernel Hilbert space (RKHS), which we term Joint SINDy (JSINDy), and show that the issues above can be mitigated by performing both steps together. This joins a broader class of dynamics-informed methods that integrate the governing equations directly into the learning objective, either as hard constraints or as least-squares relaxations, thereby coupling the problems of state estimation and model discovery. Representative examples include physics-informed and sparse-regression frameworks based on neural networks, splines, kernels, finite differences, and adjoint methods [21, 27, 39, 41, 72, 73, 88].
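The contrast between the two-step pipeline and the joint approach can be sketched numerically. The following toy alternating scheme (an illustrative sketch, not the authors' JSINDy code; all names and parameter choices here are assumptions) fits the linear ODE x' = w·x: the state is represented by coefficients in a Gaussian RKHS, and each iteration couples the data misfit with a collocation penalty on the dynamics residual, alternating between the kernel coefficients and the dynamics coefficient.

```python
import numpy as np

# Toy joint filtering + equation learning for x' = w*x (true w = -0.5).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 4.0, 40)
y = np.exp(-0.5 * t) + 0.01 * rng.standard_normal(t.size)

ell = 1.0                                  # Gaussian kernel length scale
D = t[:, None] - t[None, :]
K = np.exp(-D**2 / (2 * ell**2))           # k(t_i, t_j): state at collocation points
Kd = (-D / ell**2) * K                     # d/dt_i k(t_i, t_j): derivative of the state

n = t.size
c = np.linalg.solve(K + 1e-2 * np.eye(n), y)   # initial kernel-ridge smooth
lam = 1.0                                       # weight of the dynamics penalty
for _ in range(5):
    x, xd = K @ c, Kd @ c
    w = (xd @ x) / (x @ x)                      # regress x' on x (the "equation learning" step)
    A = Kd - w * K                              # collocation residual operator for x' = w*x
    # joint objective: ||K c - y||^2 + lam * ||(Kd - w K) c||^2
    c = np.linalg.solve(K.T @ K + lam * A.T @ A + 1e-6 * np.eye(n), K.T @ y)

x, xd = K @ c, Kd @ c
w = (xd @ x) / (x @ x)
```

The dynamics penalty feeds back into the smoothing solve, so the filtered state is forced to be consistent with the currently estimated equation rather than being fixed once up front.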


l1-Norm Minimization with Regula Falsi Type Root Finding Methods

Vural, Metin, Aravkin, Aleksandr Y., Stańczak, Sławomir

arXiv.org Machine Learning

Sparse level-set formulations allow practitioners to find the minimum l1-norm solution subject to likelihood constraints. Prior art requires this constraint to be convex. In this letter, we develop an efficient approach for nonconvex likelihoods, using Regula Falsi root-finding techniques to solve the level-set formulation. Regula Falsi methods are simple, derivative-free, and efficient, and the approach provably extends level-set methods to the broader class of nonconvex inverse problems. Practical performance is illustrated using l1-regularized Student's t inversion, a nonconvex formulation used to develop outlier-robust approaches.
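The root-finding idea can be illustrated on a toy scalar problem (a sketch under assumed data, not the letter's algorithm): the level-set value function v(τ) = min over |x| ≤ τ of (ax − b)² is available in closed form, and regula falsi locates the level τ with v(τ) = σ using only function values, no derivatives.

```python
# Regula falsi on the value function of a toy level-set problem:
# v(tau) = min_{|x| <= tau} (a*x - b)^2, solve v(tau) = sigma for tau.
a, b, sigma = 2.0, 3.0, 1.0

def v(tau):
    x = min(max(b / a, -tau), tau)   # closed-form solution of the inner problem
    return (a * x - b) ** 2

g = lambda tau: v(tau) - sigma       # the target level is a root of g
lo, hi = 0.0, b / a                  # bracket: g(lo) > 0 > g(hi)
glo, ghi = g(lo), g(hi)
for _ in range(100):
    tau = (lo * ghi - hi * glo) / (ghi - glo)   # false-position (secant) point
    gt = g(tau)
    if abs(gt) < 1e-10:
        break
    if gt > 0:                       # keep the bracket around the root
        lo, glo = tau, gt
    else:
        hi, ghi = tau, gt
```

Here the exact root is τ = 1 (since (2τ − 3)² = 1 on the decreasing branch), and the bracketing update never requires v to be differentiable or convex in any structured way, which is the property the letter exploits for nonconvex likelihoods.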


Boosting as a kernel-based method

Aravkin, Aleksandr Y., Bottegal, Giulio, Pillonetto, Gianluigi

arXiv.org Machine Learning

Boosting combines weak (biased) learners to obtain effective learning algorithms for classification and prediction. In this paper, we show a connection between boosting and kernel-based methods, highlighting both theoretical and practical applications. In the context of $\ell_2$ boosting, we start with a weak linear learner defined by a kernel $K$. We show that boosting with this learner is equivalent to estimation with a special {\it boosting kernel} that depends on $K$, as well as on the regression matrix, noise variance, and hyperparameters. The number of boosting iterations is modeled as a continuous hyperparameter, and fit along with other parameters using standard techniques. We then generalize the boosting kernel to a broad new class of boosting approaches for more general weak learners, including those based on the $\ell_1$, hinge and Vapnik losses. The approach allows fast hyperparameter tuning for this general class, and has a wide range of applications, including robust regression and classification. We illustrate some of these applications with numerical examples on synthetic and real data.
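The ℓ2 case of this equivalence is easy to verify numerically. The sketch below (illustrative, not the paper's code; the kernel, data, and hyperparameters are assumptions) uses a weak linear learner given by the smoother S = K(K + λI)⁻¹: m boosting iterations on the residuals reproduce exactly the one-shot estimator (I − (I − S)^m) y, i.e. estimation with a single induced "boosting kernel" smoother.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0, 1, 30)
y = np.sin(2 * np.pi * t) + 0.1 * rng.standard_normal(t.size)

K = np.exp(-(t[:, None] - t[None, :])**2 / (2 * 0.1**2))  # Gaussian kernel
lam, m = 1.0, 5
I = np.eye(t.size)
S = K @ np.linalg.solve(K + lam * I, I)   # weak linear learner (smoother matrix)

# l2 boosting: repeatedly fit the current residual with the weak learner
f = np.zeros_like(y)
for _ in range(m):
    f = f + S @ (y - f)

# equivalent one-shot estimator with the induced "boosting kernel" smoother
f_kernel = (I - np.linalg.matrix_power(I - S, m)) @ y
```

Since f_m = (I − (I − S)^m) y holds exactly by induction, the iteration count m can be treated as just another smoother parameter, which is what allows it to be relaxed to a continuous hyperparameter and tuned jointly with the rest.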


Generalized system identification with stable spline kernels

Aravkin, Aleksandr Y., Burke, James V., Pillonetto, Gianluigi

arXiv.org Machine Learning

Regularized least-squares approaches have been successfully applied to linear system identification. Recent approaches use quadratic penalty terms on the unknown impulse response defined by stable spline kernels, which control model space complexity by leveraging regularity and bounded-input bounded-output stability. This paper extends linear system identification to a wide class of nonsmooth stable spline estimators, where regularization functionals and data misfits can be selected from a rich set of piecewise linear quadratic (PLQ) penalties. This class encompasses the 1-norm, Huber, and Vapnik penalties, in addition to the least-squares penalty, and the approach allows linear inequality constraints on the unknown impulse response. We develop a customized interior point solver, IPsolve, for the entire class of proposed formulations. By representing penalties through their conjugates, we provide a simple interface that lets the user specify any piecewise linear quadratic penalty for misfit and regularizer, together with inequality constraints on the response. The solver is locally quadratically convergent, with O(n^2(m+n)) arithmetic operations per iteration, for n impulse response coefficients and m output measurements. In the system identification context, where n << m, IPsolve is competitive with available alternatives, as illustrated by a comparison with TFOCS and libSVM. The modeling framework is illustrated with a range of numerical experiments, featuring robust formulations for contaminated data, relaxation systems, and nonnegativity and unimodality constraints on the impulse response. Incorporating constraints yields significant improvements in system identification. The solver used to obtain the results is distributed via an open source code repository.
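The effect of swapping the least-squares misfit for the Huber penalty from the PLQ family can be seen in a small sketch (a toy reweighted least-squares solve on synthetic data, not IPsolve and not its interior point method): gross outliers are downweighted by the Huber loss, while plain least squares is pulled off target.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 3
A = rng.standard_normal((n, p))
x_true = np.array([1.0, -2.0, 0.5])
y = A @ x_true + 0.1 * rng.standard_normal(n)
y[:20] += 8.0                                  # 10% gross outliers

x_ls = np.linalg.lstsq(A, y, rcond=None)[0]    # plain least-squares fit

# Huber fit via iteratively reweighted least squares
delta = 0.5                                    # Huber transition point
x_hub = x_ls.copy()
for _ in range(50):
    r = y - A @ x_hub
    w = np.where(np.abs(r) <= delta, 1.0, delta / np.abs(r))  # Huber weights
    Aw = A * w[:, None]                        # rows scaled by weights
    x_hub = np.linalg.solve(Aw.T @ A, Aw.T @ y)  # A^T W A x = A^T W y
```

IRLS is only one generic way to handle such a penalty; the point of the paper's conjugate representation is that Huber, Vapnik, 1-norm, and constrained variants all fit one interior point template instead of needing per-penalty solvers.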


Robust and Trend Following Student's t Kalman Smoothers

Aravkin, Aleksandr Y., Burke, James V., Pillonetto, Gianluigi

arXiv.org Machine Learning

We present a Kalman smoothing framework based on modeling errors using the heavy-tailed Student's t distribution, along with algorithms, convergence theory, an open-source general implementation, and several important applications. The computational effort per iteration grows linearly with the length of the time series, and all smoothers allow nonlinear process and measurement models. Robust smoothers form an important subclass within this framework: they work in situations where measurements are highly contaminated by noise or include data unexplained by the forward model. Highly robust smoothers are obtained by modeling measurement errors with the Student's t distribution, and outperform the recently proposed L1-Laplace smoother in extreme situations with data containing 20% or more outliers. A second special application, obtained by modeling process noise with the Student's t distribution, tracks sudden changes in the state. These features can be used separately or in tandem, and we present a general smoother algorithm and open-source implementation, together with convergence analysis that covers a wide range of smoothers. A key ingredient of our approach is a technique to deal with the non-convexity of the Student's t loss function. Numerical results for linear and nonlinear models illustrate the performance of the new smoothers for robust and tracking applications, as well as for mixed problems that have both types of features.
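One standard way to handle the non-convex Student's t loss is majorization: each iteration solves a reweighted Gaussian smoothing problem in which large residuals receive small weights. The sketch below applies that idea to a scalar random-walk smoother (a minimal illustration under assumed noise levels and a fixed penalty weight, not the paper's Kalman smoother), comparing it against the plain Gaussian (L2) smoother on outlier-contaminated data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x_true = np.cumsum(0.1 * rng.standard_normal(n))   # random-walk state
y = x_true + 0.3 * rng.standard_normal(n)
y[::20] += 6.0                                     # occasional gross outliers

D = np.diff(np.eye(n), axis=0)                     # first-difference operator
L = D.T @ D
lam, nu, sig2 = 5.0, 4.0, 0.3**2                   # penalty weight, t dof, meas. variance

# Gaussian (L2) smoother for comparison: (I/s^2 + lam L) x = y/s^2
x_l2 = np.linalg.solve(np.eye(n) / sig2 + lam * L, y / sig2)

# Student's t smoother via majorization: reweighted Gaussian solves
x_t = y.copy()
for _ in range(30):
    r = y - x_t
    w = (nu + 1.0) / (nu * sig2 + r**2)            # t-loss weights: small for big residuals
    x_t = np.linalg.solve(np.diag(w) + lam * L, w * y)
```

Each reweighted solve is a convex quadratic problem, so the non-convexity of the t loss is confined to the weight update, which is the same structural trick that lets the paper's smoothers keep linear per-iteration cost in the series length.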