 lyon



Random Controlled Differential Equations

Piatti, Francesco, Cass, Thomas, Turner, William F.

arXiv.org Machine Learning

We introduce a training-efficient framework for time-series learning that combines random features with controlled differential equations (CDEs). In this approach, large randomly parameterized CDEs act as continuous-time reservoirs, mapping input paths to rich representations. Only a linear readout layer is trained, resulting in fast, scalable models with strong inductive bias. Building on this foundation, we propose two variants: (i) Random Fourier CDEs (RF-CDEs), which lift the input signal using random Fourier features prior to the dynamics, providing a kernel-free approximation of RBF-enhanced sequence models; (ii) Random Rough DEs (R-RDEs), which operate directly on rough-path inputs via a log-ODE discretization, using log-signatures to capture higher-order temporal interactions while remaining stable and efficient. We prove that in the infinite-width limit, these models induce the RBF-lifted signature kernel and the rough signature kernel, respectively, offering a unified perspective on random-feature reservoirs, continuous-time deep architectures, and path-signature theory. We evaluate both models across a range of time-series benchmarks, demonstrating competitive or state-of-the-art performance. These methods provide a practical alternative to explicit signature computations, retaining their inductive bias while benefiting from the efficiency of random features.
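To make the reservoir idea concrete, the following is a minimal sketch of an Euler discretisation of a randomly parameterised CDE used as a fixed feature map. The vector field form `tanh(A_i z + b_i)`, the Gaussian initialisation, the scaling, and the function name `random_cde_features` are illustrative assumptions, not the construction from the paper. Fitting the linear readout (for example by ridge regression on these features) is omitted.

```python
import math
import random


def random_cde_features(path, hidden_dim=16, seed=0):
    """Terminal state of an Euler scheme for dZ = sum_i tanh(A_i Z + b_i) dX^i.

    `path` is a list of d-dimensional points. A_i, b_i and the initial state
    are drawn once at random and never trained (assumed parameterisation).
    """
    rng = random.Random(seed)
    d = len(path[0])
    scale = 1.0 / math.sqrt(hidden_dim)
    # One random matrix and one random bias per input channel.
    A = [[[rng.gauss(0.0, scale) for _ in range(hidden_dim)]
          for _ in range(hidden_dim)] for _ in range(d)]
    b = [[rng.gauss(0.0, 1.0) for _ in range(hidden_dim)] for _ in range(d)]
    z = [rng.gauss(0.0, 1.0) for _ in range(hidden_dim)]  # random initial state

    for prev, curr in zip(path, path[1:]):
        dx = [c - p for p, c in zip(prev, curr)]  # path increment
        dz = [0.0] * hidden_dim
        for i in range(d):
            for k in range(hidden_dim):
                pre = sum(A[i][k][j] * z[j] for j in range(hidden_dim)) + b[i][k]
                dz[k] += math.tanh(pre) * dx[i]
        z = [zk + dzk for zk, dzk in zip(z, dz)]
    return z


features = random_cde_features([[0.0, 0.0], [0.5, 1.0], [1.0, 0.5]], hidden_dim=8)
```

Because the state is driven only by increments of the input, a constant path leaves the random initial state unchanged, and the same seed always yields the same feature map.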


Novelty detection on path space

Gasteratos, Ioannis, Jacquier, Antoine, Lemercier, Maud, Lyons, Terry, Salvi, Cristopher

arXiv.org Machine Learning

We frame novelty detection on path space as a hypothesis testing problem with signature-based test statistics. Using transportation-cost inequalities of Gasteratos and Jacquier (2023), we obtain tail bounds for false positive rates that extend beyond Gaussian measures to laws of RDE solutions with smooth bounded vector fields, yielding estimates of quantiles and p-values. Exploiting the shuffle product, we derive exact formulae for smooth surrogates of conditional value-at-risk (CVaR) in terms of expected signatures, leading to new one-class SVM algorithms optimising smooth CVaR objectives. We then establish lower bounds on type-$\mathrm{II}$ error for alternatives with finite first moment, giving general power bounds when the reference measure and the alternative are absolutely continuous with respect to each other. Finally, we numerically evaluate the type-$\mathrm{I}$ error and statistical power of signature-based test statistics, using synthetic anomalous diffusion data and real-world molecular biology data.



Local regression on path spaces with signature metrics

Bayer, Christian, Gogolashvili, Davit, Pelizzari, Luca

arXiv.org Machine Learning

We study nonparametric regression and classification for path-valued data. We introduce a functional Nadaraya-Watson estimator that combines the signature transform from rough path theory with local kernel regression. The signature transform provides a principled way to encode sequential data through iterated integrals, enabling direct comparison of paths in a natural metric space. Our approach leverages signature-induced distances within the classical kernel regression framework, achieving computational efficiency while avoiding the scalability bottlenecks of large-scale kernel matrix operations. We establish finite-sample convergence bounds demonstrating favorable statistical properties of signature-based distances compared to traditional metrics in infinite-dimensional settings. We propose robust signature variants that provide stability against outliers, enhancing practical performance. Applications to both synthetic and real-world data - including stochastic differential equation learning and time series classification - demonstrate competitive accuracy while offering significant computational advantages over existing methods.
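As a rough illustration of how a signature-induced distance can plug into Nadaraya-Watson regression, here is a small sketch for piecewise-linear paths. The truncation at level 2, the Euclidean distance between truncated signatures, the Gaussian weighting, and the names `signature_level2` and `nadaraya_watson` are assumptions made for this example and are not taken from the paper.

```python
import math


def signature_level2(path):
    """Levels 1 and 2 of the signature of a piecewise-linear path, flattened.

    Built segment by segment with Chen's identity; a linear segment with
    increment dx contributes dx at level 1 and dx (x) dx / 2 at level 2.
    """
    d = len(path[0])
    s1 = [0.0] * d
    s2 = [[0.0] * d for _ in range(d)]
    for prev, curr in zip(path, path[1:]):
        dx = [c - p for p, c in zip(prev, curr)]
        for i in range(d):
            for j in range(d):
                s2[i][j] += s1[i] * dx[j] + 0.5 * dx[i] * dx[j]
        for i in range(d):
            s1[i] += dx[i]
    return s1 + [v for row in s2 for v in row]


def nadaraya_watson(train_paths, train_targets, query_path, bandwidth=1.0):
    """Kernel-weighted average of targets, weighted by signature distance."""
    q = signature_level2(query_path)
    weights = []
    for p in train_paths:
        s = signature_level2(p)
        dist2 = sum((a - b) ** 2 for a, b in zip(s, q))
        weights.append(math.exp(-dist2 / (2.0 * bandwidth ** 2)))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase bandwidth")
    return sum(w * y for w, y in zip(weights, train_targets)) / total


# Two paths with the same endpoints but opposite orientation.
paths = [[[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]],
         [[0.0, 0.0], [0.0, 1.0], [1.0, 1.0]]]
prediction = nadaraya_watson(paths, [1.0, -1.0], paths[0], bandwidth=0.1)
```

The two example paths share the same total increment and differ only in their level-2 terms, so a distance built on level 1 alone could not separate them.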



pySigLib -- Fast Signature-Based Computations on CPU and GPU

Shmelev, Daniil, Salvi, Cristopher

arXiv.org Machine Learning

Signature-based methods have recently gained significant traction in machine learning for sequential data. In particular, signature kernels have emerged as powerful discriminators and training losses for generative models on time-series, notably in quantitative finance. However, existing implementations do not scale to the dataset sizes and sequence lengths encountered in practice. We present pySigLib, a high-performance Python library offering optimised implementations of signatures and signature kernels on CPU and GPU, fully compatible with PyTorch's automatic differentiation. Beyond an efficient software stack for large-scale signature-based computation, we introduce a novel differentiation scheme for signature kernels that delivers accurate gradients at a fraction of the runtime of existing libraries.
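For readers unfamiliar with the object being computed, the signature kernel of two paths is known to solve a Goursat-type PDE, which can be approximated on a grid. The sketch below is a plain-Python, first-order explicit finite-difference scheme for piecewise-linear paths. It does not use or reproduce pySigLib's API or algorithms; the function name `signature_kernel` and the `refine` parameter are assumptions for illustration.

```python
def signature_kernel(x, y, refine=1):
    """Approximate the signature kernel of two piecewise-linear paths.

    Solves d^2 k / (ds dt) = <dx_s, dy_t> k with k = 1 on the boundary,
    using the explicit update
        k[i+1][j+1] = k[i+1][j] + k[i][j+1] + (inc - 1) * k[i][j].
    `refine` splits every segment into that many equal sub-steps.
    """
    def increments(path):
        incs = []
        for prev, curr in zip(path, path[1:]):
            step = [(c - p) / refine for p, c in zip(prev, curr)]
            incs.extend([step] * refine)
        return incs

    dx, dy = increments(x), increments(y)
    k = [[1.0] * (len(dy) + 1) for _ in range(len(dx) + 1)]
    for i, a in enumerate(dx):
        for j, b in enumerate(dy):
            inc = sum(ai * bi for ai, bi in zip(a, b))
            k[i + 1][j + 1] = k[i + 1][j] + k[i][j + 1] + (inc - 1.0) * k[i][j]
    return k[-1][-1]


value = signature_kernel([[0.0], [1.0]], [[0.0], [1.0]], refine=200)
```

The double loop costs time proportional to the product of the two (refined) path lengths, which is the scaling bottleneck that motivates optimised CPU and GPU implementations.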


ICE chief warns AI technology could lead to safety risks for agents: 'Fringe organizations'

FOX News

Acting ICE Director Todd Lyons explains how far-left groups are using 'reverse technology' to reveal the identities of federal immigration officers. Far-left organizations could be using artificial intelligence and other technology to reveal the identity of Immigration and Customs Enforcement agents, acting ICE Director Todd Lyons told Fox News Digital in an interview. Lyons' remarks come as Democrats in Congress recently proposed the VISIBLE Act, which would require clear identification of ICE agents and prevent masking of federal immigration authorities in public-facing circumstances. "If legislation passes to try to unmask ICE agents, they are not allowed to wear them, it runs the risk of agitators, different groups, you know, these fringe organizations using reverse technology, AI, to try to dox their families, try to get their identity, their home addresses," Lyons said of the reaction from agents on the ground. "We've heard elected officials say there shouldn't be any rest for ICE agents or their families." "So they're definitely concerned about that."


Leveraging Graph Structures and Large Language Models for End-to-End Synthetic Task-Oriented Dialogues

Medjad, Maya, Imbert, Hugo, Yun, Bruno, Szymocha, Raphaël, Armetta, Frédéric

arXiv.org Artificial Intelligence

Training task-oriented dialogue systems is both costly and time-consuming, due to the need for high-quality datasets encompassing diverse intents. Traditional methods depend on extensive human annotation, while recent advancements leverage large language models (LLMs) to generate synthetic data. However, these approaches often require custom prompts or code, limiting accessibility for non-technical users. We introduce GraphTOD, an end-to-end framework that simplifies the generation of task-oriented dialogues. Users can create dialogues by specifying transition graphs in JSON format. Our evaluation demonstrates that GraphTOD generates high-quality dialogues across various domains, significantly lowering the cost and complexity of dataset creation.
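To illustrate what specifying a transition graph in JSON could look like, here is a small sketch that loads a graph and samples one path of dialogue states and actions from it. The JSON schema (`initial_state`, `transitions`), the state and action names, and the function `sample_dialogue_path` are all hypothetical and are not GraphTOD's actual format or API. Turning each sampled step into natural-language utterances with an LLM is left out.

```python
import json
import random

# Hypothetical schema: each state maps action names to the next state.
GRAPH_JSON = """
{
  "initial_state": "start",
  "transitions": {
    "start": {"greet": "ask_need"},
    "ask_need": {"request_booking": "collect_date", "say_goodbye": "end"},
    "collect_date": {"provide_date": "confirm"},
    "confirm": {"accept": "end", "reject": "collect_date"}
  }
}
"""


def sample_dialogue_path(graph, rng, max_turns=20):
    """Random walk over the graph, returning (state, action, next_state) steps."""
    state = graph["initial_state"]
    steps = []
    for _ in range(max_turns):
        options = graph["transitions"].get(state)
        if not options:  # terminal state: no outgoing transitions
            break
        action = rng.choice(sorted(options))
        next_state = options[action]
        steps.append((state, action, next_state))
        state = next_state
    return steps


graph = json.loads(GRAPH_JSON)
for state, action, next_state in sample_dialogue_path(graph, random.Random(0)):
    print(f"{state} --{action}--> {next_state}")
```

Keeping the graph as data rather than code is what would let a non-technical user change the dialogue flow by editing the JSON alone.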


Rough kernel hedging

Cirone, Nicola Muca, Salvi, Cristopher

arXiv.org Machine Learning

Building on the functional-analytic framework of operator-valued kernels and un-truncated signature kernels, we propose a scalable, provably convergent signature-based algorithm for a broad class of high-dimensional, path-dependent hedging problems. We make minimal assumptions about market dynamics by modelling them as general geometric rough paths, yielding a fully model-free approach. Furthermore, through a representer theorem, we provide theoretical guarantees on the existence and uniqueness of a global minimum for the resulting optimization problem and derive an analytic solution under highly general loss functions. Similar to the popular deep hedging approach, but in a more rigorous fashion, our method can also incorporate additional features via the underlying operator-valued kernel, such as trading signals, news analytics, and past hedging decisions, closely aligning with true machine-learning practice.