prq
On the contraction properties of Sinkhorn semigroups
Akyildiz, O. Deniz, del Moral, Pierre, Miguez, Joaquin
We develop a novel semigroup contraction analysis based on Lyapunov techniques to prove the exponential convergence of Sinkhorn equations on weighted Banach spaces. This operator-theoretic framework yields exponential decays of Sinkhorn iterates towards Schr\"odinger bridges with respect to general classes of $\phi$-divergences as well as in weighted Banach spaces. To the best of our knowledge, these are the first results of this type in the literature on entropic transport and the Sinkhorn algorithm. We also illustrate the impact of these results in the context of multivariate linear Gaussian models as well as statistical finite mixture models including Gaussian-kernel density estimation of complex data distributions arising in generative models.
- North America > United States (0.14)
- Europe > Spain (0.14)
Reward Compatibility: A Framework for Inverse RL
Lazzati, Filippo, Mutti, Mirco, Metelli, Alberto
We provide an original theoretical study of Inverse Reinforcement Learning (IRL) through the lens of reward compatibility, a novel framework to quantify the compatibility of a reward with the given expert's demonstrations. Intuitively, a reward is more compatible with the demonstrations the closer the performance of the expert's policy computed with that reward is to the optimal performance for that reward. This generalizes the notion of feasible reward set, the most common framework in the theoretical IRL literature, for which a reward is either compatible or not compatible. The grayscale introduced by the reward compatibility is the key to extend the realm of provably efficient IRL far beyond what is attainable with the feasible reward set: from tabular to large-scale MDPs. We analyze the IRL problem across various settings, including optimal and suboptimal expert's demonstrations and both online and offline data collection. For all of these dimensions, we provide a tractable algorithm and corresponding sample complexity analysis, as well as various insights on reward compatibility and how the framework can pave the way to yet more general problem settings.
- Europe > Italy > Lombardy > Milan (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > Middle East > Israel > Haifa District > Haifa (0.04)
LASER: A new method for locally adaptive nonparametric regression
Chatterjee, Sabyasachi, Goswami, Subhajit, Mukherjee, Soumendu Sundar
In this article, we introduce \textsf{LASER} (Locally Adaptive Smoothing Estimator for Regression), a computationally efficient locally adaptive nonparametric regression method that performs variable bandwidth local polynomial regression. We prove that it adapts (near-)optimally to the local H\"{o}lder exponent of the underlying regression function \texttt{simultaneously} at all points in its domain. Furthermore, we show that there is a single ideal choice of a global tuning parameter under which the above mentioned local adaptivity holds. Despite the vast literature on nonparametric regression, instances of practicable methods with provable guarantees of such a strong notion of local adaptivity are rare. The proposed method achieves excellent performance across a broad range of numerical experiments in comparison to popular alternative locally adaptive methods.
- North America > United States > New York (0.04)
- North America > United States > Illinois > Champaign County > Urbana (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- (2 more...)
Universal approximation results for neural networks with non-polynomial activation function over non-compact domains
Neufeld, Ariel, Schmocker, Philipp
More precisely, by assuming that the activation function is non-polynomial, we derive universal approximation results for neural networks within function spaces over non-compact subsets of a Euclidean space, e.g., weighted spaces, L Furthermore, we provide some dimension-independent rates for approximating a function with sufficiently regular and integrable Fourier transform by neural networks with non-polynomial activation function. Inspired by the functionality of human brains, (artificial) neural networks have been discovered in the seminal work of McCulloch and Pitts (see [32]). Fundamentally, a neural network consists of nodes arranged in hierarchical layers, where the connections between adjacent layers transmit the data through the network and the nodes transform this information. In mathematical terms, a neural network can therefore be described as a concatenation of affine and non-affine functions. Nowadays, neural networks are successfully applied in the fields of image classification (see e.g.
- North America > United States > Massachusetts > Suffolk County > Boston (0.27)
- North America > United States > New York > New York County > New York City (0.14)
- North America > United States > Rhode Island > Providence County > Providence (0.04)
- (3 more...)
Proper losses regret at least 1/2-order
A fundamental challenge in machine learning is the choice of a loss as it characterizes our learning task, is minimized in the training phase, and serves as an evaluation criterion for estimators. Proper losses are commonly chosen, ensuring minimizers of the full risk match the true probability vector. Estimators induced from a proper loss are widely used to construct forecasters for downstream tasks such as classification and ranking. In this procedure, how does the forecaster based on the obtained estimator perform well under a given downstream task? This question is substantially relevant to the behavior of the $p$-norm between the estimated and true probability vectors when the estimator is updated. In the proper loss framework, the suboptimality of the estimated probability vector from the true probability vector is measured by a surrogate regret. First, we analyze a surrogate regret and show that the strict properness of a loss is necessary and sufficient to establish a non-vacuous surrogate regret bound. Second, we solve an important open question that the order of convergence in p-norm cannot be faster than the $1/2$-order of surrogate regrets for a broad class of strictly proper losses. This implies that strongly proper losses entail the optimal convergence rate.
- Asia > Middle East > Jordan (0.04)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
- (2 more...)
Offline Inverse RL: New Solution Concepts and Provably Efficient Algorithms
Lazzati, Filippo, Mutti, Mirco, Metelli, Alberto Maria
Inverse reinforcement learning (IRL) aims to recover the reward function of an expert agent from demonstrations of behavior. It is well-known that the IRL problem is fundamentally ill-posed, i.e., many reward functions can explain the demonstrations. For this reason, IRL has been recently reframed in terms of estimating the feasible reward set (Metelli et al., 2021), thus, postponing the selection of a single reward. However, so far, the available formulations and algorithmic solutions have been proposed and analyzed mainly for the online setting, where the learner can interact with the environment and query the expert at will. This is clearly unrealistic in most practical applications, where the availability of an offline dataset is a much more common scenario. In this paper, we introduce a novel notion of feasible reward set capturing the opportunities and limitations of the offline setting and we analyze the complexity of its estimation. This requires the introduction an original learning framework that copes with the intrinsic difficulty of the setting, for which the data coverage is not under control. Then, we propose two computationally and statistically efficient algorithms, IRLO and PIRLO, for addressing the problem. In particular, the latter adopts a specific form of pessimism to enforce the novel desirable property of inclusion monotonicity of the delivered feasible set. With this work, we aim to provide a panorama of the challenges of the offline IRL problem and how they can be fruitfully addressed.
- Europe > Austria > Vienna (0.13)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Italy > Lombardy > Milan (0.04)
- Asia > Middle East > Israel > Haifa District > Haifa (0.04)
Approximation with Random Shallow ReLU Networks with Applications to Model Reference Adaptive Control
Lamperski, Andrew, Lekang, Tyler
Neural networks are regularly employed in adaptive control of nonlinear systems and related methods of reinforcement learning. A common architecture uses a neural network with a single hidden layer (i.e. a shallow network), in which the weights and biases are fixed in advance and only the output layer is trained. While classical results show that there exist neural networks of this type that can approximate arbitrary continuous functions over bounded regions, they are non-constructive, and the networks used in practice have no approximation guarantees. Thus, the approximation properties required for control with neural networks are assumed, rather than proved. In this paper, we aim to fill this gap by showing that for sufficiently smooth functions, ReLU networks with randomly generated weights and biases achieve $L_{\infty}$ error of $O(m^{-1/2})$ with high probability, where $m$ is the number of neurons. It suffices to generate the weights uniformly over a sphere and the biases uniformly over an interval. We show how the result can be used to get approximations of required accuracy in a model reference adaptive control application.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
- North America > United States > Minnesota > Hennepin County > Plymouth (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Risk-Sensitive Diffusion: Learning the Underlying Distribution from Noisy Samples
Li, Yangming, Luyten, Max Ruiz, van der Schaar, Mihaela
While achieving remarkable performances, we show that diffusion models are fragile to the presence of noisy samples, limiting their potential in the vast amount of settings where, unlike image synthesis, we are not blessed with clean data. Motivated by our finding that such fragility originates from the distribution gaps between noisy and clean samples along the diffusion process, we introduce risk-sensitive SDE, a stochastic differential equation that is parameterized by the risk (i.e., data "dirtiness") to adjust the distributions of noisy samples, reducing misguidance while benefiting from their contained information. The optimal expression for risk-sensitive SDE depends on the specific noise distribution, and we derive its parameterizations that minimize the misguidance of noisy samples for both Gaussian and general non-Gaussian perturbations. We conduct extensive experiments on both synthetic and real-world datasets (e.g., medical time series), showing that our model effectively recovers the clean data distribution from noisy samples, significantly outperforming conditional generation baselines.
- South America > Paraguay > Asunción > Asunción (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Global universal approximation of functional input maps on weighted spaces
Cuchiero, Christa, Schmocker, Philipp, Teichmann, Josef
We introduce so-called functional input neural networks defined on a possibly infinite dimensional weighted space with values also in a possibly infinite dimensional output space. To this end, we use an additive family as hidden layer maps and a non-linear activation function applied to each hidden layer. Relying on Stone-Weierstrass theorems on weighted spaces, we can prove a global universal approximation result for generalizations of continuous functions going beyond the usual approximation on compact sets. This then applies in particular to approximation of (non-anticipative) path space functionals via functional input neural networks. As a further application of the weighted Stone-Weierstrass theorem we prove a global universal approximation result for linear functions of the signature. We also introduce the viewpoint of Gaussian process regression in this setting and show that the reproducing kernel Hilbert space of the signature kernels are Cameron-Martin spaces of certain Gaussian processes. This paves the way towards uncertainty quantification for signature kernel regression.
- Europe > Austria > Vienna (0.14)
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > United States > Massachusetts > Suffolk County > Boston (0.14)
- (10 more...)
Online Resource Allocation with Convex-set Machine-Learned Advice
Golrezaei, Negin, Jaillet, Patrick, Zhou, Zijie
Decision-makers often have access to a machine-learned prediction about demand, referred to as advice, which can potentially be utilized in online decision-making processes for resource allocation. However, exploiting such advice poses challenges due to its potential inaccuracy. To address this issue, we propose a framework that enhances online resource allocation decisions with potentially unreliable machine-learned (ML) advice. We assume here that this advice is represented by a general convex uncertainty set for the demand vector. We introduce a parameterized class of Pareto optimal online resource allocation algorithms that strike a balance between consistent and robust ratios. The consistent ratio measures the algorithm's performance (compared to the optimal hindsight solution) when the ML advice is accurate, while the robust ratio captures performance under an adversarial demand process when the advice is inaccurate. Specifically, in a C-Pareto optimal setting, we maximize the robust ratio while ensuring that the consistent ratio is at least C. Our proposed C-Pareto optimal algorithm is an adaptive protection level algorithm, which extends the classical fixed protection level algorithm introduced in Littlewood (2005) and Ball and Queyranne (2009). Solving a complex non-convex continuous optimization problem characterizes the adaptive protection level algorithm. To complement our algorithms, we present a simple method for computing the maximum achievable consistent ratio, which serves as an estimate for the maximum value of the ML advice. Additionally, we present numerical studies to evaluate the performance of our algorithm in comparison to benchmark algorithms. The results demonstrate that by adjusting the parameter C, our algorithms effectively strike a balance between worst-case and average performance, outperforming the benchmark algorithms.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)