
Collaborating Authors: Viitasaari, Lauri


Convex Regularization and Convergence of Policy Gradient Flows under Safety Constraints

arXiv.org Machine Learning

This paper studies reinforcement learning (RL) in infinite-horizon dynamic decision processes with almost-sure safety constraints. Such safety-constrained decision processes are central to applications in autonomous systems, finance, and resource management, where policies must satisfy strict, state-dependent constraints. We consider a doubly-regularized RL framework that combines reward and parameter regularization to address these constraints within continuous state-action spaces. Specifically, we formulate the problem as a convex regularized objective with parametrized policies in the mean-field regime. Our approach leverages recent developments in mean-field theory and Wasserstein gradient flows to model policies as elements of an infinite-dimensional statistical manifold, with policy updates evolving via gradient flows on the space of parameter distributions. Our main contributions include establishing solvability conditions for safety-constrained problems, defining smooth and bounded approximations that facilitate gradient flows, and demonstrating exponential convergence towards global solutions under sufficient regularization. We provide general conditions on regularization functions, encompassing standard entropy regularization as a special case. The results also enable a particle method implementation for practical RL applications. The theoretical insights and convergence guarantees presented here offer a robust framework for safe RL in complex, high-dimensional decision-making problems.
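The particle method mentioned in the abstract suggests a simple picture: the policy's parameter distribution is carried by a finite cloud of particles, and, in the entropy-regularized special case, the gradient flow on the space of parameter distributions becomes noisy (Langevin-type) gradient updates on each particle. The sketch below illustrates only this generic mechanism on a toy quadratic "expected reward"; the objective, step size, and regularization strength are illustrative assumptions, not the paper's construction.

```python
# Minimal, hypothetical sketch of a mean-field particle approximation:
# the empirical measure of the particle cloud stands in for the parameter
# distribution, and entropy regularization in parameter space turns the
# gradient flow into Langevin-type noisy gradient updates.
import numpy as np

rng = np.random.default_rng(0)

n_particles, dim = 256, 2
step, sigma = 1e-2, 0.1                        # step size and regularization strength (assumed)
theta = rng.normal(size=(n_particles, dim))    # particle cloud ~ parameter distribution

def reward_grad(theta):
    """Gradient of a toy concave 'expected reward' J(theta) = -||theta - target||^2 / 2."""
    target = np.array([1.0, -0.5])
    return -(theta - target)

for _ in range(2000):
    # Noisy gradient ascent: drift from the reward, diffusion from the entropy term.
    noise = rng.normal(size=theta.shape)
    theta += step * reward_grad(theta) + np.sqrt(2.0 * sigma * step) * noise

# The particle cloud approximates the Gibbs-type measure proportional to exp(J(theta) / sigma),
# so its mean concentrates near the maximizer of the toy objective.
print("particle mean:", theta.mean(axis=0))
```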


Concentration inequalities for leave-one-out cross validation

arXiv.org Machine Learning

In this article we prove that estimator stability is enough to show that leave-one-out cross validation is a sound procedure, by providing concentration bounds in a general framework. In particular, we provide concentration bounds beyond Lipschitz continuity assumptions on the loss or on the estimator. We obtain our results by relying on random variables whose distributions satisfy the logarithmic Sobolev inequality, which provides a relatively rich class of distributions. We illustrate our method by considering several interesting examples, including linear regression, kernel density estimation, and stabilized/truncated estimators such as stabilized kernel regression.

Mathematics Subject Classifications (2020): 62R07, 62G05, 60F15

Keywords: Leave-one-out cross validation, concentration inequalities, logarithmic Sobolev inequality, sub-Gaussian random variables

1. Introduction

It is customary in many statistical and machine learning methods to use train-validation, where the data is split into a training set and a validation set. Then the training set is used for estimating (training) the model, while the validation set is used to measure the performance of the fitted model.
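Leave-one-out cross validation is the extreme case of this split in which each single observation serves in turn as the validation set. As a concrete illustration of the procedure being analysed, the following is a minimal sketch of leave-one-out cross validation used to select the bandwidth of a Gaussian kernel density estimator, one of the examples mentioned in the abstract. The kernel, bandwidth grid, and log-likelihood criterion are illustrative assumptions rather than the paper's exact setting.

```python
# Minimal sketch: leave-one-out cross validation for bandwidth selection
# in Gaussian kernel density estimation (toy data, assumed setup).
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=200)                      # toy i.i.d. sample

def loo_log_likelihood(x, h):
    """Average leave-one-out log-likelihood of a Gaussian kernel density estimate."""
    n = len(x)
    diffs = (x[:, None] - x[None, :]) / h     # pairwise scaled differences
    k = np.exp(-0.5 * diffs**2) / (h * np.sqrt(2 * np.pi))
    np.fill_diagonal(k, 0.0)                  # leave the i-th point out of its own estimate
    f_loo = k.sum(axis=1) / (n - 1)           # \hat f_{-i}(x_i)
    return np.mean(np.log(f_loo))

bandwidths = np.linspace(0.05, 1.0, 20)
scores = [loo_log_likelihood(x, h) for h in bandwidths]
best_h = bandwidths[int(np.argmax(scores))]
print("LOO-selected bandwidth:", best_h)
```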