
Collaborating Authors

 Kaledin, Maxim


Optimisation of the Accelerator Control by Reinforcement Learning: A Simulation-Based Approach

arXiv.org Artificial Intelligence

Optimizing accelerator control is a critical challenge in experimental particle physics, requiring significant manual effort and resource expenditure. Traditional tuning methods are often time-consuming and reliant on expert input, highlighting the need for more efficient approaches. This study aims to create a simulation-based framework integrated with Reinforcement Learning (RL) to address these challenges. Using Elegant as the simulation backend, we developed a Python wrapper that simplifies the interaction between RL algorithms and accelerator simulations, enabling seamless input management, simulation execution, and output analysis. The proposed RL framework acts as a co-pilot for physicists, offering intelligent suggestions to enhance beamline performance, reduce tuning time, and improve operational efficiency. As a proof of concept, we demonstrate the application of our RL approach to an accelerator control problem and highlight the improvements in efficiency and performance achieved through our methodology. We discuss how the integration of simulation tools with a Python-based RL framework provides a powerful resource for the accelerator physics community, showcasing the potential of machine learning in optimizing complex physical systems.
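The abstract does not spell out the wrapper's interface, so the following is a minimal illustrative sketch only, assuming the simulation is exposed through the Gymnasium API. Every name here (ElegantEnv, the template and output file names, the knob-based action space, the beam-size reward) is a hypothetical placeholder, not the authors' actual framework.

```python
# Hypothetical sketch (not the authors' wrapper): exposing an Elegant run as a
# Gymnasium environment so a standard RL agent can tune a few magnet "knobs".
# File names, knob/observation layout, and the beam-size reward are assumptions.
import subprocess

import numpy as np
import gymnasium as gym
from gymnasium import spaces


class ElegantEnv(gym.Env):
    """Each step writes knob settings into an Elegant input file, runs the
    simulator, and reads back a couple of beam statistics."""

    def __init__(self, template="lattice_template.ele", n_knobs=4):
        self.template = template
        self.n_knobs = n_knobs
        # Actions: normalized corrections to n_knobs magnet strengths (assumed range).
        self.action_space = spaces.Box(-1.0, 1.0, shape=(n_knobs,), dtype=np.float32)
        # Observations: current knob settings plus two beam moments (assumed layout).
        self.observation_space = spaces.Box(
            -np.inf, np.inf, shape=(n_knobs + 2,), dtype=np.float32
        )
        self.knobs = np.zeros(n_knobs, dtype=np.float32)

    def _write_input(self):
        # Placeholder input management: substitute knob values into a template
        # lattice/command file and return the generated file's path.
        path = "run.ele"
        with open(self.template) as src:
            text = src.read()
        for i, value in enumerate(self.knobs):
            text = text.replace(f"<KNOB_{i}>", f"{value:.6f}")
        with open(path, "w") as dst:
            dst.write(text)
        return path

    def _simulate(self):
        # Placeholder simulation execution and output analysis: call the elegant
        # binary, then load two beam-size numbers from a plain-text file
        # (a real wrapper would parse Elegant's SDDS output instead).
        subprocess.run(["elegant", self._write_input()], check=True, capture_output=True)
        return np.loadtxt("beam_sizes.txt", dtype=np.float32)[:2]

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.knobs = np.zeros(self.n_knobs, dtype=np.float32)
        beam = self._simulate()
        return np.concatenate([self.knobs, beam]), {}

    def step(self, action):
        self.knobs = np.clip(self.knobs + 0.1 * np.asarray(action, dtype=np.float32), -1.0, 1.0)
        beam = self._simulate()
        reward = -float(beam.sum())  # smaller beam size -> higher reward (illustrative objective)
        obs = np.concatenate([self.knobs, beam])
        return obs, reward, False, False, {}
```

With a wrapper of this shape, an off-the-shelf agent (e.g. from Stable-Baselines3) could train against ElegantEnv like any other Gymnasium task; the sketch is only meant to show where input management, simulation execution, and output analysis plug into the RL loop.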


Finite Time Analysis of Linear Two-timescale Stochastic Approximation with Markovian Noise

arXiv.org Machine Learning

The linear two-timescale stochastic approximation (SA) scheme is an important class of algorithms that has become popular in reinforcement learning (RL), particularly for the policy evaluation problem. Recently, a number of works have been devoted to establishing a finite-time analysis of the scheme, especially under the Markovian (non-i.i.d.) noise settings that are ubiquitous in practice. In this paper, we provide a finite-time analysis for linear two-timescale SA. Our bounds show that there is no discrepancy in the convergence rate between Markovian and martingale noise; only the constants are affected by the mixing time of the Markov chain. With an appropriate step size schedule, the transient term in the expected error bound is $o(1/k^c)$ and the steady-state term is $O(1/k)$, where $c > 1$ and $k$ is the iteration number. Furthermore, we present an asymptotic expansion of the expected error with a matching lower bound of $\Omega(1/k)$. A simple numerical experiment is presented to support our theory.

Keywords: stochastic approximation, reinforcement learning, GTD learning, Markovian noise

1. Introduction

Since its introduction close to 70 years ago, the stochastic approximation (SA) scheme (Robbins and Monro, 1951) has been a powerful tool for root finding when only noisy samples are available. During the past two decades, considerable progress in the practical and theoretical research of SA has been made; see (Benaïm, 1999; Kushner and Yin, 2003; Borkar, 2008) for an overview. Among others, linear SA schemes are popular in reinforcement learning (RL) as they lead to policy evaluation methods with linear function approximation. Of particular importance is temporal difference (TD) learning (Sutton, 1988), for which finite-time analyses have been reported in (Srikant and Ying, 2019; Lakshminarayanan and Szepesvari, 2018; Bhandari et al., 2018; Dalal et al., 2018a). The TD learning scheme based on classical (linear) SA is known to be inadequate for the off-policy learning paradigms in RL, where data samples are drawn from a behavior policy different from the policy being evaluated (Baird, 1995; Tsitsiklis and Van Roy, 1997). To circumvent this issue, gradient TD (GTD) learning methods have been proposed. These methods fall within the scope of the linear two-timescale SA scheme introduced by Borkar (1997):
\[
\theta_{k+1} = \theta_k + \beta_k \bigl\{ b_1(X_{k+1}) - A_{11}(X_{k+1})\,\theta_k - A_{12}(X_{k+1})\, w_k \bigr\}, \tag{1}
\]
\[
w_{k+1} = w_k + \gamma_k \bigl\{ b_2(X_{k+1}) - A_{21}(X_{k+1})\,\theta_k - A_{22}(X_{k+1})\, w_k \bigr\}.
\]
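To make the recursion concrete, here is a minimal, self-contained sketch (not from the paper) that runs iteration (1) on a small synthetic problem: the noise X_k is a two-state Markov chain, the matrices A_ij(x) and vectors b_i(x) are arbitrary stable choices, and the step sizes follow a standard two-timescale pattern with the slow iterate theta using beta_k = 1/k and the fast iterate w using gamma_k = 1/k^(2/3). All of these numerical choices are assumptions made for illustration.

```python
# Minimal illustrative sketch of the linear two-timescale SA recursion (1)
# driven by Markovian noise; the problem data below is synthetic, not from the paper.
import numpy as np

rng = np.random.default_rng(0)

# Two-state Markov chain for the noise X_k (assumed transition matrix).
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])

# Per-state data A_ij(x), b_i(x); chosen so the averaged matrices are well conditioned.
A11 = [np.array([[2.0, 0.0], [0.0, 1.5]]), np.array([[1.0, 0.2], [0.2, 2.5]])]
A12 = [0.3 * np.eye(2), 0.1 * np.eye(2)]
A21 = [0.2 * np.eye(2), 0.4 * np.eye(2)]
A22 = [np.array([[1.5, 0.0], [0.0, 1.0]]), np.array([[2.0, 0.1], [0.1, 1.5]])]
b1 = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
b2 = [np.array([0.5, 0.5]), np.array([1.0, -0.5])]

theta = np.zeros(2)
w = np.zeros(2)
x = 0  # initial state of the chain

for k in range(1, 100_001):
    # Sample the next state of the Markov chain (Markovian, non-i.i.d. noise).
    x = rng.choice(2, p=P[x])
    # Two-timescale step sizes: theta evolves on the slow timescale, w on the fast one.
    beta = 1.0 / k
    gamma = 1.0 / k ** (2 / 3)
    # The coupled linear updates of equation (1), both evaluated at (theta_k, w_k).
    theta_next = theta + beta * (b1[x] - A11[x] @ theta - A12[x] @ w)
    w = w + gamma * (b2[x] - A21[x] @ theta - A22[x] @ w)
    theta = theta_next

print("theta after 1e5 iterations:", theta)
print("w after 1e5 iterations:", w)
```

With these (stable) synthetic matrices, both iterates drift toward the root of the averaged linear system, which is the regime whose expected error the paper's finite-time bounds quantify under Markovian noise.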