Physics-Informed Critic in an Actor-Critic Reinforcement Learning for Swimming in Turbulence
Koh, Christopher, Pagnier, Laurent, Chertkov, Michael
In this manuscript, we consider a particle in a turbulent flow that swims towards its passive partner to maintain proximity. The particle is controlled by Reinforcement Learning (RL) [1], a methodology in Artificial Intelligence (AI) for solving complex decision-making problems. Unlike other AI methods, RL involves an agent learning through interaction with its environment, balancing exploration and exploitation. Exploration involves trying new actions to gain information about the environment (here, the turbulence), while exploitation uses accumulated knowledge to make optimal decisions. This RL decision-making is closely related to the Stochastic Optimal Control (SOC) problem, in which the agent maximizes expected reward under environmental uncertainty. In this study, the reward balances two competing terms: a penalty on the separation between the agent and its passive partner, and a penalty on the swimming effort required. Among RL strategies, Actor-Critic (AC) methods [2] combine a policy-based actor with a value-based critic. The "actor" suggests actions according to the current policy, and the "critic" evaluates these actions, providing feedback that updates the policy and reduces the variance of learning.
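To make the loop concrete, the sketch below shows a generic one-step actor-critic update for a swimmer tracking a passive partner. It is an illustration only, not the authors' implementation: the toy relative-position dynamics, the reward weights `ALPHA` and `BETA`, the network sizes, and the use of a one-step temporal-difference error as the advantage are all assumptions for demonstration, and the paper's physics-informed critic is not reproduced here.

```python
# Minimal actor-critic sketch (illustrative assumptions throughout;
# not the paper's physics-informed critic).
import torch
import torch.nn as nn

DT, ALPHA, BETA = 0.01, 1.0, 0.1  # time step and assumed reward weights

def reward(separation, control):
    # Two competing terms: penalize separation from the passive partner
    # and penalize the swimming effort.
    return -ALPHA * separation.square().sum() - BETA * control.square().sum()

actor = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 2))   # mean action
critic = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 1))  # value V(s)
opt = torch.optim.Adam(list(actor.parameters()) + list(critic.parameters()), lr=1e-3)

state = torch.randn(2)  # relative position of the passive partner
for step in range(1000):
    # Actor: sample an action from the current (Gaussian) policy.
    dist = torch.distributions.Normal(actor(state), 0.1)
    action = dist.sample()
    # Toy dynamics: swimming reduces separation; random kicks stand in
    # for turbulent advection (an assumed surrogate for the real flow).
    next_state = state + DT * (-action) + 0.05 * torch.randn(2)
    r = reward(next_state, action)
    # Critic feedback: one-step TD error serves as the advantage estimate.
    with torch.no_grad():
        td_target = r + 0.99 * critic(next_state)
    advantage = td_target - critic(state)
    actor_loss = -(dist.log_prob(action).sum() * advantage.detach().squeeze())
    critic_loss = advantage.pow(2).mean()
    opt.zero_grad()
    (actor_loss + critic_loss).backward()
    opt.step()
    state = next_state.detach()
```

The detached advantage multiplying the log-probability is what lets the critic's evaluation steer the actor's policy gradient while keeping its variance low, which is the division of labor the AC framework is built around.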
Jun-5-2024