On-Policy Optimization of ANFIS Policies Using Proximal Policy Optimization

Shankar, Kaaustaaub, Louw, Wilhelm, Cohen, Kelly

arXiv.org Artificial Intelligence 

We present a reinforcement learning method for training neuro-fuzzy controllers using Proximal Policy Optimization (PPO). Unlike prior approaches that used Deep Q-Networks (DQN) with Adaptive Neuro-Fuzzy Inference Systems (ANFIS), our PPO-based framework leverages a stable on-policy actor-critic setup. Evaluated on the CartPole-v1 environment across multiple seeds, PPO-trained fuzzy agents consistently achieved the maximum return of 500 with zero variance after 20, 000 updates, outperforming ANFIS-DQN baselines in both stability and convergence speed. This highlights PPO's potential for training explainable neuro-fuzzy agents in reinforcement learning tasks.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found