On-Policy Optimization of ANFIS Policies Using Proximal Policy Optimization
Shankar, Kaaustaaub, Louw, Wilhelm, Cohen, Kelly
arXiv.org Artificial Intelligence
We present a reinforcement learning method for training neuro-fuzzy controllers using Proximal Policy Optimization (PPO). Unlike prior approaches that used Deep Q-Networks (DQN) with Adaptive Neuro-Fuzzy Inference Systems (ANFIS), our PPO-based framework leverages a stable on-policy actor-critic setup. Evaluated on the CartPole-v1 environment across multiple seeds, PPO-trained fuzzy agents consistently achieved the maximum return of 500 with zero variance after 20,000 updates, outperforming ANFIS-DQN baselines in both stability and convergence speed. This highlights PPO's potential for training explainable neuro-fuzzy agents in reinforcement learning tasks.
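The abstract does not describe the network or the loss in detail, so the following is only a rough sketch of the two ingredients it names: an ANFIS-style actor and the PPO clipped surrogate objective. It assumes a zero-order Takagi-Sugeno rule base with Gaussian membership functions and a product t-norm, which may differ from the authors' design. All function names, parameter shapes, and the `clip_eps=0.2` default are illustrative choices, not taken from the paper.

```python
import math


def gaussian_mf(x, center, sigma):
    """Gaussian membership degree of x for one fuzzy set."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def anfis_logits(obs, centers, sigmas, consequents):
    """Assumed zero-order Takagi-Sugeno actor (hypothetical layout).

    obs:         list of n_inputs floats
    centers:     [n_rules][n_inputs] membership centers
    sigmas:      [n_rules][n_inputs] membership widths
    consequents: [n_rules][n_actions] constant rule outputs
    Returns one logit per action.
    """
    # Firing strength of each rule: product of its membership degrees.
    strengths = []
    for c_row, s_row in zip(centers, sigmas):
        w = 1.0
        for x, c, s in zip(obs, c_row, s_row):
            w *= gaussian_mf(x, c, s)
        strengths.append(w)
    # Normalise so the rule weights sum to (almost exactly) one.
    total = sum(strengths) + 1e-12
    weights = [w / total for w in strengths]
    n_actions = len(consequents[0])
    # Each logit is the weighted average of the rule consequents.
    return [
        sum(w * rule[a] for w, rule in zip(weights, consequents))
        for a in range(n_actions)
    ]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    z = sum(exps)
    return [e / z for e in exps]


def ppo_clip_objective(new_probs, old_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (to be maximised).

    new_probs / old_probs: probability of the taken action under the
    current and the data-collecting policy; advantages: estimates A_t.
    """
    total = 0.0
    for p_new, p_old, adv in zip(new_probs, old_probs, advantages):
        ratio = p_new / p_old
        clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# Usage: a two-rule actor on a CartPole-sized observation (4 inputs, 2 actions).
obs = [0.0, 0.1, -0.02, 0.0]
centers = [[0.0] * 4, [0.5] * 4]
sigmas = [[1.0] * 4, [1.0] * 4]
consequents = [[1.0, -1.0], [-1.0, 1.0]]
action_probs = softmax(anfis_logits(obs, centers, sigmas, consequents))
```

In a full training loop the membership parameters and consequents would be updated by gradient ascent on this objective, alongside a separate critic that supplies the advantage estimates; an automatic differentiation library would normally handle that step, which is omitted here.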
Jul-8-2025
- Country:
- North America > United States > Ohio > Hamilton County > Cincinnati (0.05)
- Genre:
- Research Report (0.41)
- Technology: