PEPS: Quantum-Inspired Reinforcement Learning for Coherent Reasoning Traces in LLMs
