Controllable Mathematical Reasoning via Self-Optimizing Thought Vectors
arXiv.org Artificial Intelligence
We present a novel approach for controllable mathematical reasoning that leverages self-optimizing thought vectors with entropy minimization. Our method introduces learnable thought vectors that dynamically modulate the internal reasoning process of large language models. Using Gemma-2-9B on GSM8K, we achieve 90.1% accuracy with a controllability score of 0.42, demonstrating that entropy-based rewards effectively guide focused reasoning patterns without requiring external reward annotations. Our analysis reveals distinct thought vector clusters and consistent low-entropy distributions across control conditions, validating our framework for controllable AI reasoning.
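The abstract does not spell out how the entropy-based reward is computed. As a rough illustration only, the sketch below assumes the reward is the negative mean Shannon entropy of the model's next-token distributions over a reasoning trace, so that more peaked (lower-entropy) predictions earn a higher reward. The function names `softmax`, `entropy`, and `entropy_reward` are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw logits into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_reward(step_logits):
    """Assumed reward: negative mean per-token entropy over a trace.

    step_logits is a list of logit vectors, one per generated token.
    This is an illustrative stand-in, not the paper's exact formulation.
    """
    entropies = [entropy(softmax(logits)) for logits in step_logits]
    return -sum(entropies) / len(entropies)


# A confident (peaked) prediction scores higher than a uniform one.
peaked = entropy_reward([[10.0, 0.0, 0.0]])
flat = entropy_reward([[0.0, 0.0, 0.0]])
print(peaked > flat)
```

Because such a reward depends only on the model's own output distribution, it needs no external labels, which is consistent with the abstract's claim that no external reward annotations are required.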
Oct-28-2025
- Genre:
  - Research Report > Promising Solution (0.34)
- Technology:
  - Information Technology > Artificial Intelligence
    - Cognitive Science > Problem Solving (0.67)
    - Machine Learning (1.00)
    - Natural Language (1.00)
    - Representation & Reasoning (1.00)