End-to-end Algorithm Synthesis with Recurrent Networks: Extrapolation without Overthinking

Jan-15-2025, 16:37:14 GMT – Neural Information Processing Systems

Machine learning systems perform well on pattern matching tasks, but their ability to perform algorithmic or logical reasoning is not well understood. One important reasoning capability is algorithmic extrapolation, in which models trained only on small, simple reasoning problems can synthesize complex strategies for large, complex problems at test time. Algorithmic extrapolation can be achieved through recurrent systems, which can be iterated many times to solve difficult reasoning problems. We observe that this approach fails to scale to highly complex problems because behavior degenerates when many iterations are applied, an issue we refer to as "overthinking." We propose a recall architecture that keeps an explicit copy of the problem instance in memory so that it cannot be forgotten.
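The recall idea can be illustrated with a toy sketch. It is not the authors' model: the update rule below is hand-written rather than learned, and the names `recall_step` and `run` are hypothetical. It assumes a prefix-parity task and shows only the structural point that the problem instance is supplied again at every iteration, so running far more iterations than needed leaves the answer unchanged.

```python
from itertools import accumulate
from operator import xor


def recall_step(state, problem):
    """One recurrent iteration (hypothetical, hand-written rule).

    `problem` is passed in again on every call, which is the recall idea:
    the iteration can never lose the original input.
    """
    shifted = [0] + state[:-1]  # each position reads its left neighbour's state
    return [p ^ s for p, s in zip(problem, shifted)]


def run(problem, iterations):
    """Apply the same step `iterations` times, starting from an all-zero state."""
    state = [0] * len(problem)
    for _ in range(iterations):
        state = recall_step(state, problem)
    return state


bits = [1, 0, 1, 1, 0, 1]
target = list(accumulate(bits, xor))  # reference prefix parities

few = run(bits, 2)             # too few iterations: not yet converged
enough = run(bits, len(bits))  # one iteration per position suffices here
many = run(bits, 100)          # many extra iterations: the state stays put

print(few == target, enough == target, many == target)
```

In this toy setting the correct answer is a fixed point of the update, so extra iterations are harmless. Whether a trained network behaves the same way is an empirical question that this sketch does not address.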

end-to-end algorithm synthesis, extrapolation, recurrent network, (4 more...)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning (1.00)
  - Cognitive Science > Problem Solving (0.64)