Assessing the Creativity of LLMs in Proposing Novel Solutions to Mathematical Problems
Ye, Junyi, Gu, Jingyi, Zhao, Xinyun, Yin, Wenpeng, Wang, Guiling
arXiv.org Artificial Intelligence
The mathematical capabilities of AI systems are complex and multifaceted. Most existing research has focused on the correctness of AI-generated solutions to mathematical problems. In this work, we argue that beyond producing correct answers, AI systems should also be capable of developing novel solutions to mathematical challenges, or of assisting humans in doing so. This study explores the creative potential of Large Language Models (LLMs) in mathematical reasoning, an aspect that has received limited attention in prior research. Our experiments demonstrate that, while LLMs perform well on standard mathematical tasks, their capacity for creative problem-solving varies considerably.

In recent years, artificial intelligence has made significant strides, particularly in the development of LLMs capable of tackling complex problem-solving tasks. Beyond student-oriented math problems, leading mathematicians have begun exploring the use of LLMs to assist in tackling unresolved mathematical challenges (Romera-Paredes et al., 2024; Trinh et al., 2024). Although these models achieve high accuracy on existing mathematical datasets, their potential for creative problem-solving remains largely underexplored. Mathematical creativity goes beyond solving problems correctly; it involves generating novel solutions, applying unconventional techniques, and offering deep insights, all areas traditionally associated with human ingenuity. Yet most studies have focused on correctness and efficiency, paying little attention to the innovative approaches LLMs might employ. Furthermore, creativity in mathematical problem-solving is rarely integrated into existing benchmarks, limiting our understanding of LLMs' full potential.
Oct-23-2024
- Country:
- Europe (0.28)
- Genre:
- Research Report
- New Finding (0.93)
- Promising Solution (1.00)
- Industry:
- Education (0.93)
- Technology: