In between myth and reality: AI for math -- a case study in category theory

Apr-21-2025–arXiv.org Artificial Intelligence

Unsurprisingly, mathematics is one of the main benchmarks for current AI. There are even significant efforts to build AI systems dedicated to mathematics. One of them is o3-mini [6], developed by OpenAI. It is claimed to have solved 80% of the problems at the American Invitational Mathematics Examination (AIME) 2024 [5], a prestigious competition leading to the USA Mathematical Olympiad. Another is Grok-3 [11], developed by xAI (an Elon Musk company), which is also claimed to be very good at math and physics. These claims are in stark contrast with the statistics put forward in [3], where, in the case of mathematical research, the AI can solve only 2% of the problems. Our motivation for this AI experiment was to try to understand, from the perspective of a non-specialist in machine learning, what this kind of AI can do for working mathematicians, how we can use it to support our work, and what is behind the vast gap (claimed in [3]) between current AI capabilities and the prowess of the mathematical research community. The target readership of this paper consists mainly of mathematicians with a moderate degree of fluency in elementary category-theoretic thinking.
