Autograding Mathematical Induction Proofs with Natural Language Processing
Zhao, Chenyan, Silva, Mariana, Poulsen, Seth
Writing mathematical proofs has been identified as an important [1-3] yet challenging [4] topic in computing education and mathematics education. A large body of research has shown that timely feedback is crucial to student learning [5, 6]. However, students are largely unable to receive timely feedback on written proofs because proofs must be collected and hand-graded by instructors or teaching assistants. Grading student proofs fully automatically with natural language processing (NLP) removes this bottleneck, allowing us to give students instant feedback on their proofs so they can iteratively improve their quality. In this paper, we propose a novel set of training methods and models capable of autograding freeform mathematical proofs, a problem at the intersection of mathematical proof education and Automatic Short Answer Grading (ASAG), using existing NLP models and other machine learning techniques. Our proof autograder enables the development of grading systems that provide instant feedback to students without requiring attention from instructors. It can also be deployed on large-scale educational platforms, broadening access for students. The main contributions of this paper are:
- Introducing the first pipeline of machine learning models capable of autograding mathematical proofs with accuracy similar to that of human graders
- Quantifying the amount of training data needed to achieve satisfactory performance from the grading models
- Publishing an anonymized and labeled mathematical proof dataset that can be used in future model development [7]
- Creating a set of autograded problems using the grading pipeline, and performing a user study that answers the following research questions:
  - Are students able to write better proofs by interacting with the autograder and the feedback it generates?
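The ASAG framing above treats grading as comparing a free-form student answer against references. As a minimal sketch of that idea (not the paper's actual model pipeline, which uses trained NLP models), a bag-of-words cosine-similarity grader with a hypothetical `grade_proof` helper and an illustrative threshold might look like:

```python
import math
from collections import Counter

def tokenize(text):
    # Lowercase whitespace tokenizer; a stand-in for the subword
    # tokenizers used by real NLP grading models.
    return text.lower().split()

def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def grade_proof(student_proof, reference_proofs, threshold=0.5):
    # Score the student proof against each reference proof and
    # accept it if the best match clears the threshold.
    student = Counter(tokenize(student_proof))
    best = max(cosine(student, Counter(tokenize(r))) for r in reference_proofs)
    return ("correct" if best >= threshold else "needs revision", best)

references = [
    "base case n equals 1 holds assume true for n equals k show n equals k plus 1",
]
label, score = grade_proof(
    "base case n equals 1 holds assume true for n equals k show n equals k plus 1",
    references,
)
```

A similarity baseline like this instantly flags proofs far from any reference, which is the kind of feedback loop the paper's user study investigates with much stronger models.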
A Symbolic Framework for Systematic Evaluation of Mathematical Reasoning with Transformers
Meadows, Jordan, Valentino, Marco, Teney, Damien, Freitas, Andre
Whether Transformers can learn to apply symbolic rules and generalise to out-of-distribution examples is an open research question. In this paper, we devise a data generation method for producing intricate mathematical derivations, and systematically perturb them with respect to syntax, structure, and semantics. Our task-agnostic approach generates equations, annotations, and inter-equation dependencies, employing symbolic algebra for scalable data production and augmentation. We then instantiate a general experimental framework on next-equation prediction, assessing systematic mathematical reasoning and generalisation of Transformer encoders on a total of 200K examples. The experiments reveal that perturbations heavily affect performance and can reduce F1 scores from $97\%$ to below $17\%$, suggesting that inference is dominated by surface-level patterns unrelated to a deeper understanding of mathematical operators. These findings underscore the importance of rigorous, large-scale evaluation frameworks for revealing fundamental limitations of existing models.
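To make the perturbation idea concrete, here is a toy sketch assuming derivations are stored as plain equation strings (the paper's framework uses a computer algebra system for scalable generation, not string edits): a consistent variable renaming is a syntactic perturbation that preserves validity, while corrupting the final step breaks the derivation and should change the next-equation prediction target.

```python
# Toy two-step derivation: the second equation follows from the first.
derivation = [
    "y = x**2",
    "dy/dx = 2*x",
]

def rename_variable(eqs, old, new):
    # Syntactic perturbation: consistently rename a variable
    # everywhere, which preserves the derivation's validity.
    return [eq.replace(old, new) for eq in eqs]

def corrupt_last_step(eqs):
    # Semantic perturbation: alter the final equation so the
    # last step no longer follows from the previous ones.
    return eqs[:-1] + [eqs[-1].replace("2*", "3*")]

renamed = rename_variable(derivation, "x", "u")
corrupted = corrupt_last_step(derivation)
```

A model that relies on surface patterns rather than the underlying operators will score well on the original derivations but degrade sharply on perturbed variants, which is exactly the gap the evaluation framework measures.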
End-to-End Evaluation of a Spoken Dialogue System for Learning Basic Mathematics
Okur, Eda, Sahay, Saurav, Alba, Roddy Fuentes, Nachman, Lama
The advances in language-based Artificial Intelligence (AI) technologies applied to build educational applications can present AI-for-social-good opportunities with a broader positive impact. Across many disciplines, enhancing the quality of mathematics education is crucial in building critical thinking and problem-solving skills at younger ages. Conversational AI systems have started maturing to a point where they could play a significant role in helping students learn fundamental math concepts. This work presents a task-oriented Spoken Dialogue System (SDS) built to support play-based learning of basic math concepts for early childhood education. The system has been evaluated via real-world deployments at school while students practice early math concepts with multimodal interactions. We discuss our efforts to improve the SDS pipeline built for math learning, for which we explore utilizing MathBERT representations for potential enhancement to the Natural Language Understanding (NLU) module. We perform an end-to-end evaluation using real-world deployment outputs from the Automatic Speech Recognition (ASR), Intent Recognition, and Dialogue Manager (DM) components to understand how error propagation affects the overall performance in real-world scenarios.
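Error propagation in a cascaded pipeline can be sketched with a back-of-the-envelope calculation. Assuming component errors are independent (a strong simplification; the accuracy figures below are illustrative, not from the paper), a turn succeeds only if every stage succeeds, so the end-to-end accuracy is bounded by the product of per-component accuracies:

```python
# Illustrative per-component accuracies for a cascaded SDS pipeline.
component_accuracy = {
    "ASR": 0.90,
    "Intent Recognition": 0.92,
    "Dialogue Manager": 0.95,
}

def end_to_end_accuracy(accuracies):
    # Under the independence assumption, multiply the stage
    # accuracies: a single weak stage drags down the whole pipeline.
    result = 1.0
    for acc in accuracies.values():
        result *= acc
    return result

e2e = end_to_end_accuracy(component_accuracy)
```

Even with every component above 90%, the compounded figure falls noticeably below any single stage, which is why the paper evaluates with real deployment outputs rather than per-module scores alone.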
MathBERT: A Pre-trained Language Model for General NLP Tasks in Mathematics Education
Shen, Jia Tracy, Yamashita, Michiharu, Prihar, Ethan, Heffernan, Neil, Wu, Xintao, Lee, Dongwon
Due to the transfer learning nature of the BERT model, researchers have achieved better performance than base BERT by further pre-training the original BERT on a huge domain-specific corpus. Due to the special nature of mathematical texts, which often contain math equations and symbols, the original BERT model pre-trained on general English text will not fit Natural Language Processing (NLP) tasks in mathematical education well. Therefore, we propose MathBERT, a BERT model pre-trained on a large mathematical corpus spanning pre-k to graduate-level mathematical content, to tackle math-specific tasks. In addition, we generate a customized mathematical vocabulary to pre-train MathBERT with, and compare its performance to MathBERT pre-trained with the original BERT vocabulary. We select three important tasks in mathematical education, namely knowledge component prediction, auto-grading, and knowledge tracing, to evaluate the performance of MathBERT. Our experiments show that MathBERT outperforms the base BERT by a 2-9\% margin. In some cases, MathBERT pre-trained with the mathematical vocabulary is better than MathBERT trained with the original vocabulary. To the best of our knowledge, MathBERT is the first pre-trained model for general-purpose mathematics education tasks.
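The customized-vocabulary idea can be illustrated with a frequency-driven sketch. Real subword vocabularies (e.g. WordPiece) are learned differently, and the regex tokenizer, tiny corpus, and `math_vocabulary` helper below are assumptions for illustration only, but the core intuition is the same: surface math tokens that a general-English vocabulary fragments or ignores.

```python
import re
from collections import Counter

# Tiny illustrative corpus of mathematical text.
corpus = [
    "solve x^2 + 3x - 4 = 0 by factoring",
    "the derivative of x^2 is 2x",
    "x^2 appears in the quadratic formula",
]

# Stand-in for a general-English base vocabulary.
BASE_VOCAB = {"the", "of", "is", "by", "in"}

def math_vocabulary(texts, size=5):
    # Tokenize on words, digits, and common math symbols, then keep
    # the most frequent tokens not already in the base vocabulary.
    tokens = []
    for text in texts:
        tokens.extend(re.findall(r"[a-z]+|\d+|[\^+\-=/*]", text.lower()))
    counts = Counter(t for t in tokens if t not in BASE_VOCAB)
    return [t for t, _ in counts.most_common(size)]

vocab = math_vocabulary(corpus)
```

Variables, digits, and operator symbols such as `^` rise to the top of the domain vocabulary, which is precisely the content a general English tokenizer handles poorly.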
MathBERT: A Pre-Trained Model for Mathematical Formula Understanding
Peng, Shuai, Yuan, Ke, Gao, Liangcai, Tang, Zhi
Large-scale pre-trained models like BERT have achieved great success in various Natural Language Processing (NLP) tasks, but adapting them to math-related tasks remains a challenge. Current pre-trained models neglect the structural features of formulas and the semantic correspondence between a formula and its context. To address these issues, we propose a novel pre-trained model, namely \textbf{MathBERT}, which is jointly trained with mathematical formulas and their corresponding contexts. In addition, in order to further capture the semantic-level structural features of formulas, a new pre-training task is designed to predict the masked formula substructures extracted from the Operator Tree (OPT), which is the semantic structural representation of formulas. We conduct experiments on three downstream tasks to evaluate the performance of MathBERT: mathematical information retrieval, formula topic classification, and formula headline generation. Experimental results demonstrate that MathBERT significantly outperforms existing methods on all three tasks. Moreover, we qualitatively show that this pre-trained model effectively captures the semantic-level structural information of formulas. To the best of our knowledge, MathBERT is the first pre-trained model for mathematical formula understanding.
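The masked-substructure objective operates on operator trees. As an illustrative stand-in (real OPTs are built from formula markup such as MathML, not from Python syntax), the sketch below parses a small arithmetic formula with Python's `ast` module into a nested `(operator, args)` tree and masks one subtree, mimicking the substructure-prediction target; `to_opt` and `mask_subtree` are hypothetical helpers, not the paper's code.

```python
import ast

OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Pow: "^"}

def to_opt(node):
    # Convert an arithmetic expression into a nested (op, left, right)
    # tuple, with variables and numbers as string leaves.
    if isinstance(node, str):
        node = ast.parse(node, mode="eval").body
    if isinstance(node, ast.BinOp):
        return (OPS[type(node.op)], to_opt(node.left), to_opt(node.right))
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Constant):
        return str(node.value)
    raise ValueError("unsupported expression node")

def mask_subtree(tree, path):
    # Replace the subtree reached by following `path` (1-based child
    # indices) with a [MASK] token, as in masked-substructure prediction.
    if not path:
        return "[MASK]"
    op, *args = tree
    i = path[0] - 1
    args[i] = mask_subtree(args[i], path[1:])
    return (op, *args)

opt = to_opt("a + b * c")
masked = mask_subtree(opt, [2])  # mask the multiplication subtree
```

Predicting the masked subtree from the remaining tree and the surrounding text forces the model to learn operator-level structure rather than token order alone.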