MathBERT: A Pre-trained Language Model for General NLP Tasks in Mathematics Education
Shen, Jia Tracy, Yamashita, Michiharu, Prihar, Ethan, Heffernan, Neil, Wu, Xintao, Lee, Dongwon
arXiv.org Artificial Intelligence
Because BERT lends itself to transfer learning, researchers have achieved better performance than base BERT by further pre-training the original BERT on a large domain-specific corpus. Mathematical texts often contain equations and symbols, so the original BERT model, pre-trained on general English text, does not fit Natural Language Processing (NLP) tasks in mathematics education well. Therefore, we propose MathBERT, a BERT model pre-trained on a large mathematical corpus ranging from pre-K to graduate-level content, to tackle math-specific tasks. In addition, we generate a customized mathematical vocabulary for pre-training MathBERT and compare the resulting model with MathBERT pre-trained with the original BERT vocabulary. We select three important tasks in mathematics education, namely knowledge component, auto-grading, and knowledge tracing prediction, to evaluate the performance of MathBERT. Our experiments show that MathBERT outperforms base BERT by a 2-9% margin. In some cases, MathBERT pre-trained with the mathematical vocabulary performs better than MathBERT pre-trained with the original vocabulary. To the best of our knowledge, MathBERT is the first pre-trained model for general-purpose mathematics education tasks.
Jun-1-2021
- Country:
- North America > United States (0.69)
- Industry:
- Education
- Curriculum > Subject-Specific Education (0.35)
- Educational Setting > Online (0.46)
- Educational Technology > Educational Software (0.47)