MATHWELL: Generating Age-Appropriate Educational Math Word Problems
Christ, Bryan R, Kropko, Jonathan, Hartvigsen, Thomas
arXiv.org Artificial Intelligence
Math word problems are critical K-8 educational tools, but writing them is time-consuming and requires domain expertise. We suggest that language models can support K-8 math education by automatically generating problems. To be educational, generated problems must be 1) solvable, 2) accurate, and 3) appropriate. Existing datasets are unlabeled for these criteria, making them ill-suited for training problem generators. To address this gap, we use domain expert annotation to curate a high-quality synthetic training dataset for this task. We show the value of this data by using it to iteratively finetune Llama-2 (70B) to create MATHWELL, a K-8 word problem generator. Domain experts find that MATHWELL's share of problems that have executable solutions and meet all criteria is 40% higher than that of existing open-source models, with 74% of its problems with executable solutions being solvable, accurate, and appropriate. MATHWELL achieves 94.9% of GPT-4 Turbo's performance on this task while outputting problems written at a more appropriate reading level for K-8 students. That MATHWELL performs this well despite being trained by finetuning alone highlights the quality of our synthetic data for training age-appropriate word problem generators. We release our model, data, and annotations.
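The abstract reports two related shares: the fraction of problems with executable solutions, and the fraction of those that are also solvable, accurate, and appropriate. The sketch below shows one way such shares could be computed. It assumes each generated solution is a Python snippet defining a function named `solution`; that convention, the `GeneratedProblem` record, the helper names, and the sample problems are all hypothetical illustrations, not the authors' format, data, or evaluation code.

```python
from dataclasses import dataclass


@dataclass
class GeneratedProblem:
    question: str
    solution_code: str  # assumed format: Python source defining solution()
    solvable: bool      # the three labels stand in for expert annotations
    accurate: bool
    appropriate: bool


def is_executable(code: str) -> bool:
    """Return True if the code defines solution() and calling it does not raise."""
    namespace = {}
    try:
        exec(code, namespace)      # only run trusted or sandboxed code this way
        namespace["solution"]()    # KeyError here if solution() was never defined
        return True
    except Exception:
        return False


def summarize(problems):
    """Compute the executable share and the share meeting all three criteria."""
    executable = [p for p in problems if is_executable(p.solution_code)]
    meets_all = [
        p for p in executable if p.solvable and p.accurate and p.appropriate
    ]
    return {
        "executable_share": len(executable) / len(problems),
        "meets_all_given_executable": (
            len(meets_all) / len(executable) if executable else 0.0
        ),
        "executable_and_meets_all": len(meets_all) / len(problems),
    }


# Made-up sample data for illustration only.
problems = [
    GeneratedProblem(
        "Sam has 3 apples and buys 4 more. How many apples does Sam have?",
        "def solution():\n    return 3 + 4",
        True, True, True,
    ),
    GeneratedProblem(
        "A class has 5 rows of 6 desks. How many desks are there?",
        "def solution():\n    return 5 * 6",
        True, True, True,
    ),
    GeneratedProblem(
        "Mia reads 12 pages a day for 3 days. How many pages does she read?",
        "def solution():\n    return 12 + 3",  # runs, but uses the wrong operation
        True, False, True,
    ),
    GeneratedProblem(
        "Leo shares 8 cookies equally among 0 friends. How many does each get?",
        "def solution():\n    return 8 / 0",  # raises ZeroDivisionError
        False, False, True,
    ),
]

stats = summarize(problems)
print(stats)
```

In this sketch, `meets_all_given_executable` plays the role of the abstract's share of executable problems that are solvable, accurate, and appropriate, while `executable_and_meets_all` is the share over all generated problems. The check does not guard against non-terminating code; a real pipeline would likely add a timeout and a sandbox.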
Apr-16-2024
- Country:
  - Africa > Middle East
    - Egypt (0.04)
  - Asia
    - Japan > Honshū
      - Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
    - Middle East
      - Jordan (0.04)
      - Republic of Türkiye > Batman Province
        - Batman (0.04)
    - Singapore (0.04)
  - Europe
    - Monaco (0.04)
    - Switzerland (0.04)
  - North America > United States
    - Maine > Kennebec County
      - Waterville (0.04)
    - New York (0.04)
    - Texas > Travis County
      - Austin (0.04)
    - Virginia (0.04)
- Genre:
  - Research Report > Experimental Study (0.46)
- Industry:
  - Education
  - Leisure & Entertainment
    - Games > Computer Games (0.93)
    - Sports > Basketball (1.00)
  - Media (0.93)
- Technology: