AITopics | Feng, Wanyong

Feng, Wanyong

Reasoning and Sampling-Augmented MCQ Difficulty Prediction via LLMs

Feng, Wanyong, Tran, Peter, Sireci, Stephen, Lan, Andrew

arXiv.org Artificial Intelligence, Mar-11-2025

The difficulty of multiple-choice questions (MCQs) is a crucial factor for educational assessments. Predicting MCQ difficulty is challenging since it requires understanding both the complexity of reaching the correct option and the plausibility of distractors, i.e., incorrect options. In this paper, we propose a novel, two-stage method to predict the difficulty of MCQs. First, to better estimate the complexity of each MCQ, we use large language models (LLMs) to augment the reasoning steps required to reach each option. We use not just the MCQ itself but also these reasoning steps as input to predict the difficulty. Second, to capture the plausibility of distractors, we sample knowledge levels from a distribution to account for variation among students responding to the MCQ. This setup, inspired by item response theory (IRT), enables us to estimate the likelihood of students selecting each option, both correct and incorrect. We align these predictions with their ground truth values, using a Kullback-Leibler (KL) divergence-based regularization objective, and use the estimated likelihoods to predict MCQ difficulty. We evaluate our method on two real-world math MCQ and response datasets with ground truth difficulty values estimated using IRT. Experimental results show that our method outperforms all baselines, with up to a 28.3% reduction in mean squared error and a 34.6% improvement in the coefficient of determination. We also qualitatively discuss how our novel method results in higher accuracy in predicting MCQ difficulty.
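The second stage described above (sampling knowledge levels, estimating option-selection likelihoods, and aligning them with observed values via KL divergence) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the standard normal distribution over knowledge levels, the per-option slope and intercept parameterization, the function names, and the toy numbers.

```python
import numpy as np

def option_probabilities(theta, option_params):
    """Softmax over options for one sampled knowledge level theta.

    option_params has shape (num_options, 2): a slope and an intercept per
    option. This parameterization is hypothetical, chosen only to make the
    selection probabilities depend on theta.
    """
    logits = option_params[:, 0] * theta + option_params[:, 1]
    logits = logits - logits.max()  # subtract the max for numerical stability
    exp = np.exp(logits)
    return exp / exp.sum()

def expected_selection(option_params, num_samples=1000, seed=0):
    """Average option probabilities over knowledge levels drawn from N(0, 1)."""
    rng = np.random.default_rng(seed)
    thetas = rng.standard_normal(num_samples)
    probs = np.stack([option_probabilities(t, option_params) for t in thetas])
    return probs.mean(axis=0)

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same options."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

# Toy MCQ with four options; option 0 plays the role of the correct answer.
params = np.array([[1.5, 0.0], [-0.5, 0.2], [-0.5, -0.3], [-0.5, -1.0]])
predicted = expected_selection(params)
observed = np.array([0.55, 0.25, 0.15, 0.05])  # made-up selection fractions
loss = kl_divergence(observed, predicted)
```

In a training loop, a term like `loss` would act as the regularizer that pulls predicted selection likelihoods toward the observed ones; how the paper weights or combines it with the difficulty objective is not stated in the abstract.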

large language model, machine learning, natural language, (17 more...)

arXiv: 2503.08551

Country: North America > United States > Massachusetts > Hampshire County > Amherst (0.15)

Genre: Research Report > New Finding (1.00)

Industry: Education > Assessment & Standards (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

From Text to Visuals: Using LLMs to Generate Math Diagrams with Vector Graphics

Lee, Jaewook, Lee, Jeongah, Feng, Wanyong, Lan, Andrew

arXiv.org Artificial Intelligence, Mar-10-2025

Advances in large language models (LLMs) offer new possibilities for enhancing math education by automating support for both teachers and students. While prior work has focused on generating math problems and high-quality distractors, the role of visualization in math learning remains under-explored. Diagrams are essential for mathematical thinking and problem-solving, yet manually creating them is time-consuming and requires domain-specific expertise, limiting scalability. Recent research on using LLMs to generate Scalable Vector Graphics (SVG) presents a promising approach to automating diagram creation. Unlike pixel-based images, SVGs represent geometric figures using XML, allowing seamless scaling and adaptability. Educational platforms such as Khan Academy and IXL already use SVGs to display math problems and hints. In this paper, we explore the use of LLMs to generate math-related diagrams that accompany textual hints via intermediate SVG representations. We address three research questions: (1) how to automatically generate math diagrams in problem-solving hints and evaluate their quality, (2) whether SVG is an effective intermediate representation for math diagrams, and (3) what prompting strategies and formats are required for LLMs to generate accurate SVG-based diagrams. Our contributions include defining the task of automatically generating SVG-based diagrams for math hints, developing an LLM prompting-based pipeline, and identifying key strategies for improving diagram generation. Additionally, we introduce a Visual Question Answering-based evaluation setup and conduct ablation studies to assess different pipeline variations. By automating math diagram creation, we aim to provide students and teachers with accurate, conceptually relevant visual aids that enhance problem-solving and learning experiences.
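To make the point about SVG being an XML representation of geometric figures concrete, here is a small sketch that builds a labeled right triangle as SVG text. It is not the paper's pipeline (no LLM is involved); the function name, dimensions, and styling are hypothetical and serve only to show what such an intermediate representation looks like.

```python
import xml.etree.ElementTree as ET

def right_triangle_svg(base=120, height=80, margin=20):
    """Return SVG markup for a right triangle with its base length labeled."""
    width = base + 2 * margin
    total_height = height + 2 * margin
    svg = ET.Element("svg", {
        "xmlns": "http://www.w3.org/2000/svg",
        "viewBox": f"0 0 {width} {total_height}",
    })
    # SVG's y axis points down, so the right-angle vertex sits at bottom left.
    x0, y0 = margin, margin + height
    points = f"{x0},{y0} {x0 + base},{y0} {x0},{y0 - height}"
    ET.SubElement(svg, "polygon", {
        "points": points, "fill": "none", "stroke": "black",
    })
    label = ET.SubElement(svg, "text", {
        "x": str(x0 + base / 2), "y": str(y0 + 15), "text-anchor": "middle",
    })
    label.text = str(base)
    return ET.tostring(svg, encoding="unicode")

svg_text = right_triangle_svg()
```

Because the figure is plain text, a language model can in principle emit it token by token, and the `viewBox` attribute lets the same markup scale to any display size.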

artificial intelligence, large language model, natural language, (15 more...)

arXiv: 2503.07429

Country: North America > United States > Massachusetts (0.15)

Genre: Research Report > New Finding (1.00)

Industry: Education (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Improving Automated Distractor Generation for Math Multiple-choice Questions with Overgenerate-and-rank

Scarlatos, Alexander, Feng, Wanyong, Smith, Digory, Woodhead, Simon, Lan, Andrew

arXiv.org Artificial Intelligence, May-13-2024

Multiple-choice questions (MCQs) are commonly used across all levels of math education since they can be deployed and graded at a large scale. A critical component of MCQs is the distractors, i.e., incorrect answers crafted to reflect student errors or misconceptions. Automatically generating them in math MCQs, e.g., with large language models, has been challenging. In this work, we propose a novel method to enhance the quality of generated distractors through overgenerate-and-rank, training a ranking model to predict how likely distractors are to be selected by real students. Experimental results on a real-world dataset and human evaluation with math teachers show that our ranking model increases alignment with human-authored distractors, although human-authored ones are still preferred over generated ones.
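The overgenerate-and-rank idea can be sketched in a few lines: produce more candidate distractors than needed, score each with a ranking model, and keep the top few. The generator and scorer below are toy stand-ins (hypothetical names and made-up scores); in the paper, the candidates come from a language model and the scores from a trained ranking model that predicts how likely students are to select each distractor.

```python
from typing import Callable, List

def overgenerate_and_rank(
    question: str,
    generate: Callable[[str, int], List[str]],
    score: Callable[[str, str], float],
    num_candidates: int = 10,
    top_k: int = 3,
) -> List[str]:
    """Generate many candidates, then keep the top_k highest-scoring ones."""
    candidates = generate(question, num_candidates)
    unique = list(dict.fromkeys(candidates))  # drop duplicates, keep order
    ranked = sorted(unique, key=lambda d: score(question, d), reverse=True)
    return ranked[:top_k]

def toy_generate(question: str, n: int) -> List[str]:
    """Stand-in for an LLM call: returns a fixed pool of candidate answers."""
    pool = ["7", "34", "1", "7", "43", "81", "64", "0"]
    return pool[:n]

def toy_score(question: str, distractor: str) -> float:
    """Stand-in for a ranking model: made-up selection likelihoods."""
    table = {"7": 0.5, "34": 0.2, "81": 0.15, "1": 0.1}
    return table.get(distractor, 0.0)

selected = overgenerate_and_rank("What is 3 x 4?", toy_generate, toy_score)
```

Separating generation from ranking means the ranking model can be trained on real student response data without retraining the generator.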

large language model, machine learning, natural language, (20 more...)

arXiv: 2405.05144

Country: North America > United States > Massachusetts (0.14)

Genre: Research Report (1.00)

Industry: Education > Curriculum > Subject-Specific Education (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Exploring Automated Distractor Generation for Math Multiple-choice Questions via Large Language Models

Feng, Wanyong, Lee, Jaewook, McNichols, Hunter, Scarlatos, Alexander, Smith, Digory, Woodhead, Simon, Ornelas, Nancy Otero, Lan, Andrew

arXiv.org Artificial Intelligence, Apr-18-2024

Multiple-choice questions (MCQs) are ubiquitous in almost all levels of education since they are easy to administer and grade and are a reliable format for assessment and practice. One of the most important aspects of MCQs is the distractors, i.e., incorrect options that are designed to target common errors or misconceptions among real students. To date, the task of crafting high-quality distractors largely remains a labor- and time-intensive process for teachers and learning content designers, which has limited scalability. In this work, we study the task of automated distractor generation in the domain of math MCQs and explore a wide variety of large language model (LLM)-based approaches, from in-context learning to fine-tuning. We conduct extensive experiments using a real-world math MCQ dataset and find that although LLMs can generate some mathematically valid distractors, they are less adept at anticipating common errors or misconceptions among real students.

large language model, machine learning, natural language, (19 more...)

arXiv: 2404.02124

Country: North America > United States > Massachusetts (0.14)

Genre: Research Report (1.00)

Industry: Education > Curriculum > Subject-Specific Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)

Automated Distractor and Feedback Generation for Math Multiple-choice Questions via In-context Learning

McNichols, Hunter, Feng, Wanyong, Lee, Jaewook, Scarlatos, Alexander, Smith, Digory, Woodhead, Simon, Lan, Andrew

arXiv.org Artificial Intelligence, Jan-11-2024

Multiple-choice questions (MCQs) are ubiquitous in almost all levels of education since they are easy to administer and grade and are a reliable form of assessment. An important aspect of MCQs is the distractors, i.e., incorrect options that are designed to target specific misconceptions or insufficient knowledge among students. To date, the task of crafting high-quality distractors has largely remained a labor-intensive process for teachers and learning content designers, which has limited scalability. In this work, we explore the task of automated distractor and corresponding feedback message generation in math MCQs using large language models. We establish a formulation of these two tasks and propose a simple, in-context learning-based solution. Moreover, we propose generative AI-based metrics for evaluating the quality of the feedback messages. We conduct extensive experiments on these tasks using a real-world MCQ dataset. Our findings suggest that there is substantial room for improvement in automated distractor and feedback generation; based on these findings, we outline several directions for future work.

large language model, machine learning, natural language, (21 more...)

arXiv: 2308.03234

Country:

Europe (1.00)
North America > United States > Massachusetts (0.14)
North America > United States > Michigan (0.14)
North America > United States > Louisiana (0.14)

Genre: Research Report > New Finding (0.86)

Industry:

Education > Educational Technology > Educational Software (0.68)
Education > Curriculum > Subject-Specific Education (0.68)
Education > Educational Setting (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

A Conceptual Model for End-to-End Causal Discovery in Knowledge Tracing

Kumar, Nischal Ashok, Feng, Wanyong, Lee, Jaewook, McNichols, Hunter, Ghosh, Aritra, Lan, Andrew

arXiv.org Artificial Intelligence, Jul-18-2023

In this paper, we take a preliminary step towards solving the problem of causal discovery in knowledge tracing, i.e., finding the underlying causal relationship among different skills from real-world student response data. This problem is important because solving it can help us understand the causal relationships between skills without extensive A/B testing, which in turn can help educators design better curricula based on skill prerequisite information. Specifically, we propose a conceptual solution, a novel causal gated recurrent unit (GRU) module in a modified deep knowledge tracing model, which uses i) a learnable permutation matrix for causal ordering among skills and ii) an optionally learnable lower-triangular matrix for causal structure among skills. We also detail how to learn the model parameters in an end-to-end, differentiable way. Our solution placed among the top entries in Task 3 of the NeurIPS 2022 Challenge on Causal Insights for Learning Paths in Education. We detail preliminary experiments as evaluated on the challenge's public leaderboard since the ground truth causal structure has not been publicly released, making detailed local evaluation impossible.
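The combination of a permutation matrix (causal ordering) and a lower-triangular matrix (causal structure) can be illustrated with a small sketch. This is an assumption-laden simplification: it uses a fixed hard permutation and a fixed weight matrix rather than learned ones, and the function name and the choice to compose the two as P^T L P are illustrative, not taken from the paper. The point it shows is why the pairing guarantees an acyclic skill graph.

```python
import numpy as np

def dag_adjacency(order, weights):
    """Build a skill-to-skill adjacency matrix that is acyclic by construction.

    order[i] is the skill placed at position i of the causal ordering.
    Entry A[a, b] can be nonzero only if skill b precedes skill a in the
    ordering, i.e. b is a possible prerequisite of a.
    """
    n = len(order)
    P = np.eye(n)[order]          # permutation matrix: row i selects order[i]
    L = np.tril(weights, k=-1)    # keep only entries strictly below the diagonal
    return P.T @ L @ P

order = [2, 0, 1]                 # skill 2 first, then skill 0, then skill 1
A = dag_adjacency(order, np.ones((3, 3)))

# A strictly lower-triangular matrix is nilpotent, and conjugating by a
# permutation preserves that, so the n-th power of A is zero (no cycles).
acyclic = np.allclose(np.linalg.matrix_power(A, len(order)), 0.0)
```

Here skill 2 has no prerequisites (its row of `A` is all zeros), skill 0 can depend on skill 2, and skill 1 can depend on both.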

artificial intelligence, data mining, machine learning, (13 more...)

arXiv: 2305.16165

Genre: Research Report (1.00)

Industry:

Education > Educational Setting (0.68)
Education > Educational Technology > Educational Software (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Data Science > Data Mining (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Balancing Test Accuracy and Security in Computerized Adaptive Testing

Feng, Wanyong, Ghosh, Aritra, Sireci, Stephen, Lan, Andrew S.

arXiv.org Artificial Intelligence, May-18-2023

Computerized adaptive testing (CAT) is a form of personalized testing that accurately measures students' knowledge levels while reducing test length. Bilevel optimization-based CAT (BOBCAT) is a recent framework that learns a data-driven question selection algorithm to effectively reduce test length and improve test accuracy. However, it suffers from high question exposure and test overlap rates, which potentially affect test security. This paper introduces a constrained version of BOBCAT (C-BOBCAT) that addresses these problems by changing its optimization setup, which enables us to trade off test accuracy for question exposure and test overlap rates. We show that C-BOBCAT is effective through extensive experiments on two real-world adult testing datasets.
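The two security measures named above can be computed from a log of administered tests. The sketch below uses one common set of definitions, which is an assumption since the abstract does not define them: a question's exposure rate is the fraction of tests that include it, and the test overlap rate is the average fraction of shared questions over all pairs of tests, assuming equal-length tests. Function names and the toy data are hypothetical.

```python
from itertools import combinations

def exposure_rates(tests, num_questions):
    """Fraction of administered tests that include each question."""
    counts = [0] * num_questions
    for test in tests:
        for q in set(test):
            counts[q] += 1
    return [c / len(tests) for c in counts]

def overlap_rate(tests):
    """Average share of common questions across all pairs of tests.

    Assumes every test has the same length, so the shared count is divided
    by the length of the first test in each pair.
    """
    pairs = list(combinations(tests, 2))
    if not pairs:
        return 0.0
    total = sum(len(set(a) & set(b)) / len(a) for a, b in pairs)
    return total / len(pairs)

# Three students, each given three questions from a six-question pool.
tests = [[0, 1, 2], [0, 1, 3], [0, 4, 5]]
exposure = exposure_rates(tests, num_questions=6)
overlap = overlap_rate(tests)
```

In this toy log, question 0 appears in every test, which is the kind of overexposure a constrained selection algorithm aims to limit.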

artificial intelligence, machine learning, question selection algorithm, (9 more...)

arXiv: 2305.18312

Country: North America > United States > Massachusetts (0.15)

Genre: Research Report (0.64)

Industry: Education (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
