 Instructional Material


Predicting Learning Performance with Large Language Models: A Study in Adult Literacy

arXiv.org Artificial Intelligence

Intelligent Tutoring Systems (ITSs) have significantly enhanced adult literacy training, a key factor for societal participation, employment opportunities, and lifelong learning. Our study investigates the application of advanced AI models, including Large Language Models (LLMs) like GPT-4, for predicting learning performance in adult literacy programs in ITSs. This research is motivated by the potential of LLMs to predict learning performance based on their inherent reasoning and computational capabilities. Using reading comprehension datasets from the ITS AutoTutor, we compare the predictive capabilities of GPT-4 with those of traditional machine learning methods under five-fold cross-validation. Our findings show that GPT-4 is competitive with traditional machine learning methods such as Bayesian Knowledge Tracing, Performance Factor Analysis, Sparse Factor Analysis Lite (SPARFA-Lite), tensor factorization and eXtreme Gradient Boosting (XGBoost). While XGBoost (trained on a local machine) outperforms GPT-4 in predictive accuracy, GPT-4-selected XGBoost and its subsequent tuning on the GPT-4 platform demonstrate superior performance compared to local machine execution. Moreover, our investigation into hyper-parameter tuning by GPT-4 versus grid search, using XGBoost as the case study, suggests comparable performance, albeit with less stability in the automated approach. Our study contributes to the field by highlighting the potential of integrating LLMs with traditional machine learning models to enhance predictive accuracy and personalize adult literacy education, setting a foundation for future research in applying LLMs within ITSs.
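As a rough illustration of the five-fold cross-validation protocol mentioned in this abstract, the sketch below splits a small set of responses into five folds and scores a trivial baseline on each held-out fold. The response data are made up, the majority-class baseline is a hypothetical stand-in for the actual models, and splitting at the level of individual responses is an assumption (the abstract does not say how the folds were formed).

```python
import random

def k_fold_indices(n_items, k=5, seed=0):
    """Yield (train, test) index lists for k disjoint folds over range(n_items)."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k interleaved, disjoint folds
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

def cross_validated_accuracy(labels, k=5, seed=0):
    """Mean held-out accuracy of a majority-class baseline (hypothetical model)."""
    accuracies = []
    for train, test in k_fold_indices(len(labels), k, seed):
        # "Fit" on the training folds only: predict the more frequent outcome.
        train_rate = sum(labels[i] for i in train) / len(train)
        prediction = 1 if train_rate >= 0.5 else 0
        correct = sum(1 for i in test if labels[i] == prediction)
        accuracies.append(correct / len(test))
    return sum(accuracies) / k

# Synthetic, made-up responses (1 = correct answer, 0 = incorrect answer).
responses = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1]
mean_accuracy = cross_validated_accuracy(responses)
```

Any of the compared models could, in principle, take the place of the baseline inside the loop, as long as it is fitted only on the training indices of each fold.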


A Tutorial on the Pretrain-Finetune Paradigm for Natural Language Processing

arXiv.org Artificial Intelligence

The pretrain-finetune paradigm represents a transformative approach in natural language processing (NLP). This paradigm distinguishes itself through the use of large pretrained language models, demonstrating remarkable efficiency in finetuning tasks, even with limited training data. This efficiency is especially beneficial for research in social sciences, where the number of annotated samples is often quite limited. Our tutorial offers a comprehensive introduction to the pretrain-finetune paradigm. We first delve into the fundamental concepts of pretraining and finetuning, followed by practical exercises using real-world applications. We demonstrate the application of the paradigm across various tasks, including multi-class classification and regression. Emphasizing its efficacy and user-friendliness, the tutorial aims to encourage broader adoption of this paradigm. To this end, we have provided open access to all our code and datasets. The tutorial is particularly valuable for quantitative researchers in psychology, offering them an insightful guide into this innovative approach.


Applied Causal Inference Powered by ML and AI

arXiv.org Machine Learning

This book aims to provide a working introduction to the emerging fusion of modern statistical inference - aka machine learning (ML) or artificial intelligence (AI) - and causal inference methods. The book is aimed at upper level undergraduates and master's-level students as well as doctoral students focusing on applied empirical research. A sufficient background for the core material is one semester of introductory econometrics and one semester of machine learning. We hope the book is also useful to empirical researchers looking to apply modern methods in their work. The book provides an overview of key ideas in both predictive inference and causal inference and shows how predictive tools are key ingredients to answering many causal questions.


Transformers for Supervised Online Continual Learning

arXiv.org Artificial Intelligence

Transformers have become the dominant architecture for sequence modeling tasks such as natural language processing or audio processing, and they are now even considered for tasks that are not naturally sequential such as image classification. Their ability to attend to and to process a set of tokens as context enables them to develop in-context few-shot learning abilities. However, their potential for online continual learning remains relatively unexplored. In online continual learning, a model must adapt to a non-stationary stream of data, minimizing the cumulative next-step prediction loss. We focus on the supervised online continual learning setting, where we learn a predictor $x_t \rightarrow y_t$ for a sequence of examples $(x_t, y_t)$. Inspired by the in-context learning capabilities of transformers and their connection to meta-learning, we propose a method that leverages these strengths for online continual learning. Our approach explicitly conditions a transformer on recent observations while simultaneously training it online with stochastic gradient descent, following the procedure introduced with Transformer-XL. We incorporate replay to maintain the benefits of multi-epoch training while adhering to the sequential protocol. We hypothesize that this combination enables fast adaptation through in-context learning and sustained long-term improvement via parametric learning. Our method demonstrates significant improvements over previous state-of-the-art results on CLOC, a challenging large-scale real-world benchmark for image geo-localization.
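The sequential protocol described in this abstract (predict $y_t$ from $x_t$, accumulate the next-step loss, then update, with replay of earlier examples) can be sketched as follows. This is only a minimal illustration under strong assumptions: a scalar linear model stands in for the transformer, the in-context conditioning on recent observations is not modeled, and the function name, buffer size, and learning rate are all hypothetical.

```python
import random

def online_learning_with_replay(stream, lr=0.05, replay_size=32,
                                replay_steps=2, seed=0):
    """Predict-then-update loop with a small replay buffer (toy linear model)."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0          # parameters of the stand-in predictor y = w*x + b
    buffer = []              # recently seen (x, y) pairs available for replay
    cumulative_loss = 0.0

    def sgd_step(x, y):
        nonlocal w, b
        err = (w * x + b) - y   # gradient of 0.5 * err**2 w.r.t. the prediction
        w -= lr * err * x
        b -= lr * err

    for x, y in stream:
        # 1) Predict BEFORE the label is used, and pay the next-step loss.
        pred = w * x + b
        cumulative_loss += 0.5 * (pred - y) ** 2
        # 2) Only now update on the revealed example.
        sgd_step(x, y)
        # 3) Replay past examples: extra passes without peeking at the future.
        for _ in range(min(replay_steps, len(buffer))):
            rx, ry = rng.choice(buffer)
            sgd_step(rx, ry)
        buffer.append((x, y))
        if len(buffer) > replay_size:
            buffer.pop(0)
    return cumulative_loss, (w, b)

# Synthetic stream from a fixed (hypothetical) relation y = 2x + 1.
data_rng = random.Random(1)
stream = [(x, 2 * x + 1) for x in (data_rng.uniform(-1, 1) for _ in range(500))]
total_loss, (w_final, b_final) = online_learning_with_replay(stream)
```

The key constraint the sketch tries to respect is ordering: each example contributes to the cumulative loss before it is ever used for training, and replay only draws from examples already seen.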


Exploring the Design of Generative AI in Supporting Music-based Reminiscence for Older Adults

arXiv.org Artificial Intelligence

Music-based reminiscence has the potential to positively impact the psychological well-being of older adults. However, the aging process and physiological changes, such as memory decline and limited verbal communication, may impede the ability of older adults to recall their memories and life experiences. Given the advanced capabilities of generative artificial intelligence (AI) systems, such as generated conversations and images, and their potential to facilitate the reminiscing process, this study aims to explore the design of generative AI to support music-based reminiscence in older adults. This study follows a user-centered design approach incorporating various stages, including detailed interviews with two social workers and two design workshops (involving ten older adults). Our work contributes to an in-depth understanding of older adults' attitudes toward utilizing generative AI for supporting music-based reminiscence and identifies concrete design considerations for the future design of generative AI to enhance the reminiscence experience of older adults.


SCHEMA: State CHangEs MAtter for Procedure Planning in Instructional Videos

arXiv.org Artificial Intelligence

We study the problem of procedure planning in instructional videos, which aims to make a goal-oriented sequence of action steps given partial visual state observations. The motivation for this problem is to learn a structured and plannable state and action space. Recent works succeeded in sequence modeling of steps with only sequence-level annotations accessible during training, but overlooked the roles of states in the procedures. In this work, we point out that State CHangEs MAtter (SCHEMA) for procedure planning in instructional videos. We aim to establish a more structured state space by investigating the causal relations between steps and states in procedures. Specifically, we explicitly represent each step as state changes and track the state changes in procedures. For step representation, we leverage the commonsense knowledge in large language models (LLMs) to describe the state changes of steps via our designed chain-of-thought prompting. For state change tracking, we align visual state observations with language state descriptions via cross-modal contrastive learning, and explicitly model the intermediate states of the procedure using LLM-generated state descriptions. Experiments on CrossTask, COIN, and NIV benchmark datasets demonstrate that our proposed SCHEMA model achieves state-of-the-art performance and obtains explainable visualizations.

Humans are natural experts in procedure planning, i.e., arranging a sequence of instruction steps to achieve a specific goal. Procedure planning is an essential and fundamental reasoning ability for embodied AI systems and is crucial in complicated real-world problems like robotic navigation (Tellex et al., 2011; Jansen, 2020; Brohan et al., 2022). Instruction steps in procedural tasks are commonly state-modifying actions that induce state changes of objects. For example, for the task of "grilling steak", a raw steak would be first topped with pepper after "seasoning the steak", then placed on the grill before "closing the lid", and become cooked pieces after "cutting the steak". These before-states and after-states reflect fine-grained information like shape, color, size, and location of entities. Therefore, the planning agents need to figure out both the temporal relations between action steps and the causal relations between steps and states.


Enhancing Neural Machine Translation of Low-Resource Languages: Corpus Development, Human Evaluation and Explainable AI Architectures

arXiv.org Artificial Intelligence

In the current machine translation (MT) landscape, the Transformer architecture stands out as the gold standard, especially for high-resource language pairs. This research delves into its efficacy for low-resource language pairs including both the English$\leftrightarrow$Irish and English$\leftrightarrow$Marathi language pairs. Notably, the study identifies the optimal hyperparameters and subword model type to significantly improve the translation quality of Transformer models for low-resource language pairs. The scarcity of parallel datasets for low-resource languages can hinder MT development. To address this, gaHealth was developed, the first bilingual corpus of health data for the Irish language. Focusing on the health domain, models developed using this in-domain dataset exhibited very significant improvements in BLEU score when compared with models from the LoResMT2021 Shared Task. A subsequent human evaluation using the multidimensional quality metrics error taxonomy showcased the superior performance of the Transformer system in reducing both accuracy and fluency errors compared to an RNN-based counterpart. Furthermore, this thesis introduces adaptNMT and adaptMLLM, two open-source applications streamlined for the development, fine-tuning, and deployment of neural machine translation models. These tools considerably simplify the setup and evaluation process, making MT more accessible to both developers and translators. Notably, adaptNMT, grounded in the OpenNMT ecosystem, promotes eco-friendly natural language processing research by highlighting the environmental footprint of model development. Fine-tuning of MLLMs by adaptMLLM demonstrated advancements in translation performance for two low-resource language pairs: English$\leftrightarrow$Irish and English$\leftrightarrow$Marathi, compared to baselines from the LoResMT2021 Shared Task.


SyllabusQA: A Course Logistics Question Answering Dataset

arXiv.org Artificial Intelligence

In educational applications, artificial intelligence (AI) approaches have shown significant promise in improving learning outcomes (Aleven et al., 2016; VanLehn, 2011), by automatically providing feedback to students or engaging in tutoring dialogues with them. The key idea is to use AI to create an on-demand virtual teaching assistant to interact with many students simultaneously; see, e.g., Khanmigo from Khan Academy (Academy, 2022). These approaches can scale up the effort of expert human teachers and tutors, and relieve them from doing repetitive tasks so that they can focus on providing ...

Moreover, text similarity metrics may not be suitable for some open-ended natural language generation tasks (Amidei et al., 2018). As an example, the answer "The final exam will be on Dec 15" has high surface-level textual similarity with the reference answer, "The final exam is on Dec 14", but contains a critical factual error that may lead to significant negative consequences to students. Meanwhile, human instructors and teaching assistants often answer student questions in a concise way, without giving any unnecessary information. Therefore, it is important for LLM-based approaches to generate answers that are both concise and precise.
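The exam-date example can be made concrete with a simple token-overlap F1 score. This is only an illustrative surface-similarity measure chosen for the sketch (whitespace tokenization, lowercase matching); it is not claimed to be one of the metrics used in the paper.

```python
from collections import Counter

def token_f1(candidate, reference):
    """Token-overlap F1 between two strings (lowercased, whitespace-split)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    # Multiset intersection counts each shared token at most as often as it
    # appears in both strings.
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

candidate = "The final exam will be on Dec 15"
reference = "The final exam is on Dec 14"

score = token_f1(candidate, reference)
# The one token that matters most to a student, the day, does not match.
dates_match = candidate.split()[-1] == reference.split()[-1]

print(f"token F1 = {score:.3f}, dates match = {dates_match}")
# prints: token F1 = 0.667, dates match = False
```

Five of the candidate's eight tokens appear in the seven-token reference, so the overlap score is fairly high even though the answer is factually wrong.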


ClickTree: A Tree-based Method for Predicting Math Students' Performance Based on Clickstream Data

arXiv.org Artificial Intelligence

The prediction of student performance and the analysis of students' learning behavior play an important role in enhancing online courses. By analysing a massive amount of clickstream data that captures student behavior, educators can gain valuable insights into the factors that influence academic outcomes and identify areas of improvement in courses. In this study, we developed ClickTree, a tree-based methodology, to predict student performance in mathematical assignments based on students' clickstream data. We extracted a set of features, including problem-level, assignment-level and student-level features, from the extensive clickstream data and trained a CatBoost tree to predict whether a student successfully answers a problem in an assignment. The developed method achieved an AUC of 0.78844 in the Educational Data Mining Cup 2023 and ranked second in the competition. Furthermore, our results indicate that students encounter more difficulties with problem types in which they must select a subset of answers from a given set, as well as with problems in the subject of Algebra II. Additionally, students who performed well in answering end-unit assignment problems engaged more with in-unit assignments and answered more problems correctly, while those who struggled had a higher tutoring request rate. The proposed method can be utilized to improve students' learning experiences, and the above insights can be integrated into mathematical courses to enhance students' learning outcomes.

In recent years, massive amounts of log data have been collected from students' interactions with online courses, providing researchers with valuable information to analyze student behavior and its impact on academic performance (Yi et al., 2018; Aljohani et al., 2019). By examining clickstream data, educators can gain deeper insights into students' study habits, navigation patterns, and levels of engagement (Wen and Rosé, 2014; Li et al., 2020; Matcha et al., 2020).
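For readers unfamiliar with the AUC figure reported above, the following sketch computes the area under the ROC curve directly from its pairwise definition: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The labels and scores are made up for illustration and have no connection to the competition data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half a win
    return wins / (len(pos) * len(neg))

# Hypothetical predictions: 1 = problem answered correctly, 0 = not.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
auc = roc_auc(labels, scores)   # 3 of the 4 positive/negative pairs are ordered correctly
```

The quadratic pairwise loop is used here for clarity; rank-based formulations give the same value more efficiently on large datasets.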


VisRec: A Semi-Supervised Approach to Radio Interferometric Data Reconstruction

arXiv.org Artificial Intelligence

Radio telescopes produce visibility data about celestial objects, but these data are sparse and noisy. As a result, images created from raw visibility data are of low quality. Recent studies have used deep learning models to reconstruct visibility data to get cleaner images. However, these methods rely on a substantial amount of labeled training data, which requires significant labeling effort from radio astronomers. Addressing this challenge, we propose VisRec, a model-agnostic semi-supervised learning approach to the reconstruction of visibility data. Specifically, VisRec consists of both a supervised learning module and an unsupervised learning module. In the supervised learning module, we introduce a set of data augmentation functions to produce diverse training examples. The unsupervised learning module, in contrast, augments unlabeled data and uses reconstructions from non-augmented visibility data as pseudo-labels for training. This hybrid approach allows VisRec to leverage both labeled and unlabeled data effectively, so it performs well even when labeled data is scarce. Our evaluation results show that VisRec outperforms all baseline methods in reconstruction quality, robustness against common observation perturbation, and generalizability to different telescope configurations.
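The pseudo-labeling idea in the unsupervised module can be sketched roughly as below: the model's reconstruction of the non-augmented input serves as the target for its reconstruction of an augmented copy. Everything concrete here is an assumption made for illustration: real visibility data are complex-valued, whereas the sketch uses short lists of real numbers; `smooth` is a hypothetical stand-in for a reconstruction network; and additive noise is a hypothetical augmentation, not necessarily one of the paper's.

```python
import random

def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def consistency_loss(model, unlabeled, augment):
    """Average MSE between model(augment(v)) and the pseudo-label model(v)."""
    total = 0.0
    for v in unlabeled:
        pseudo_label = model(v)         # target from the NON-augmented input
        prediction = model(augment(v))  # output for the augmented copy
        total += mse(prediction, pseudo_label)
    return total / len(unlabeled)

def smooth(v):
    """Hypothetical stand-in 'reconstruction model': 3-point moving average."""
    last = len(v) - 1
    return [(v[max(i - 1, 0)] + v[i] + v[min(i + 1, last)]) / 3
            for i in range(len(v))]

noise_rng = random.Random(0)

def add_noise(v, scale=0.1):
    """Hypothetical augmentation: small additive Gaussian noise."""
    return [x + noise_rng.gauss(0.0, scale) for x in v]

unlabeled = [[0.0, 1.0, 0.5, 0.2], [1.0, 0.8, 0.3, 0.0]]
loss_noisy = consistency_loss(smooth, unlabeled, add_noise)
loss_identity = consistency_loss(smooth, unlabeled, lambda v: list(v))
```

In a trainable setting the pseudo-label would typically be treated as a fixed target while the loss is minimized with respect to the model parameters; the sketch only evaluates the loss.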