Instructional Material
Leveraging Language Models and Automatic Summarization in Online Programming Learning Environments
Objective B. Messages in the forum are summarized using automatic natural language processing techniques. These summaries are intended to help students identify the errors they encounter and improve their ability to ask for help. When they use the selected recommendations, students are given strategies for learning to ask better questions iteratively. "Improving question formulation" reflects a deeper understanding of the topic and emerges as a key strategy for advancing programming learning. The automatic summarization was implemented by adapting a technique known as TextRank to the domain of the Mumuki forum in Spanish.
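To make the summarization step concrete, here is a minimal sketch of extractive TextRank in plain Python. It is not the adaptation used for the Mumuki forum, whose details are not given here: the sentence splitter, the word-overlap similarity (computed on sets of distinct words), the damping factor, and the function names are all illustrative assumptions.

```python
import math
import re


def split_sentences(text):
    """Naive splitter (assumption): break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def similarity(tokens_a, tokens_b):
    """Shared distinct words, normalized by the log of each sentence's vocabulary size."""
    words_a, words_b = set(tokens_a), set(tokens_b)
    denominator = math.log(len(words_a) or 1) + math.log(len(words_b) or 1)
    if denominator == 0:
        return 0.0  # both sentences have at most one distinct word
    return len(words_a & words_b) / denominator


def textrank_summary(text, top_k=2, damping=0.85, iterations=50):
    """Return the top_k highest-ranked sentences, kept in their original order."""
    sentences = split_sentences(text)
    tokens = [re.findall(r"\w+", s.lower()) for s in sentences]
    n = len(sentences)
    if n == 0:
        return []
    # Weighted, undirected sentence graph (no self-loops).
    weights = [
        [similarity(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    # Weighted PageRank by fixed-point iteration.
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    best = sorted(range(n), key=lambda i: scores[i], reverse=True)[:top_k]
    return [sentences[i] for i in sorted(best)]
```

Calling `textrank_summary(message, top_k=2)` on a forum message returns the two sentences that share the most vocabulary with the rest of the message. A production version for Spanish would likely add stop-word removal and stemming, which this sketch omits.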
Biomedical Visual Instruction Tuning with Clinician Preference Alignment
Cui, Hejie, Mao, Lingjun, Liang, Xin, Zhang, Jieyu, Ren, Hui, Li, Quanzheng, Li, Xiang, Yang, Carl
Recent advancements in multimodal foundation models have showcased impressive capabilities in understanding and reasoning with visual and textual information. Adapting these foundation models trained for general usage to specialized domains like biomedicine requires large-scale domain-specific instruction datasets. While existing works have explored curating such datasets automatically, the resultant datasets are not explicitly aligned with domain expertise. In this work, we propose a data-centric framework, Biomedical Visual Instruction Tuning with Clinician Preference Alignment (BioMed-VITAL), that incorporates clinician preferences into both stages of generating and selecting instruction data for tuning biomedical multimodal foundation models. First, during the generation stage, we prompt the GPT-4V generator with a diverse set of clinician-selected demonstrations for preference-aligned data candidate generation. Then, during the selection phase, we train a separate selection model, which explicitly distills clinician and policy-guided model preferences into a rating function to select high-quality data for medical instruction tuning. Results show that the model tuned with the instruction-following data from our method demonstrates a significant improvement in open visual chat (18.5% relatively) and medical VQA (win rate up to 81.73%). Our instruction-following data and models are available at BioMed-VITAL.github.io.
The Future of Data Science Education
Wright, Brian, Alonzi, Peter, Riveria, Ali
The definition of Data Science is a hotly debated topic. For many, the definition is a simple shortcut to Artificial Intelligence or Machine Learning. However, there is far more depth and nuance to the field of Data Science than a simple shortcut can provide. The School of Data Science at the University of Virginia has developed a novel model for the definition of Data Science. This model is based on identifying a unified understanding of the data work done across all areas of Data Science. It represents a generational leap forward in how we understand and teach Data Science. In this paper we will present the core features of the model and explain how it unifies various concepts going far beyond the analytics component of AI. From this foundation we will present our Undergraduate Major curriculum in Data Science and demonstrate how it prepares students to be well-rounded Data Science team members and leaders. The paper will conclude with an in-depth overview of the Foundations of Data Science course, designed to introduce students to the field while also implementing proven STEM-oriented pedagogical methods. These include, for example, specifications grading, active learning lectures, guest lectures from industry experts, and weekly gamification labs.
Satisficing Exploration for Deep Reinforcement Learning
Arumugam, Dilip, Kumar, Saurabh, Gummadi, Ramki, Van Roy, Benjamin
A default assumption in the design of reinforcement-learning algorithms is that a decision-making agent always explores to learn optimal behavior. In sufficiently complex environments that approach the vastness and scale of the real world, however, attaining optimal performance may in fact be an entirely intractable endeavor and an agent may seldom find itself in a position to complete the requisite exploration for identifying an optimal policy. Recent work has leveraged tools from information theory to design agents that deliberately forgo optimal solutions in favor of sufficiently-satisfying or satisficing solutions, obtained through lossy compression. Notably, such agents may employ fundamentally different exploratory decisions to learn satisficing behaviors more efficiently than optimal ones that are more data intensive. While supported by a rigorous corroborating theory, the underlying algorithm relies on model-based planning, drastically limiting the compatibility of these ideas with function approximation and high-dimensional observations. In this work, we remedy this issue by extending an agent that directly represents uncertainty over the optimal value function allowing it to both bypass the need for model-based planning and to learn satisficing policies. We provide simple yet illustrative experiments that demonstrate how our algorithm enables deep reinforcement-learning agents to achieve satisficing behaviors. In keeping with previous work on this setting for multi-armed bandits, we additionally find that our algorithm is capable of synthesizing optimal behaviors, when feasible, more efficiently than its non-information-theoretic counterpart.
ER-FSL: Experience Replay with Feature Subspace Learning for Online Continual Learning
Online continual learning (OCL) involves deep neural networks retaining knowledge from old data while adapting to new data, which is accessible only once. A critical challenge in OCL is catastrophic forgetting, reflected in reduced model performance on old data. Existing replay-based methods mitigate forgetting by replaying buffered samples from old data and learning current samples of new data. In this work, we dissect existing methods and empirically discover that learning and replaying in the same feature space is not conducive to addressing the forgetting issue. This is because the learned features associated with old data are readily changed by the features related to new data due to data imbalance, which leads to forgetting. Based on this observation, we intuitively explore learning and replaying in different feature spaces. Learning in a feature subspace is sufficient to capture novel knowledge from new data, while replaying in a larger feature space provides more room to maintain historical knowledge from old data. To this end, we propose a novel OCL approach called experience replay with feature subspace learning (ER-FSL). Firstly, ER-FSL divides the entire feature space into multiple subspaces, with each subspace used to learn current samples. Moreover, it introduces a subspace reuse mechanism to address situations where no blank subspaces exist. Secondly, ER-FSL replays previous samples using an accumulated space comprising all learned subspaces. Extensive experiments on three datasets demonstrate the superiority of ER-FSL over various state-of-the-art methods.
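The subspace bookkeeping described above can be sketched as follows. This is only an illustration of the idea, not the authors' implementation: the class and method names are hypothetical, subspaces are assumed to be equal-sized contiguous blocks of feature dimensions, and the round-robin reuse rule is a placeholder, since the abstract does not say how ER-FSL chooses a subspace to reuse.

```python
class SubspaceAllocator:
    """Tracks which block of feature dimensions each new task learns in (hypothetical helper)."""

    def __init__(self, feature_dim, num_subspaces):
        if feature_dim % num_subspaces != 0:
            raise ValueError("feature_dim must be divisible by num_subspaces")
        self.feature_dim = feature_dim
        self.num_subspaces = num_subspaces
        self.size = feature_dim // num_subspaces
        self.learned = set()   # subspaces that have already been used for learning
        self.tasks_seen = 0

    def assign(self):
        """Pick the subspace for the next task: a blank one if available, else reuse one."""
        blank = [k for k in range(self.num_subspaces) if k not in self.learned]
        if blank:
            chosen = blank[0]
            self.learned.add(chosen)
        else:
            # Placeholder reuse policy (assumption): cycle through subspaces in order.
            chosen = self.tasks_seen % self.num_subspaces
        self.tasks_seen += 1
        return chosen

    def learning_mask(self, k):
        """0/1 mask selecting only subspace k, used when learning current samples."""
        start, end = k * self.size, (k + 1) * self.size
        return [1 if start <= d < end else 0 for d in range(self.feature_dim)]

    def replay_mask(self):
        """0/1 mask over the accumulated space of all learned subspaces, used for replay."""
        mask = [0] * self.feature_dim
        for k in self.learned:
            for d in range(k * self.size, (k + 1) * self.size):
                mask[d] = 1
        return mask


def apply_mask(features, mask):
    """Zero out the feature dimensions that fall outside the mask."""
    return [f * m for f, m in zip(features, mask)]
```

In a training loop, features of current samples would be passed through `apply_mask(features, allocator.learning_mask(k))` before the classifier, while features of buffered samples would use `allocator.replay_mask()`, so that replay sees a larger space than learning does.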
Probing the Efficacy of Federated Parameter-Efficient Fine-Tuning of Vision Transformers for Medical Image Classification
Alkhunaizi, Naif, Almalik, Faris, Al-Refai, Rouqaiah, Naseer, Muzammal, Nandakumar, Karthik
With the advent of large pre-trained transformer models, fine-tuning these models for various downstream tasks is a critical problem. Paucity of training data, the existence of data silos, and stringent privacy constraints exacerbate this fine-tuning problem in the medical imaging domain, creating a strong need for algorithms that enable collaborative fine-tuning of pre-trained models. Moreover, the large size of these models necessitates the use of parameter-efficient fine-tuning (PEFT) to reduce the communication burden in federated learning. In this work, we systematically investigate various federated PEFT strategies for adapting a Vision Transformer (ViT) model (pre-trained on a large natural image dataset) for medical image classification. Apart from evaluating known PEFT techniques, we introduce new federated variants of PEFT algorithms such as visual prompt tuning (VPT), low-rank decomposition of visual prompts, stochastic block attention fine-tuning, and hybrid PEFT methods like low-rank adaptation (LoRA)+VPT. Moreover, we perform a thorough empirical analysis to identify the optimal PEFT method for the federated setting and understand the impact of data distribution on federated PEFT, especially for out-of-domain (OOD) and non-IID data. The key insight of this study is that while most federated PEFT methods work well for in-domain transfer, there is a substantial accuracy vs. efficiency trade-off when dealing with OOD and non-IID scenarios, which is commonly the case in medical imaging. Specifically, every order of magnitude reduction in fine-tuned/exchanged parameters can lead to a 4% drop in accuracy. Thus, the initial model choice is crucial for federated PEFT. It is preferable to use medical foundation models learned from in-domain medical image data (if available) rather than general vision models.
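As a back-of-the-envelope illustration of the trade-off, the sketch below counts trainable parameters for one weight matrix under full fine-tuning and under LoRA, where a rank-r update to a d_out x d_in matrix trains r * (d_out + d_in) parameters. The 768 x 768 shape, the rank of 4, and the function names are assumptions chosen for illustration. The last function simply extrapolates the abstract's "4% per order of magnitude" figure and should not be read as a result from the paper.

```python
import math


def full_params(d_out, d_in):
    """Trainable parameters when the whole weight matrix is fine-tuned."""
    return d_out * d_in


def lora_params(d_out, d_in, rank):
    """Trainable parameters of a LoRA update B @ A, with B: d_out x rank and A: rank x d_in."""
    return rank * (d_out + d_in)


def orders_of_magnitude(full, reduced):
    """How many factors of ten separate the two parameter counts."""
    return math.log10(full / reduced)


def rough_accuracy_drop(full, reduced, drop_per_order=4.0):
    """Rule-of-thumb extrapolation (assumption): drop_per_order percent per factor of ten."""
    return drop_per_order * orders_of_magnitude(full, reduced)


if __name__ == "__main__":
    full = full_params(768, 768)       # hypothetical projection matrix
    lora = lora_params(768, 768, 4)    # hypothetical rank
    print(f"full: {full}, LoRA: {lora}")
    print(f"reduction: {orders_of_magnitude(full, lora):.2f} orders of magnitude")
    print(f"rough accuracy drop: {rough_accuracy_drop(full, lora):.1f}%")
```

In the federated setting, the smaller count is also what each client exchanges with the server per round, which is why the communication saving and the accuracy cost move together.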
How to install the macOS Sequoia public beta
About a month after Apple announced it at WWDC 2024, macOS Sequoia is available to test-drive as a public beta. Although we don't recommend installing it on your primary Mac, here's how to get the 2024 version of macOS up and running ahead of its official rollout in the fall. First, you'll need a recent Mac to run the Sequoia public beta. Apple's software supports the following models: You'll notice that list still includes (up to) the last few generations of Intel Macs, so Apple may still be several years away from requiring Apple Silicon for its latest software. However, Apple Intelligence, which isn't yet included in the beta, will require a Mac with an M-series chip when it's available. Macs don't have automatic iCloud system backups like iOS devices, so you'll want to back up your Mac with Time Machine before installing.
Automated essay scoring in Arabic: a dataset and analysis of a BERT-based system
Ghazawi, Rayed, Simpson, Edwin
Automated Essay Scoring (AES) holds significant promise in the field of education, helping educators to mark larger volumes of essays and provide timely feedback. However, Arabic AES research has been limited by the lack of publicly available essay data. This study introduces AR-AES, an Arabic AES benchmark dataset comprising 2046 undergraduate essays, including gender information, scores, and transparent rubric-based evaluation guidelines, providing comprehensive insights into the scoring process. These essays come from four diverse courses, covering both traditional and online exams. Additionally, we pioneer the use of AraBERT for AES, exploring its performance on different question types. We find encouraging results, particularly for Environmental Chemistry and source-dependent essay questions. For the first time, we examine the scale of errors made by a BERT-based AES system, observing that 96.15 percent of the errors are within one point of the first human marker's prediction, on a scale of one to five, with 79.49 percent of predictions matching exactly. In contrast, additional human markers did not exceed 30 percent exact matches with the first marker, with 62.9 percent within one mark. These findings highlight the subjectivity inherent in essay grading, and underscore the potential for current AES technology to assist human markers to grade consistently across large classes.
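The two agreement figures quoted above can be computed as in the following sketch. It assumes integer scores on the same scale for both raters and reads "errors" as predictions that do not match the reference exactly. The function name and the sample scores are made up for illustration.

```python
def agreement_rates(predicted, reference, tolerance=1):
    """Return (exact-match rate, share of errors that are within `tolerance` points).

    The second value is None when there are no errors at all.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("score lists must be non-empty and of equal length")
    diffs = [abs(p - r) for p, r in zip(predicted, reference)]
    exact_rate = sum(d == 0 for d in diffs) / len(diffs)
    errors = [d for d in diffs if d != 0]
    if not errors:
        return exact_rate, None
    near_miss_rate = sum(d <= tolerance for d in errors) / len(errors)
    return exact_rate, near_miss_rate


if __name__ == "__main__":
    system_scores = [3, 4, 2, 5, 1]   # hypothetical system predictions
    marker_scores = [3, 5, 2, 2, 1]   # hypothetical first-marker scores
    print(agreement_rates(system_scores, marker_scores))
```

The same function can compare a second human marker against the first, which is how the human and system agreement rates become directly comparable.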
Evaluating Algorithmic Bias in Models for Predicting Academic Performance of Filipino Students
Švábenský, Valdemar, Verger, Mélina, Rodrigo, Maria Mercedes T., Monterozo, Clarence James G., Baker, Ryan S., Saavedra, Miguel Zenon Nicanor Lerias, Lallé, Sébastien, Shimada, Atsushi
Algorithmic bias is a major issue in machine learning models in educational contexts. However, it has not yet been studied thoroughly in Asian learning contexts, and only limited work has considered algorithmic bias based on regional (sub-national) background. As a step towards addressing this gap, this paper examines the population of 5,986 students at a large university in the Philippines, investigating algorithmic bias based on students' regional background. The university used the Canvas learning management system (LMS) in its online courses across a broad range of domains. Over the period of three semesters, we collected 48.7 million log records of the students' activity in Canvas. We used these logs to train binary classification models that predict student grades from the LMS activity. The best-performing model reached an AUC of 0.75 and a weighted F1-score of 0.79. Subsequently, we examined the data for bias based on students' region. Evaluation using three metrics (AUC, weighted F1-score, and MADD) showed consistent results across all demographic groups. Thus, no unfairness was observed against a particular student group in the grade predictions.
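One way to check a metric for consistency across groups is to compute it separately per group, as in the sketch below for AUC. This is a generic illustration rather than the authors' evaluation code: the pairwise AUC definition (ties count as half) is standard, but the function names, group labels, and sample data are hypothetical. Weighted F1-score and MADD are not covered here.

```python
def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def auc_by_group(labels, scores, groups):
    """Compute AUC separately for each group label, e.g. each student region."""
    result = {}
    for group in sorted(set(groups)):
        idx = [i for i, g in enumerate(groups) if g == group]
        result[group] = auc([labels[i] for i in idx], [scores[i] for i in idx])
    return result


if __name__ == "__main__":
    labels = [1, 0, 1, 0, 1, 0, 1, 0]                   # hypothetical pass/fail labels
    scores = [0.9, 0.2, 0.7, 0.4, 0.6, 0.5, 0.3, 0.1]   # hypothetical model outputs
    groups = ["A"] * 4 + ["B"] * 4                      # hypothetical regions
    print(auc_by_group(labels, scores, groups))
```

Large gaps between the per-group values would be a signal worth investigating. Similar values across groups are what the abstract reports.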
Artificial Intelligence from Idea to Implementation. How Can AI Reshape the Education Landscape?
This introductory chapter provides an overview of the evolution and impact of Artificial Intelligence (AI) technologies in today's society. Beginning with a historical context and a few general definitions of AI, the author provides a timeline of the technologies used, highlighting the periods of stagnation commonly referred to as "AI winters" and the subsequent resurgence fueled by relentless enthusiasm and investment. The narrative then transitions to the transformative effects of AI on society at large, with a particular emphasis on educational applications. Through examples, the paper shows how AI technologies have moved from theoretical constructs to practical tools that are reshaping pedagogical approaches and student engagement. The essay concludes by discussing the prospects of AI in education, emphasizing the need for a balanced approach that considers both technological advancements and societal implications.
Introduction
We have learned from our mistakes throughout history to adapt to a hostile environment. For example, after inventing fire, which often got out of control, we went on to invent fire extinguishers and fire alarms and to develop fire services. Similarly, the invention of gunpowder and firearms led to the creation of bulletproof vests and armor-plated vehicles and the development of guard and protection services. The invention of cars was followed by the introduction of seat belts, airbags, and, more recently, self-driving automobiles. It is safe to say that technology is an expression of human will. Through technological advancements, we seek to extend our control over various aspects of our environment, be it distance, nature, or even interpersonal dynamics. Each of the tools we developed possesses the power to influence our perspectives and shape the future (Vrabie & Eduard, 2018; Vrabie, 2016).
For example, farming tools have revolutionized agricultural practices, and lab instruments have opened new frontiers for scientists. Books, maps, and similar devices, often called "intellectual technologies" (Goody & Bell, 1975), have expanded our understanding of the world. The latter, in particular, have had the most significant impact on society as we know it.