Goto

Collaborating Authors

 Instructional Material


Welcome to Harvard, where you can spend 317,800 to learn about 'queering the world,' threesome dating apps

FOX News

Harvard University offers a behemoth of courses that teach its students topics including "Queering Education," "Black Radicalism" and sexual fetishes. However, its course catalog – while offering many topics some would consider strongly critical of America – shows it does not offer significant courses focusing on American patriotism in depth despite taking in hundreds of millions of taxpayer dollars every year. In 2021, Harvard received 625 million from American taxpayers, all the while the Ivy League boasts over 50 billion in its endowment. Some companies and prospective students are starting to question their interest in Harvard, particularly after scandals relating to alleged pervasive antisemitism and pro-Hamas sentiment on its campus – prompting legal action and a civil rights investigation from the U.S. Department of Education. Harvard's education department for prospective K-12 teachers elaborates on how one can bring queerness and transgenderism into schools.


How Teachers Can Use Large Language Models and Bloom's Taxonomy to Create Educational Quizzes

arXiv.org Artificial Intelligence

Question generation (QG) is a natural language processing task with an abundance of potential benefits and use cases in the educational domain. In order for this potential to be realized, QG systems must be designed and validated with pedagogical needs in mind. However, little research has assessed or designed QG approaches with the input from real teachers or students. This paper applies a large language model-based QG approach where questions are generated with learning goals derived from Bloom's taxonomy. The automatically generated questions are used in multiple experiments designed to assess how teachers use them in practice. The results demonstrate that teachers prefer to write quizzes with automatically generated questions, and that such quizzes have no loss in quality compared to handwritten versions. Further, several metrics indicate that automatically generated questions can even improve the quality of the quizzes created, showing the promise for large scale use of QG in the classroom setting.


WaveCoder: Widespread And Versatile Enhanced Instruction Tuning with Refined Data Generation

arXiv.org Artificial Intelligence

Recent work demonstrates that, after being fine-tuned on a high-quality instruction dataset, the resulting model can obtain impressive capabilities to address a wide range of tasks. However, existing methods for instruction data generation often produce duplicate data and are not controllable enough on data quality. In this paper, we extend the generalization of instruction tuning by classifying the instruction data to 4 code-related tasks and propose a LLM-based Generator-Discriminator data process framework to generate diverse, high-quality instruction data from open source code. Hence, we introduce CodeOcean, a dataset comprising 20,000 instruction instances across 4 universal code-related tasks,which is aimed at augmenting the effectiveness of instruction tuning and improving the generalization ability of fine-tuned model. Subsequently, we present WaveCoder, a fine-tuned Code LLM with Widespread And Versatile Enhanced instruction tuning. This model is specifically designed for enhancing instruction tuning of Code Language Models (LLMs). Our experiments demonstrate that Wavecoder models outperform other open-source models in terms of generalization ability across different code-related tasks at the same level of fine-tuning scale. Moreover, Wavecoder exhibits high efficiency in previous code generation tasks. This paper thus offers a significant contribution to the field of instruction data generation and fine-tuning models, providing new insights and tools for enhancing performance in code-related tasks.


A Deep Learning Approach Towards Student Performance Prediction in Online Courses: Challenges Based on a Global Perspective

arXiv.org Artificial Intelligence

Analyzing and evaluating students' progress in any learning environment is stressful and time consuming if done using traditional analysis methods. This is further exasperated by the increasing number of students due to the shift of focus toward integrating the Internet technologies in education and the focus of academic institutions on moving toward e-Learning, blended, or online learning models. As a result, the topic of student performance prediction has become a vibrant research area in recent years. To address this, machine learning and data mining techniques have emerged as a viable solution. To that end, this work proposes the use of deep learning techniques (CNN and RNN-LSTM) to predict the students' performance at the midpoint stage of the online course delivery using three distinct datasets collected from three different regions of the world. Experimental results show that deep learning models have promising performance as they outperform other optimized traditional ML models in two of the three considered datasets while also having comparable performance for the third dataset.


Scaling Laws for Forgetting When Fine-Tuning Large Language Models

arXiv.org Artificial Intelligence

We study and quantify the problem of forgetting when fine-tuning pre-trained large language models (LLMs) on a downstream task. We find that parameter-efficient fine-tuning (PEFT) strategies, such as Low-Rank Adapters (LoRA), still suffer from catastrophic forgetting. In particular, we identify a strong inverse linear relationship between the fine-tuning performance and the amount of forgetting when fine-tuning LLMs with LoRA. We further obtain precise scaling laws that show forgetting increases as a shifted power law in the number of parameters fine-tuned and the number of update steps. We also examine the impact of forgetting on knowledge, reasoning, and the safety guardrails trained into Llama 2 7B chat. Our study suggests that forgetting cannot be avoided through early stopping or by varying the number of parameters fine-tuned. We believe this opens up an important safety-critical direction for future research to evaluate and develop fine-tuning schemes which mitigate forgetting.


Distribution-Free Conformal Joint Prediction Regions for Neural Marked Temporal Point Processes

arXiv.org Artificial Intelligence

Sequences of labeled events observed at irregular intervals in continuous time are ubiquitous across various fields. Temporal Point Processes (TPPs) provide a mathematical framework for modeling these sequences, enabling inferences such as predicting the arrival time of future events and their associated label, called mark. However, due to model misspecification or lack of training data, these probabilistic models may provide a poor approximation of the true, unknown underlying process, with prediction regions extracted from them being unreliable estimates of the underlying uncertainty. This paper develops more reliable methods for uncertainty quantification in neural TPP models via the framework of conformal prediction. A primary objective is to generate a distribution-free joint prediction region for the arrival time and mark, with a finite-sample marginal coverage guarantee. A key challenge is to handle both a strictly positive, continuous response and a categorical response, without distributional assumptions. We first consider a simple but overly conservative approach that combines individual prediction regions for the event arrival time and mark. Then, we introduce a more effective method based on bivariate highest density regions derived from the joint predictive density of event arrival time and mark. By leveraging the dependencies between these two variables, this method exclude unlikely combinations of the two, resulting in sharper prediction regions while still attaining the pre-specified coverage level. We also explore the generation of individual univariate prediction regions for arrival times and marks through conformal regression and classification techniques. Moreover, we investigate the stronger notion of conditional coverage. Finally, through extensive experimentation on both simulated and real-world datasets, we assess the validity and efficiency of these methods.


Advancing Deep Active Learning & Data Subset Selection: Unifying Principles with Information-Theory Intuitions

arXiv.org Artificial Intelligence

At its core, this thesis aims to enhance the practicality of deep learning by improving the label and training efficiency of deep learning models. To this end, we investigate data subset selection techniques, specifically active learning and active sampling, grounded in information-theoretic principles. Active learning improves label efficiency, while active sampling enhances training efficiency. Supervised deep learning models often require extensive training with labeled data. Label acquisition can be expensive and time-consuming, and training large models is resource-intensive, hindering the adoption outside academic research and ``big tech.'' Existing methods for data subset selection in deep learning often rely on heuristics or lack a principled information-theoretic foundation. In contrast, this thesis examines several objectives for data subset selection and their applications within deep learning, striving for a more principled approach inspired by information theory. We begin by disentangling epistemic and aleatoric uncertainty in single forward-pass deep neural networks, which provides helpful intuitions and insights into different forms of uncertainty and their relevance for data subset selection. We then propose and investigate various approaches for active learning and data subset selection in (Bayesian) deep learning. Finally, we relate various existing and proposed approaches to approximations of information quantities in weight or prediction space. Underpinning this work is a principled and practical notation for information-theoretic quantities that includes both random variables and observed outcomes. This thesis demonstrates the benefits of working from a unified perspective and highlights the potential impact of our contributions to the practical application of deep learning.


Catalyzing Equity in STEM Teams: Harnessing Generative AI for Inclusion and Diversity

arXiv.org Artificial Intelligence

Yiwen Lin, University of California, Irvine Lauren Snow, University of California, Irvine Acknowledgments: This work was partially supported by the National Science Foundation (Grant Number 1535300), and National Institutes of Health (Grant Number 5UC2NS128361-02). Abstract Collaboration is key to STEM, where multidisciplinary team research can solve complex problems. However, inequality in STEM fields hinders their full potential, due to persistent psychological barriers in underrepresented students' experience. This paper documents teamwork in STEM and explores the transformative potential of computational modeling and generative AI in promoting STEM-team diversity and inclusion. Leveraging generative AI, this paper outlines two primary areas for advancing diversity, equity, and inclusion. First, formalizing collaboration assessment with inclusive analytics can capture fine-grained learner behavior. Second, adaptive, personalized AI systems can support diversity and inclusion in STEM teams. Four policy recommendations highlight AI's capacity: formalized collaborative skill assessment, inclusive analytics, funding for socio-cognitive research, human-AI teaming for inclusion training.


Anatomy of Neural Language Models

arXiv.org Artificial Intelligence

Generative AI and transfer learning fields have experienced remarkable advancements in recent years especially in the domain of Natural Language Processing (NLP). Transformers were at the heart of these advancements where the cutting-edge transformer-based Language Models (LMs) enabled new state-of-the-art results in a wide spectrum of applications. While the number of research works involving neural LMs is exponentially increasing, their vast majority are high-level and far from self-contained. Consequently, a deep understanding of the literature in this area is a tough task especially at the absence of a unified mathematical framework explaining the main types of neural LMs. We address the aforementioned problem in this tutorial where the objective is to explain neural LMs in a detailed, simplified and unambiguous mathematical framework accompanied with clear graphical illustrations. Concrete examples on widely used models like BERT and GPT2 are explored. Finally, since transformers pretrained on language-modeling-like tasks have been widely adopted in computer vision and time series applications, we briefly explore some examples of such solutions in order to enable readers understand how transformers work in the aforementioned domains and compare this use with the original one in NLP.


Multi-Source to Multi-Target Decentralized Federated Domain Adaptation

arXiv.org Artificial Intelligence

Heterogeneity across devices in federated learning (FL) typically refers to statistical (e.g., non-i.i.d. data distributions) and resource (e.g., communication bandwidth) dimensions. In this paper, we focus on another important dimension that has received less attention: varying quantities/distributions of labeled and unlabeled data across devices. In order to leverage all data, we develop a decentralized federated domain adaptation methodology which considers the transfer of ML models from devices with high quality labeled data (called sources) to devices with low quality or unlabeled data (called targets). Our methodology, Source-Target Determination and Link Formation (ST-LF), optimizes both (i) classification of devices into sources and targets and (ii) source-target link formation, in a manner that considers the trade-off between ML model accuracy and communication energy efficiency. To obtain a concrete objective function, we derive a measurable generalization error bound that accounts for estimates of source-target hypothesis deviations and divergences between data distributions. The resulting optimization problem is a mixed-integer signomial program, a class of NP-hard problems, for which we develop an algorithm based on successive convex approximations to solve it tractably. Subsequent numerical evaluations of ST-LF demonstrate that it improves classification accuracy and energy efficiency over state-of-the-art baselines.