Instructional Material
The pop song generator: designing an online course to teach collaborative, creative AI
Yee-king, Matthew, Fiorucci, Andrea, d'Inverno, Mark
This article describes and evaluates a new online AI-creativity course. The course is based around three near-state-of-the-art AI models combined into a pop song generating system. A fine-tuned GPT-2 model writes lyrics, Music-VAE composes musical scores and instrumentation, and Diffsinger synthesises a singing voice. We explain the decisions made in designing the course, which is based on Piagetian, constructivist 'learning-by-doing'. We present details of the five-week course design with learning objectives, technical concepts, and creative and technical activities. We explain how we overcame technical challenges to build a complete pop song generator system, consisting of Python scripts, pre-trained models, and Javascript code that runs in a dockerised Linux container via a web-based IDE. A quantitative analysis of student activity provides evidence on engagement and a benchmark for future improvements. A qualitative analysis of a workshop with experts validated the overall course design and suggested the need for a stronger creative brief and for ethical and legal content.
FewSAR: A Few-shot SAR Image Classification Benchmark
Zhang, Rui, Wang, Ziqi, Li, Yang, Wang, Jiabao, Wang, Zhiteng
Few-shot learning (FSL) is one of the significant and hard problems in the field of image classification. However, in contrast to the rapid progress on visible-light datasets, progress in SAR target image classification is much slower. The lack of a unified benchmark is a key reason for this, and one that is largely overlooked in the current literature. Researchers in SAR target image classification typically report new results on their own datasets and experimental setups. This makes results hard to compare and impedes further progress in the area. Motivated by this observation, we propose a novel few-shot SAR image classification benchmark (FewSAR) to address this issue. FewSAR consists of an open-source Python code library of 15 classic methods in three categories for few-shot SAR image classification. It provides an accessible and customizable testbed for different few-shot SAR image classification tasks. To better understand the performance of different few-shot methods, we establish evaluation protocols and conduct extensive experiments within the benchmark. By analyzing the quantitative results and runtime under the same setting, we observe that metric learning methods achieve the best accuracy. Meta-learning methods and fine-tuning methods perform poorly on few-shot SAR images, which is primarily due to the bias of existing datasets. We believe that FewSAR will open up a new avenue for future research and development on real-world challenges at the intersection of SAR image classification and few-shot deep learning. We will provide our code for the proposed FewSAR at https://github.com/solarlee/FewSAR.
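To make the metric-learning category concrete, the sketch below shows nearest-prototype classification on a toy 2-way 2-shot episode. It is an illustration only and is not taken from the FewSAR library: the function names, class labels, and 2-D embeddings are hypothetical, and a real pipeline would obtain the embeddings from a trained feature extractor.

```python
import math

def prototypes(support):
    """Mean embedding per class; `support` maps label -> list of vectors."""
    protos = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        protos[label] = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return protos

def classify(query, protos):
    """Return the label whose prototype is nearest in Euclidean distance."""
    return min(protos, key=lambda label: math.dist(query, protos[label]))

# Toy 2-way 2-shot episode with made-up 2-D embeddings (hypothetical data).
support = {
    "tank": [[0.9, 0.1], [1.1, -0.1]],
    "truck": [[-1.0, 0.2], [-0.8, -0.2]],
}
protos = prototypes(support)
prediction = classify([0.7, 0.0], protos)
print(prediction)
```

The only learned component in such a method is the embedding function; classification itself reduces to a distance comparison, which is one reason metric-based approaches are cheap to evaluate across many episodes.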
SIGHT: A Large Annotated Dataset on Student Insights Gathered from Higher Education Transcripts
Wang, Rose E., Wirawarn, Pawan, Goodman, Noah, Demszky, Dorottya
Lectures are a learning experience for both students and teachers. Students learn from teachers about the subject material, while teachers learn from students about how to refine their instruction. However, online student feedback is unstructured and abundant, making it challenging for teachers to learn and improve. We take a step towards tackling this challenge. First, we contribute a dataset for studying this problem: SIGHT is a large dataset of 288 math lecture transcripts and 15,784 comments collected from the Massachusetts Institute of Technology OpenCourseWare (MIT OCW) YouTube channel. Second, we develop a rubric for categorizing feedback types using qualitative analysis. Qualitative analysis methods are powerful in uncovering domain-specific insights; however, they are costly to apply to large data sources. To overcome this challenge, we propose a set of best practices for using large language models (LLMs) to cheaply classify the comments at scale. We observe a striking correlation between the model's and humans' annotations: categories with consistent human annotations (>$0.9$ inter-rater reliability, IRR) also display higher human-model agreement (>$0.7$), while categories with less consistent human annotations ($0.7$-$0.8$ IRR) correspondingly demonstrate lower human-model agreement ($0.3$-$0.5$). These techniques uncover useful student feedback from thousands of comments, costing around $\$0.002$ per comment. We conclude by discussing exciting future directions on using online student feedback and improving automated annotation techniques for qualitative research.
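The agreement figures above can be read as chance-corrected agreement scores. The abstract does not say which statistic was used, so the sketch below assumes Cohen's kappa between one human annotator and the model; the category names and labels are invented for illustration.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators over the same items.

    Assumes the annotators do not both use a single identical label for
    every item (otherwise expected agreement is 1 and kappa is undefined).
    """
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    # Probability that both annotators pick the same category by chance.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical labels for six comments.
human = ["confusion", "praise", "praise", "confusion", "praise", "praise"]
model = ["confusion", "praise", "confusion", "confusion", "praise", "praise"]
kappa = cohen_kappa(human, model)
print(round(kappa, 3))
```

Here the two annotators agree on five of six items (raw agreement about 0.83), but because half of that agreement is expected by chance, kappa comes out at about 0.67.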
One-Shot Learning of Visual Path Navigation for Autonomous Vehicles
CuiZhu, Zhongying, Charette, Francois, Ghafourian, Amin, Shi, Debo, Cui, Matthew, Krishnamachar, Anjali, Soltani, Iman
Autonomous driving presents many challenges due to the large number of scenarios the autonomous vehicle (AV) may encounter. End-to-end deep learning models are comparatively simple models that can handle a broad set of scenarios. However, end-to-end models require large amounts of diverse data to perform well. This paper presents a novel deep neural network that performs image-to-steering path navigation and eases the data problem by adding one-shot learning to the system. Presented with a previously unseen path, the vehicle can drive the path autonomously after being shown the path once and without model retraining. In fact, the full path is not needed: images of the road junctions are sufficient. In-vehicle testing and offline testing are used to verify the performance of the proposed navigation and to compare different candidate architectures.
Learning from Partially Annotated Data: Example-aware Creation of Gap-filling Exercises for Language Learning
Bitew, Semere Kiros, Deleu, Johannes, Doğruöz, A. Seza, Develder, Chris, Demeester, Thomas
Since performing exercises (including, e.g., practice tests) forms a crucial component of learning, and creating such exercises requires non-trivial effort from the teacher, there is great value in automatic exercise generation in digital tools in education. In this paper, we focus on the automatic creation of gap-filling exercises for language learning, specifically grammar exercises. Since providing any annotation in this domain requires human expert effort, we aim to avoid it entirely and explore the task of converting existing texts into new gap-filling exercises, purely based on an example exercise, without explicit instruction or detailed annotation of the intended grammar topics. We contribute (i) a novel neural network architecture specifically designed for the aforementioned gap-filling exercise generation task, and (ii) a real-world benchmark dataset for French grammar. We show that our model for this French grammar gap-filling exercise generation outperforms a competitive baseline classifier by 8 percentage points in F1, achieving an average F1 score of 82%. Our model implementation and the dataset are made publicly available to foster future research, thus offering a standardized evaluation and baseline solution for the proposed partially annotated data prediction task in grammar exercise creation.
SQL2Circuits: Estimating Metrics for SQL Queries with A Quantum Natural Language Processing Method
Quantum computing has developed significantly in recent years. Developing algorithms to estimate various metrics for SQL queries has been an important research question in database research, since the estimations affect query optimization and database performance. This work presents a quantum natural language processing (QNLP)-inspired approach for constructing a quantum machine learning model that can classify SQL queries with respect to their execution times and cardinalities. From the quantum machine learning perspective, we compare our model and results to previous research in QNLP and conclude that our model reaches accuracy similar to that of the QNLP model in the classification tasks. This indicates that the QNLP model is a promising method even when applied to problems outside QNLP. We study the developed quantum machine learning model by calculating its expressibility and entangling capability histograms. The results show that the model is expressive yet not too complex to be executed on quantum hardware.
Kalman Filter for Online Classification of Non-Stationary Data
Titsias, Michalis K., Galashov, Alexandre, Rannen-Triki, Amal, Pascanu, Razvan, Teh, Yee Whye, Bornschein, Jorg
In Online Continual Learning (OCL), a learning system receives a stream of data and sequentially performs prediction and training steps. Important challenges in OCL concern automatic adaptation to the particular non-stationary structure of the data and quantification of predictive uncertainty. Motivated by these challenges, we introduce a probabilistic Bayesian online learning model that uses a (possibly pretrained) neural representation and a state space model over the linear predictor weights. Non-stationarity over the linear predictor weights is modelled using a "parameter drift" transition density, parametrized by a coefficient that quantifies forgetting. Inference in the model is implemented with efficient Kalman filter recursions that track the posterior distribution over the linear weights, while online SGD updates of the transition dynamics coefficient allow the model to adapt to the non-stationarity seen in the data. While the framework is developed assuming a linear Gaussian model, we also extend it to deal with classification problems and to fine-tune the deep learning representation. In a set of experiments in multi-class classification using data sets such as CIFAR-100 and CLOC, we demonstrate the predictive ability of the model and its flexibility to capture non-stationarity.
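As a rough illustration of the linear Gaussian case, the sketch below runs one Kalman predict/update step over the predictor weights for a scalar regression target. The specific drift form (weights scaled by gamma, with variance (1 - gamma^2) * prior_var added back) is an assumption chosen for illustration and may differ from the paper's transition density. All names and default values are hypothetical, and gamma is held fixed here rather than learned by SGD.

```python
def kalman_step(mean, cov, phi, y, gamma, prior_var=1.0, noise_var=0.1):
    """One predict/update step for weights w with y = w . phi + noise."""
    d = len(mean)
    # Predict: drift shrinks the mean and moves the covariance toward the prior.
    mean_pred = [gamma * m for m in mean]
    cov_pred = [
        [gamma ** 2 * cov[i][j] + ((1 - gamma ** 2) * prior_var if i == j else 0.0)
         for j in range(d)]
        for i in range(d)
    ]
    # Update: standard Kalman correction for a scalar observation.
    cov_phi = [sum(cov_pred[i][j] * phi[j] for j in range(d)) for i in range(d)]
    innovation_var = sum(phi[i] * cov_phi[i] for i in range(d)) + noise_var
    residual = y - sum(phi[i] * mean_pred[i] for i in range(d))
    gain = [c / innovation_var for c in cov_phi]
    new_mean = [mean_pred[i] + gain[i] * residual for i in range(d)]
    new_cov = [[cov_pred[i][j] - gain[i] * cov_phi[j] for j in range(d)]
               for i in range(d)]
    return new_mean, new_cov

# Toy stream: fixed true weights, features cycling through three directions.
true_w = [1.0, -1.0]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mean, cov = [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]
for t in range(30):
    phi = features[t % 3]
    y = sum(w * f for w, f in zip(true_w, phi))
    mean, cov = kalman_step(mean, cov, phi, y, gamma=0.99)
print(mean)
```

With gamma = 1 the recursion reduces to recursive least squares with no forgetting. Values below 1 keep the posterior variance from collapsing, so the filter can keep tracking weights that change over time.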
Maestro: A Gamified Platform for Teaching AI Robustness
Geleta, Margarita, Xu, Jiacen, Loya, Manikanta, Wang, Junlin, Singh, Sameer, Li, Zhou, Gago-Masague, Sergio
Although the prevention of AI vulnerabilities is critical to preserve the safety and privacy of users and businesses, educational tools for robust AI are still underdeveloped worldwide. We present the design, implementation, and assessment of Maestro. Maestro is an effective open-source game-based platform that contributes to the advancement of robust AI education. Maestro provides goal-based scenarios where college students are exposed to challenging life-inspired assignments in a competitive programming environment. We assessed Maestro's influence on students' engagement, motivation, and learning success in robust AI. This work also provides insights into the design features of online learning tools that promote active learning opportunities in the robust AI domain. We analyzed the reflection responses (measured with Likert scales) of 147 undergraduate students using Maestro in two quarterly college courses in AI. According to the results, students who felt they had acquired new skills in robust AI tended to rate Maestro highly and scored highly on material consolidation, curiosity, and mastery in robust AI. Moreover, the leaderboard, our key gamification element in Maestro, contributed effectively to students' engagement and learning. Results also indicate that Maestro can be effectively adapted to any course length and depth without losing its educational quality.
Single-Stage Broad Multi-Instance Multi-Label Learning (BMIML) with Diverse Inter-Correlations and its application to medical image classification
Lai, Qi, Zhou, Jianhang, Gan, Yanfen, Vong, Chi-Man, Huang, Deshuang
In multi-instance multi-label (MIML) learning, an object (e.g., an image) is described by multiple instances (e.g., image patches) and simultaneously associated with multiple labels. Existing MIML methods are useful in many applications, but most suffer from relatively low accuracy and training efficiency due to several issues: i) the inter-label correlations (i.e., the probabilistic correlations between the multiple labels corresponding to an object) are neglected; ii) the inter-instance correlations (i.e., the probabilistic correlations of different instances in predicting the object label) cannot be learned directly (or jointly) with other types of correlations due to the missing instance labels; iii) diverse inter-correlations (e.g., inter-label correlations, inter-instance correlations) can only be learned in multiple stages. To resolve these issues, a new single-stage framework called broad multi-instance multi-label learning (BMIML) is proposed. BMIML has three innovative modules: i) an auto-weighted label enhancement learning (AWLEL) module based on the broad learning system (BLS), which simultaneously and efficiently captures the inter-label correlations, something traditional BLS cannot do; ii) a specific MIML neural network called scalable multi-instance probabilistic regression (SMIPR), which effectively estimates the inter-instance correlations using the object label only and can provide additional probabilistic information for learning; iii) an interactive decision optimization (IDO) module, which combines and optimizes the results from AWLEL and SMIPR to form a single-stage framework. Experiments show that BMIML is highly competitive with (or even better than) existing methods in accuracy and much faster than most MIML methods, even for large medical image data sets (> 90K images).
A Simple Unified Uncertainty-Guided Framework for Offline-to-Online Reinforcement Learning
Guo, Siyuan, Sun, Yanchao, Hu, Jifeng, Huang, Sili, Chen, Hechang, Piao, Haiyin, Sun, Lichao, Chang, Yi
Offline reinforcement learning (RL) provides a promising way to train an agent that relies entirely on a data-driven paradigm. However, constrained by the limited quality of the offline dataset, its performance is often sub-optimal. Therefore, it is desirable to further finetune the agent via extra online interactions before deployment. Unfortunately, offline-to-online RL is difficult because of two main challenges: constrained exploratory behavior and state-action distribution shift. To this end, we propose a Simple Unified uNcertainty-Guided (SUNG) framework, which naturally unifies the solution to both challenges with the tool of uncertainty. Specifically, SUNG quantifies uncertainty via a VAE-based state-action visitation density estimator. To facilitate efficient exploration, SUNG presents a practical optimistic exploration strategy to select informative actions with both high value and high uncertainty. Moreover, SUNG develops an adaptive exploitation method that applies conservative offline RL objectives to high-uncertainty samples and standard online RL objectives to low-uncertainty samples, smoothly bridging the offline and online stages. SUNG achieves state-of-the-art online finetuning performance when combined with different offline RL methods, across various environments and datasets in the D4RL benchmark.
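The two uncertainty-guided ideas can be sketched in a few lines. This is a simplified illustration and not SUNG's actual procedure: the additive value-plus-uncertainty score, the fixed threshold, the function names, and the toy numbers are all assumptions. In the paper, the uncertainty would come from the VAE-based density estimator.

```python
def select_optimistic_action(candidates, q_value, uncertainty, beta=1.0):
    """Pick the candidate with the best value-plus-uncertainty score."""
    return max(candidates, key=lambda a: q_value(a) + beta * uncertainty(a))

def split_by_uncertainty(samples, uncertainty, threshold):
    """Route samples to a conservative (offline-style) or standard (online-style) loss."""
    conservative = [s for s in samples if uncertainty(s) > threshold]
    standard = [s for s in samples if uncertainty(s) <= threshold]
    return conservative, standard

# Toy one-dimensional actions with made-up value and uncertainty estimates.
# The same actions stand in for state-action samples in the split below.
q_table = {0.0: 1.0, 0.5: 1.2, 1.0: 0.9}
u_table = {0.0: 0.1, 0.5: 0.2, 1.0: 0.8}

greedy = select_optimistic_action(list(q_table), q_table.get, u_table.get, beta=0.0)
optimistic = select_optimistic_action(list(q_table), q_table.get, u_table.get, beta=1.0)
conservative, standard = split_by_uncertainty(list(u_table), u_table.get, threshold=0.5)
print(greedy, optimistic, conservative, standard)
```

With beta set to 0 the selection is purely greedy on value. Raising beta shifts the choice toward rarely visited actions, which is the exploratory behavior the abstract describes.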