 Instructional Material


The NeurIPS 2023 Machine Learning for Audio Workshop: Affective Audio Benchmarks and Novel Data

arXiv.org Artificial Intelligence

The NeurIPS 2023 Machine Learning for Audio Workshop brings together machine learning (ML) experts from various audio domains. There are several valuable audio-driven ML tasks, from speech emotion recognition to audio event detection, but the community is sparse compared to other ML areas, e.g., computer vision or natural language processing. A major limitation with audio is the available data; with audio being a time-dependent modality, high-quality data collection is time-consuming and costly, making it challenging for academic groups to apply their often state-of-the-art strategies to a larger, more generalizable dataset. In this short white paper, to encourage researchers with limited access to large datasets, the organizers first outline several open-source datasets that are available to the community, and for the duration of the workshop are making several proprietary datasets available. Namely, three vocal datasets, Hume-Prosody, Hume-VocalBurst, an acted emotional speech dataset Modulate-Sonata, and an in-game streamer dataset Modulate-Stream. We outline the current baselines on these datasets but encourage researchers from across audio to utilize them outside of the initial baseline tasks.


Generative AI and CS Education

Communications of the ACM

I have spent most of my career working on computer science (CS) education, whether teaching undergraduate CS or managing technical education for software engineers at Google. In the early 1990s, when Pascal was the language of choice, I began teaching CS1 and CS2 at Stanford. Over the next few years, I saw the transition from Pascal to C to object-oriented programming. I also saw the pace at which we had to consistently update our course materials and projects, whether it was in the introductory courses or later electives such as graphics or compilers. Languages, software frameworks, libraries, APIs, and so forth change rapidly.


Analyzing the Impact of Partial Sharing on the Resilience of Online Federated Learning Against Model Poisoning Attacks

arXiv.org Artificial Intelligence

We scrutinize the resilience of the partial-sharing online federated learning (PSO-Fed) algorithm against model-poisoning attacks. PSO-Fed reduces the communication load by enabling clients to exchange only a fraction of their model estimates with the server at each update round. Partial sharing of model estimates also enhances the robustness of the algorithm against model-poisoning attacks. To gain better insights into this phenomenon, we analyze the performance of the PSO-Fed algorithm in the presence of Byzantine clients, malicious actors who may subtly tamper with their local models by adding noise before sharing them with the server. Through our analysis, we demonstrate that PSO-Fed maintains convergence in both mean and mean-square senses, even under the strain of model-poisoning attacks. We further derive the theoretical mean square error (MSE) of PSO-Fed, linking it to various parameters such as stepsize, attack probability, number of Byzantine clients, client participation rate, partial-sharing ratio, and noise variance. We also show that there is a non-trivial optimal stepsize for PSO-Fed when faced with model-poisoning attacks. The results of our extensive numerical experiments affirm our theoretical assertions and highlight the superior ability of PSO-Fed to counteract Byzantine attacks, outperforming other related leading algorithms.
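The partial-sharing idea described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the authors' algorithm: the circular index schedule in `selection_mask`, the Gaussian-noise attack in `client_message`, and the aggregation rule in `server_aggregate` (entries a client did not share are filled in with the server's current values before averaging) are hypothetical choices, and the local learning step is omitted entirely.

```python
import random


def selection_mask(dim, num_shared, round_idx, client_idx):
    """Indices a client shares this round (assumed circular-shift schedule)."""
    start = ((round_idx + client_idx) * num_shared) % dim
    return [(start + j) % dim for j in range(num_shared)]


def client_message(model, indices, byzantine=False, noise_std=0.0, rng=random):
    """Return {index: value} for the shared entries only.

    A Byzantine client (assumed attack model) adds Gaussian noise to
    each entry before sharing it.
    """
    message = {}
    for i in indices:
        value = model[i]
        if byzantine:
            value += rng.gauss(0.0, noise_std)
        message[i] = value
    return message


def server_aggregate(global_model, messages):
    """Average client contributions coordinate by coordinate.

    For entries a client did not share, the server's current value is
    used in that client's place (an assumed aggregation rule).
    """
    if not messages:
        return list(global_model)
    updated = []
    for i, current in enumerate(global_model):
        contributions = [m.get(i, current) for m in messages]
        updated.append(sum(contributions) / len(contributions))
    return updated
```

Under these assumptions, a poisoned client can only perturb the few coordinates it shares in a given round, and each perturbed value is averaged with the other clients' contributions, which gives some intuition for why sharing fewer entries may limit the damage of an attack.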


Looking for the Human in HRI Teaching: User-Centered Course Design for Tech-Savvy Students

arXiv.org Artificial Intelligence

Top-down, user-centered thinking is not typically a strength of all students, especially tech-savvy computer science-related ones. We propose Human-Robot Interaction (HRI) introductory courses as a highly suitable opportunity to foster these important skills since the HRI discipline includes a focus on humans as users. Our HRI course therefore contains elements like scenario-based design of laboratory projects, discussing and merging ideas and other self-empowerment techniques. Participants describe, implement and present everyday scenarios using Pepper robots and our customized open-source visual programming tool. We observe that students obtain a good grasp of the taught topics and improve their user-centered thinking skills.


Understanding Why Label Smoothing Degrades Selective Classification and How to Fix It

arXiv.org Artificial Intelligence

Label smoothing (LS) is a popular regularisation method for training deep neural network classifiers due to its effectiveness in improving test accuracy and its simplicity in implementation. "Hard" one-hot labels are "smoothed" by uniformly distributing probability mass to other classes, reducing overfitting. In this work, we reveal that LS negatively affects selective classification (SC) - where the aim is to reject misclassifications using a model's predictive uncertainty. We first demonstrate empirically across a range of tasks and architectures that LS leads to a consistent degradation in SC. We then explain this by analysing logit-level gradients, showing that LS exacerbates overconfidence and underconfidence by regularising the max logit more when the probability of error is low, and less when the probability of error is high. This elucidates previously reported experimental results where strong classifiers underperform in SC. We then demonstrate the empirical effectiveness of logit normalisation for recovering lost SC performance caused by LS. Furthermore, based on our gradient analysis, we explain why such normalisation is effective. We will release our code shortly.
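As a rough illustration of the two ingredients this abstract combines, the sketch below builds a smoothed target and applies a simple reject rule. It assumes the common formulation in which the target is `(1 - alpha) * one_hot + alpha / num_classes`, and it uses the maximum softmax probability with a fixed threshold as the confidence score for selective classification. The function names, `alpha`, and `threshold` are hypothetical, and the sketch does not implement the logit normalisation studied in the paper.

```python
import math


def smooth_labels(true_class, num_classes, alpha=0.1):
    """Smoothed target: (1 - alpha) * one_hot + alpha / num_classes."""
    target = [alpha / num_classes] * num_classes
    target[true_class] += 1.0 - alpha
    return target


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, logits):
    """Cross-entropy between a (possibly smoothed) target and softmax(logits)."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target, probs))


def selective_predict(logits, threshold):
    """Return the predicted class, or None to reject a low-confidence input."""
    probs = softmax(logits)
    confidence = max(probs)
    if confidence < threshold:
        return None
    return probs.index(confidence)
```

With a smoothed target, the loss keeps penalising very large logits even when the prediction is correct, which is the regularising effect the abstract refers to; the reject rule shows how a confidence score is then used to abstain.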


Language Evolution with Deep Learning

arXiv.org Artificial Intelligence

Social animals have been found to use some means of communication to coordinate in various contexts: foraging for food, avoiding predators, mating, etc. (Hauser, 1996). Among animals, however, humans seem to be unique in having developed a communication system, natural language, that transcends these basic needs and can represent an infinite variety of new situations (Hauser et al., 2002) to the extent that language itself becomes the basis for a new form of evolution: cultural evolution. Understanding the emergence of this unique human ability has always been a vexing scientific problem due to the lack of access to the communication systems of intermediate steps of hominid evolution (Harnad et al., 1976; Bickerton, 2007). In the absence of data, a tempting idea has been to reproduce experimentally the process of language emergence in either humans or computational models (Steels, 1997; Myers-Scotton, 2002; Kirby, 2002). Experimental paradigms with humans (Kirby et al., 2008; Raviv et al., 2019; Motamedi et al., 2019) have produced significant insights into language evolution. Still, their scope is limited due to the inability to replicate key aspects of language evolution, such as communication within and across large populations and the study of long evolutionary timescales. Computer modeling can help overcome these limitations and has played a prominent role in studying language evolution for a long time (Lieberman and Crelin, 1971).


Expandable Subspace Ensemble for Pre-Trained Model-Based Class-Incremental Learning

arXiv.org Artificial Intelligence

Class-Incremental Learning (CIL) requires a learning system to continually learn new classes without forgetting. Despite the strong performance of Pre-Trained Models (PTMs) in CIL, a critical issue persists: learning new classes often results in the overwriting of old ones. Excessive modification of the network causes forgetting, while minimal adjustments lead to an inadequate fit for new classes. As a result, it is desired to figure out a way of efficient model updating without harming former knowledge. In this paper, we propose ExpAndable Subspace Ensemble (EASE) for PTM-based CIL. To enable model updating without conflict, we train a distinct lightweight adapter module for each new task, aiming to create task-specific subspaces. These adapters span a high-dimensional feature space, enabling joint decision-making across multiple subspaces. As data evolves, the expanding subspaces render the old class classifiers incompatible with new-stage spaces. Correspondingly, we design a semantic-guided prototype complement strategy that synthesizes old classes' new features without using any old class instance. Extensive experiments on seven benchmark datasets verify EASE's state-of-the-art performance. Code is available at: https://github.com/sun-hailong/CVPR24-Ease


Using Generative Text Models to Create Qualitative Codebooks for Student Evaluations of Teaching

arXiv.org Artificial Intelligence

Feedback is a critical aspect of improvement. Unfortunately, when there is a lot of feedback from multiple sources, it can be difficult to distill the information into actionable insights. Consider student evaluations of teaching (SETs), which are important sources of feedback for educators. They can give instructors insights into what worked during a semester. A collection of SETs can also be useful to administrators as signals for courses or entire programs. However, on a large scale as in high-enrollment courses or administrative records over several years, the volume of SETs can render them difficult to analyze. In this paper, we discuss a novel method for analyzing SETs using natural language processing (NLP) and large language models (LLMs). We demonstrate the method by applying it to a corpus of 5,000 SETs from a large public university. We show that the method can be used to extract, embed, cluster, and summarize the SETs to identify the themes they express. More generally, this work illustrates how to use the combination of NLP techniques and LLMs to generate a codebook for SETs. We conclude by discussing the implications of this method for analyzing SETs and other types of student writing in teaching and research settings.
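The embed, cluster, and summarize steps mentioned in this abstract can be sketched in a toy form. This is not the authors' pipeline: a bag-of-words count vector stands in for a learned text embedding, a greedy cosine-similarity rule stands in for the clustering step, and the most frequent terms stand in for an LLM-written theme summary. All function names, the `threshold` value, and the stopword list are hypothetical.

```python
import math
import re
from collections import Counter

STOPWORDS = frozenset({"the", "a", "and", "was", "were", "too"})


def embed(text):
    """Bag-of-words counts (a stand-in for a learned text embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(comments, threshold=0.3):
    """Greedy clustering: join the first cluster whose seed is similar enough."""
    clusters = []  # each cluster is a list of (comment, vector); item 0 is the seed
    for comment in comments:
        vector = embed(comment)
        for members in clusters:
            if cosine(vector, members[0][1]) >= threshold:
                members.append((comment, vector))
                break
        else:
            clusters.append([(comment, vector)])
    return [[text for text, _ in members] for members in clusters]


def top_terms(comments, k=3):
    """Most frequent non-stopword terms, as a crude theme label for a cluster."""
    counts = Counter()
    for comment in comments:
        counts.update({w: c for w, c in embed(comment).items() if w not in STOPWORDS})
    return [word for word, _ in counts.most_common(k)]
```

In a realistic setting each stand-in would be replaced by a stronger component (sentence embeddings, a standard clustering algorithm, and an LLM prompt that names each cluster's theme), but the data flow from comments to labelled themes keeps the same shape.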


HRI in Indian Education: Challenges & Opportunities

arXiv.org Artificial Intelligence

With recent advancements in robotics and the increased focus on making general-purpose robots widely available to the public, it has become increasingly necessary to pursue research into human-robot interaction (HRI). While many works have discussed frameworks for teaching HRI in educational institutions, and a few institutions already offer courses to students, a consensus on the course content still eludes the field. In this work, we highlight a few challenges and opportunities in designing an HRI course from an Indian perspective. These topics warrant further deliberation, as they have a direct impact on the design of HRI courses and wider implications for the entire field.


Embracing the Generative AI Revolution: Advancing Tertiary Education in Cybersecurity with GPT

arXiv.org Artificial Intelligence

The rapid advancement of generative Artificial Intelligence (AI) technologies, particularly Generative Pre-trained Transformer (GPT) models such as ChatGPT, has the potential to significantly impact cybersecurity. In this study, we investigated the impact of GPTs, specifically ChatGPT, on tertiary education in cybersecurity, and provided recommendations for universities to adapt their curricula to meet the evolving needs of the industry. Our research highlighted the importance of understanding the alignment between GPT's "mental model" and human cognition, as well as the enhancement of GPT capabilities to human skills based on Bloom's taxonomy. By analyzing current educational practices and the alignment of curricula with industry requirements, we concluded that universities providing practical degrees like cybersecurity should align closely with industry demand and embrace the inevitable generative AI revolution, while applying stringent ethics oversight to safeguard responsible GPT usage. We proposed a set of recommendations focused on updating university curricula, promoting agility within universities, fostering collaboration between academia, industry, and policymakers, and evaluating and assessing educational outcomes.