Instructional Material
The Robots are Here: Navigating the Generative AI Revolution in Computing Education
Prather, James, Denny, Paul, Leinonen, Juho, Becker, Brett A., Albluwi, Ibrahim, Craig, Michelle, Keuning, Hieke, Kiesler, Natalie, Kohn, Tobias, Luxton-Reilly, Andrew, MacNeil, Stephen, Peterson, Andrew, Pettit, Raymond, Reeves, Brent N., Savelka, Jaromir
Recent advancements in artificial intelligence (AI) are fundamentally reshaping computing, with large language models (LLMs) now able to effectively generate and interpret source code and natural language instructions. These emergent capabilities have sparked urgent questions in the computing education community around how educators should adapt their pedagogy to address the challenges and to leverage the opportunities presented by this new technology. In this working group report, we undertake a comprehensive exploration of LLMs in the context of computing education and make five significant contributions. First, we provide a detailed review of the literature on LLMs in computing education and synthesise findings from 71 primary articles. Second, we report the findings of a survey of computing students and instructors from across 20 countries, capturing prevailing attitudes towards LLMs and their use in computing education contexts. Third, to understand how pedagogy is already changing, we offer insights collected from in-depth interviews with 22 computing educators from five continents who have already adapted their curricula and assessments. Fourth, we use the ACM Code of Ethics to frame a discussion of ethical issues raised by the use of large language models in computing education, and we provide concrete advice for policy makers, educators, and students. Finally, we benchmark the performance of LLMs on various computing education datasets, and highlight the extent to which the capabilities of current models are rapidly improving. Our aim is that this report will serve as a focal point for both researchers and practitioners who are exploring, adapting, using, and evaluating LLMs and LLM-based tools in computing classrooms.
OpenBA: An Open-sourced 15B Bilingual Asymmetric seq2seq Model Pre-trained from Scratch
Li, Juntao, Tang, Zecheng, Ding, Yuyang, Wang, Pinzheng, Guo, Pei, You, Wangjie, Qiao, Dan, Chen, Wenliang, Fu, Guohong, Zhu, Qiaoming, Zhou, Guodong, Zhang, Min
Large language models (LLMs) with billions of parameters have demonstrated outstanding performance on various natural language processing tasks. This report presents OpenBA, an open-sourced 15B bilingual asymmetric seq2seq model, to contribute an LLM variant to the Chinese-oriented open-source model community. We enhance OpenBA with effective and efficient techniques and adopt a three-stage training strategy to train the model from scratch. Our solution achieves very competitive performance with only 380B tokens, outperforming LLaMA-70B on the BELEBELE benchmark, BLOOM-176B on the MMLU benchmark, and GLM-130B on the C-Eval (hard) benchmark. This report provides the main details needed to pre-train an analogous model, including pre-training data processing, Bilingual Flan data collection, the empirical observations that inspire our model architecture design, training objectives of different stages, and other enhancement techniques. We also provide the fine-tuning details of OpenBA on four downstream tasks. We have refactored our code to follow the design principles of the Huggingface Transformers Library, making it more convenient for developers to use, and released checkpoints of different training stages at https://huggingface.co/openBA.
Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models
Huang, Yupan, Meng, Zaiqiao, Liu, Fangyu, Su, Yixuan, Collier, Nigel, Lu, Yutong
Large language models exhibit enhanced zero-shot performance on various tasks when fine-tuned with instruction-following data. Multimodal instruction-following models extend these capabilities by integrating both text and images. However, existing models such as MiniGPT-4 face challenges in maintaining dialogue coherence in scenarios involving multiple images. A primary reason is the lack of a specialized dataset for this critical application. To bridge these gaps, we present SparklesChat, a multimodal instruction-following model for open-ended dialogues across multiple images. To support the training, we introduce SparklesDialogue, the first machine-generated dialogue dataset tailored for word-level interleaved multi-image and text interactions. Furthermore, we construct SparklesEval, a GPT-assisted benchmark for quantitatively assessing a model's conversational competence across multiple images and dialogue turns. Our experiments validate the effectiveness of SparklesChat in understanding and reasoning across multiple images and dialogue turns. Specifically, SparklesChat outperformed MiniGPT-4 on established vision-and-language benchmarks, including the BISON binary image selection task and the NLVR2 visual reasoning task. Moreover, SparklesChat scored 8.56 out of 10 on SparklesEval, substantially exceeding MiniGPT-4's score of 3.91 and nearing GPT-4's score of 9.26. Qualitative evaluations further demonstrate SparklesChat's generality in handling real-world applications. All resources are available at https://github.com/HYPJUDY/Sparkles.
Efficient Planning with Latent Diffusion
Temporal abstraction and efficient planning pose significant challenges in offline reinforcement learning, particularly in domains that involve temporally extended tasks and delayed sparse rewards. Existing methods typically plan in the raw action space and can be inefficient and inflexible. Latent action spaces offer a more flexible paradigm, capturing only possible actions within the behavior policy support and decoupling the temporal structure between planning and modeling. However, current latent-action-based methods are limited to discrete spaces and require expensive planning. This paper presents a unified framework for continuous latent action space representation learning and planning by leveraging latent, score-based diffusion models. We establish the theoretical equivalence between planning in the latent action space and energy-guided sampling with a pretrained diffusion model and incorporate a novel sequence-level exact sampling method. Our proposed method, LatentDiffuser, demonstrates competitive performance on low-dimensional locomotion control tasks and surpasses existing methods in higher-dimensional tasks.
"With Great Power Comes Great Responsibility!": Student and Instructor Perspectives on the influence of LLMs on Undergraduate Engineering Education
Joshi, Ishika, Budhiraja, Ritvik, Tanna, Pranav Deepak, Jain, Lovenya, Deshpande, Mihika, Srivastava, Arjun, Rallapalli, Srinivas, Akolekar, Harshal D, Challa, Jagat Sesh, Kumar, Dhruv
The rise in popularity of Large Language Models (LLMs) has prompted discussions in academic circles, with students exploring LLM-based tools for coursework inquiries and instructors exploring them for teaching and research. Although much work is underway to create LLM-based tools tailored for students and instructors, there is a lack of comprehensive user studies that capture the perspectives of students and instructors regarding LLMs. This paper addresses this gap by conducting surveys and interviews within undergraduate engineering universities in India. Drawing on 1306 student survey responses, 112 student interviews, and 27 instructor interviews around the academic usage of ChatGPT (a popular LLM), this paper offers insights into the current usage patterns, perceived benefits, threats, and challenges, as well as recommendations for enhancing the adoption of LLMs among students and instructors. These insights are further utilized to discuss the practical implications of LLMs in undergraduate engineering education and beyond.
STAR: Improving Low-Resource Information Extraction by Structure-to-Text Data Generation with Large Language Models
Ma, Mingyu Derek, Wang, Xiaoxuan, Kung, Po-Nien, Brantingham, P. Jeffrey, Peng, Nanyun, Wang, Wei
Information extraction tasks such as event extraction require an in-depth understanding of the output structure and sub-task dependencies. They rely heavily on task-specific training data in the form of (passage, target structure) pairs to obtain reasonable performance. However, obtaining such data through human annotation is costly, leading to a pressing need for low-resource information extraction approaches that require minimal human labeling for real-world applications. Fine-tuning supervised models with synthesized training data would be a generalizable method, but the existing data generation methods either still rely on large-scale ground-truth data or cannot be applied to complicated IE tasks due to their poor performance. To address these challenges, we propose STAR, a data generation method that leverages Large Language Models (LLMs) to synthesize data instances given limited seed demonstrations, thereby boosting low-resource information extraction performance. Our approach involves generating target structures (Y) followed by generating passages (X), all accomplished with the aid of LLMs. We design fine-grained step-by-step instructions to obtain the initial data instances. We further reduce errors and improve data quality through self-reflection error identification and self-refinement with iterative revision. Our experiments show that the data generated by STAR significantly improves the performance of low-resource event extraction and relation extraction tasks, even surpassing the effectiveness of human-curated data. Human assessment of the data quality shows that STAR-generated data exhibits higher passage quality and aligns better with the task definitions than the human-curated data.
1st European Summer School on Artificial Intelligence (ESSAI) & 20th Advanced Course on Artificial Intelligence (ACAI), Ljubljana 2023
The European Summer School on Artificial Intelligence (ESSAI) is a direct product of European AI research being increasingly coordinated and scaled up across projects, research organisations and countries. ESSAI's immediate predecessors are the Advanced Course on AI (ACAI), organised since 1985 under the auspices of the European Association for Artificial Intelligence (EurAI), and the TAILOR Summer School on Trustworthy AI organised since 2021 by the European ICT-48 Network of Excellence on Trustworthy AI through Integrating Learning, Optimisation and Reasoning. Last year, these two schools were already co-located in Barcelona with two parallel tracks as well as joint events.
How to Model Brushless Electric Motors for the Design of Lightweight Robotic Systems
Lee, Ung Hee, Shepherd, Tor, Kim, Sangbae, De, Avik, Su, Hao, Gregg, Robert, Mooney, Luke, Rouse, Elliott
A key step in the development of lightweight, high-performance robotic systems is the modeling and selection of permanent magnet brushless direct current (BLDC) electric motors. Typical modeling analyses are completed a priori, and provide insight for properly sizing a motor for an application, specifying the required operating voltage and current, as well as assessing the thermal response and other design attributes (e.g., transmission ratio). However, to perform these modeling analyses, proper information about the motor's characteristics is needed, which is often obtained from manufacturer datasheets. Through our own experience and communications with manufacturers, we have noticed a lack of clarity and standardization in modeling BLDC motors, compounded by vague or inconsistent terminology used in motor datasheets. The purpose of this tutorial is to concisely describe the governing equations for BLDC motor analyses used in the design process, as well as highlight potential errors that can arise from incorrect usage. We present a power-invariant conversion from phase and line-to-line reference frames to a familiar q-axis DC motor representation, which provides a "brushed" analogue of a three-phase BLDC motor that is convenient for analysis and design. We highlight potential errors including incorrect calculations of winding resistive heat loss, improper estimation of motor torque via the motor's torque constant, and incorrect estimation of the required bus voltage or resulting angular velocity limitations. A unified and condensed set of governing equations is available for designers in the Appendix. The intent of this work is to provide a consolidated mathematical foundation for modeling BLDC motors that addresses existing confusion and fosters high-performance designs of future robotic systems.
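To make the resistive heat loss pitfall mentioned above concrete, the following is a minimal sketch in Python. It does not reproduce the tutorial's power-invariant q-axis equations, which are not given in the abstract. It only applies two textbook relations under stated assumptions: the motor is wye-wound (so the per-phase resistance is half the line-to-line resistance) and the three phase currents are balanced sinusoids. The function names and the example values are hypothetical.

```python
import math


def phase_resistance_wye(r_line_to_line):
    """Per-phase resistance (ohm) of a wye-wound motor.

    Assumption: wye winding, so a line-to-line measurement sees two
    phase windings in series.
    """
    return r_line_to_line / 2.0


def copper_loss_sinusoidal(r_phase, i_phase_peak):
    """Total resistive loss (W) summed over all three phases.

    Assumption: balanced sinusoidal phase currents with the given
    peak amplitude, so each phase carries i_phase_peak / sqrt(2) rms.
    """
    i_phase_rms = i_phase_peak / math.sqrt(2.0)
    return 3.0 * i_phase_rms ** 2 * r_phase


# Hypothetical datasheet values, for illustration only.
r_phase = phase_resistance_wye(0.2)             # line-to-line 0.2 ohm
loss_w = copper_loss_sinusoidal(r_phase, 10.0)  # 10 A peak phase current
print(f"phase resistance: {r_phase:.3f} ohm, copper loss: {loss_w:.2f} W")
```

The point of the sketch is that the result depends on which resistance (phase or line-to-line) and which current (peak or rms) a datasheet reports, so mixing conventions changes the estimated heat loss by a constant factor.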
GAIA-1: A Generative World Model for Autonomous Driving
Hu, Anthony, Russell, Lloyd, Yeo, Hudson, Murez, Zak, Fedoseev, George, Kendall, Alex, Shotton, Jamie, Corrado, Gianluca
Autonomous driving promises transformative improvements to transportation, but building systems capable of safely navigating the unstructured complexity of real-world scenarios remains challenging. A critical problem lies in effectively predicting the various potential outcomes that may emerge in response to the vehicle's actions as the world evolves. To address this challenge, we introduce GAIA-1 ('Generative AI for Autonomy'), a generative world model that leverages video, text, and action inputs to generate realistic driving scenarios while offering fine-grained control over ego-vehicle behavior and scene features. Our approach casts world modeling as an unsupervised sequence modeling problem by mapping the inputs to discrete tokens, and predicting the next token in the sequence. Emerging properties from our model include learning high-level structures and scene dynamics, contextual awareness, generalization, and understanding of geometry. The power of GAIA-1's learned representation that captures expectations of future events, combined with its ability to generate realistic samples, provides new possibilities for innovation in the field of autonomy, enabling enhanced and accelerated training of autonomous driving technology.
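The idea of casting modeling as next-token prediction over discrete tokens can be illustrated with a toy sketch. This is not GAIA-1's architecture: the abstract does not describe its tokenizers or sequence model, so the uniform binning and the bigram count model below are stand-ins chosen for brevity, and all names and values are hypothetical.

```python
from collections import Counter, defaultdict


def tokenize(values, low, high, num_bins):
    """Map each continuous value to a discrete token id by uniform binning."""
    width = (high - low) / num_bins
    tokens = []
    for v in values:
        idx = int((v - low) / width)
        tokens.append(min(max(idx, 0), num_bins - 1))  # clamp to valid ids
    return tokens


def fit_bigram(tokens):
    """Count how often each token is followed by each other token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, token):
    """Return the most frequent successor of `token`, or None if unseen."""
    if token not in counts:
        return None
    return counts[token].most_common(1)[0][0]


# Hypothetical normalized signal (for example, a speed trace in [0, 1]).
signal = [0.1, 0.3, 0.6, 0.9, 0.1, 0.3, 0.6, 0.9]
tokens = tokenize(signal, low=0.0, high=1.0, num_bins=4)
model = fit_bigram(tokens)
print(tokens, predict_next(model, 2))
```

A learned sequence model replaces the bigram table in practice, but the training signal has the same shape: given the tokens so far, predict the token that comes next.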
Towards Robust Offline-to-Online Reinforcement Learning via Uncertainty and Smoothness
Wen, Xiaoyu, Yu, Xudong, Yang, Rui, Bai, Chenjia, Wang, Zhen
To obtain a near-optimal policy with fewer interactions in Reinforcement Learning (RL), a promising approach involves the combination of offline RL, which enhances sample efficiency by leveraging offline datasets, and online RL, which explores informative transitions by interacting with the environment. Offline-to-Online (O2O) RL provides a paradigm for improving an offline-trained agent within limited online interactions. However, due to the significant distribution shift between online experiences and offline data, most offline RL algorithms suffer from performance drops and fail to achieve stable policy improvement in O2O adaptation. To address this problem, we propose the Robust Offline-to-Online (RO2O) algorithm, designed to enhance offline policies through uncertainty and smoothness, and to mitigate the performance drop in online adaptation. Specifically, RO2O incorporates Q-ensemble for uncertainty penalty and adversarial samples for policy and value smoothness, which enable RO2O to maintain a consistent learning procedure in online adaptation without requiring special changes to the learning objective. Theoretical analyses in linear MDPs demonstrate that the uncertainty and smoothness lead to a tighter optimality bound in O2O against distribution shift. Experimental results illustrate the superiority of RO2O in facilitating stable offline-to-online learning and achieving significant improvement with limited online interactions.
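A Q-ensemble uncertainty penalty can be sketched as follows. The abstract does not state the exact form RO2O uses, so this shows one common choice as an assumption: take the ensemble mean and subtract a multiple of the ensemble standard deviation, so that state-action pairs on which the ensemble members disagree receive a more pessimistic value. The function name and the weight `beta` are hypothetical.

```python
from statistics import mean, pstdev


def penalized_q(q_values, beta=1.0):
    """Pessimistic value estimate: ensemble mean minus beta times ensemble std.

    q_values: Q estimates from each ensemble member for one (state, action).
    beta: hypothetical penalty weight; larger means more pessimism.
    """
    if not q_values:
        raise ValueError("q_values must contain at least one ensemble estimate")
    return mean(q_values) - beta * pstdev(q_values)


# Members agree (as expected on well-covered data): no penalty is applied.
agree = penalized_q([10.0, 10.0, 10.0, 10.0])
# Members disagree (as expected far from the data): the estimate is lowered.
disagree = penalized_q([4.0, 16.0, 4.0, 16.0])
print(agree, disagree)
```

Both example ensembles have the same mean, so the difference between the two outputs comes entirely from the disagreement term.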