Goto

Collaborating Authors

 Instructional Material


Performance of Large Language Models in a Computer Science Degree Program

arXiv.org Artificial Intelligence

Large language models such as ChatGPT-3.5 and GPT-4.0 are ubiquitous and dominate the current discourse. Their transformative capabilities have led to a paradigm shift in how we interact with and utilize (text-based) information. Each day, new possibilities to leverage the capabilities of these models emerge. This paper presents findings on the performance of different large language models in a university of applied sciences' undergraduate computer science degree program. Our primary objective is to assess the effectiveness of these models within the curriculum by employing them as educational aids. By prompting the models with lecture material, exercise tasks, and past exams, we aim to evaluate their proficiency across different computer science domains. We showcase the strong performance of current large language models while highlighting limitations and constraints within the context of such a degree program. We found that ChatGPT-3.5 averaged 79.9% of the total score in 10 tested modules, BingAI achieved 68.4%, and LLaMa, in the 65 billion parameter variant, 20%. Despite these convincing results, even GPT-4.0 would not pass the degree program - due to limitations in mathematical calculations.


Advancing Robot Autonomy for Long-Horizon Tasks

arXiv.org Artificial Intelligence

Autonomous robots have real-world applications in diverse fields, such as mobile manipulation and environmental exploration, and many such tasks benefit from a hands-off approach in terms of human user involvement over a long task horizon. However, the level of autonomy achievable by a deployment is limited in part by the problem definition or task specification required by the system. Task specifications often require technical, low-level information that is unintuitive to describe and may result in generic solutions, burdening the user technically both before and after task completion. In this thesis, we aim to advance task specification abstraction toward the goal of increasing robot autonomy in real-world scenarios. We do so by tackling problems that address several different angles of this goal. First, we develop a way for the automatic discovery of optimal transition points between subtasks in the context of constrained mobile manipulation, removing the need for the human to hand-specify these in the task specification. We further propose a way to automatically describe constraints on robot motion by using demonstrated data as opposed to manually-defined constraints. Then, within the context of environmental exploration, we propose a flexible task specification framework, requiring just a set of quantiles of interest from the user that allows the robot to directly suggest locations in the environment for the user to study. We next systematically study the effect of including a robot team in the task specification and show that multirobot teams have the ability to improve performance under certain specification conditions, including enabling inter-robot communication. Finally, we propose methods for a communication protocol that autonomously selects useful but limited information to share with the other robots.


Compact & Capable: Harnessing Graph Neural Networks and Edge Convolution for Medical Image Classification

arXiv.org Artificial Intelligence

Graph-based neural network models are gaining traction in the field of representation learning due to their ability to uncover latent topological relationships between entities that are otherwise challenging to identify. These models have been employed across a diverse range of domains, encompassing drug discovery, protein interactions, semantic segmentation, and fluid dynamics research. In this study, we investigate the potential of Graph Neural Networks (GNNs) for medical image classification. We introduce a novel model that combines GNNs and edge convolution, leveraging the interconnectedness of RGB channel feature values to strongly represent connections between crucial graph nodes. Our proposed model not only performs on par with state-of-the-art Deep Neural Networks (DNNs) but does so with 1000 times fewer parameters, resulting in reduced training time and data requirements. We compare our Graph Convolutional Neural Network (GCNN) to pre-trained DNNs for classifying MedMNIST dataset classes, revealing promising prospects for GNNs in medical image analysis. Our results also encourage further exploration of advanced graph-based models such as Graph Attention Networks (GAT) and Graph Auto-Encoders in the medical imaging domain. The proposed model yields more reliable, interpretable, and accurate outcomes for tasks like semantic segmentation and image classification compared to simpler GCNNs


Online Continual Learning in Keyword Spotting for Low-Resource Devices via Pooling High-Order Temporal Statistics

arXiv.org Artificial Intelligence

Keyword Spotting (KWS) models on embedded devices should adapt fast to new user-defined words without forgetting previous ones. Embedded devices have limited storage and computational resources, thus, they cannot save samples or update large models. We consider the setup of embedded online continual learning (EOCL), where KWS models with frozen backbone are trained to incrementally recognize new words from a non-repeated stream of samples, seen one at a time. To this end, we propose Temporal Aware Pooling (TAP) which constructs an enriched feature space computing high-order moments of speech features extracted by a pre-trained backbone. Our method, TAP-SLDA, updates a Gaussian model for each class on the enriched feature space to effectively use audio representations. In experimental analyses, TAP-SLDA outperforms competitors on several setups, backbones, and baselines, bringing a relative average gain of 11.3% on the GSC dataset.


EASpace: Enhanced Action Space for Policy Transfer

arXiv.org Artificial Intelligence

Formulating expert policies as macro actions promises to alleviate the long-horizon issue via structured exploration and efficient credit assignment. However, traditional option-based multi-policy transfer methods suffer from inefficient exploration of macro action's length and insufficient exploitation of useful long-duration macro actions. In this paper, a novel algorithm named EASpace (Enhanced Action Space) is proposed, which formulates macro actions in an alternative form to accelerate the learning process using multiple available sub-optimal expert policies. Specifically, EASpace formulates each expert policy into multiple macro actions with different execution {times}. All the macro actions are then integrated into the primitive action space directly. An intrinsic reward, which is proportional to the execution time of macro actions, is introduced to encourage the exploitation of useful macro actions. The corresponding learning rule that is similar to Intra-option Q-learning is employed to improve the data efficiency. Theoretical analysis is presented to show the convergence of the proposed learning rule. The efficiency of EASpace is illustrated by a grid-based game and a multi-agent pursuit problem. The proposed algorithm is also implemented in physical systems to validate its effectiveness.


A Hybrid Evolutionary Approach to Solve University Course Allocation Problem

arXiv.org Artificial Intelligence

This paper discusses various types of constraints, difficulties and solutions to overcome the challenges regarding university course allocation problem. A hybrid evolutionary algorithm has been defined combining Local Repair Algorithm and Modified Genetic Algorithm to generate the best course assignment. After analyzing the collected dataset, all the necessary constraints were formulated. These constraints manage to cover the aspects needed to be kept in mind while preparing clash free and efficient class schedules for every faculty member. The goal is to generate an optimized solution which will fulfill those constraints while maintaining time efficiency and also reduce the workload of handling this task manually. The proposed algorithm was compared with some base level optimization algorithms to show the better efficiency in terms of accuracy and time.


An Analysis of Programming Course Evaluations Before and After the Introduction of an Autograder

arXiv.org Artificial Intelligence

Commonly, introductory programming courses in higher education institutions have hundreds of participating students eager to learn to program. The manual effort for reviewing the submitted source code and for providing feedback can no longer be managed. Manually reviewing the submitted homework can be subjective and unfair, particularly if many tutors are responsible for grading. Different autograders can help in this situation; however, there is a lack of knowledge about how autograders can impact students' overall perception of programming classes and teaching. This is relevant for course organizers and institutions to keep their programming courses attractive while coping with increasing students. This paper studies the answers to the standardized university evaluation questionnaires of multiple large-scale foundational computer science courses which recently introduced autograding. The differences before and after this intervention are analyzed. By incorporating additional observations, we hypothesize how the autograder might have contributed to the significant changes in the data, such as, improved interactions between tutors and students, improved overall course quality, improved learning success, increased time spent, and reduced difficulty. This qualitative study aims to provide hypotheses for future research to define and conduct quantitative surveys and data analysis. The autograder technology can be validated as a teaching method to improve student satisfaction with programming courses.


Moderately Supervised Learning: Definition, Framework and Generality

arXiv.org Artificial Intelligence

Learning with supervision has achieved remarkable success in numerous artificial intelligence (AI) applications. In the current literature, by referring to the properties of the labels prepared for the training dataset, learning with supervision is categorized as supervised learning (SL) and weakly supervised learning (WSL). SL concerns the situation where the training data set is assigned with ideal (complete, exact and accurate) labels, while WSL concerns the situation where the training data set is assigned with non-ideal (incomplete, inexact or inaccurate) labels. However, various solutions for SL tasks have shown that the given labels are not always easy to learn, and the transformation from the given labels to easy-to-learn targets can significantly affect the performance of the final SL solutions. Without considering the properties of the transformation from the given labels to easy-to-learn targets, the definition of SL conceals some details that can be critical to building the appropriate solutions for specific SL tasks. Thus, for engineers in the AI application field, it is desirable to reveal these details systematically. This article attempts to achieve this goal by expanding the categorization of SL and investigating the sub-type moderately supervised learning (MSL) that concerns the situation where the given labels are ideal, but due to the simplicity in annotation, careful designs are required to transform the given labels into easy-to-learn targets. From the perspectives of the definition, framework and generality, we conceptualize MSL to present a complete fundamental basis to systematically analyse MSL tasks. At meantime, revealing the relation between the conceptualization of MSL and the mathematicians' vision, this paper as well establishes a tutorial for AI application engineers to refer to viewing a problem to be solved from the mathematicians' vision.


Practical and Ethical Challenges of Large Language Models in Education: A Systematic Scoping Review

arXiv.org Artificial Intelligence

Advancements in generative artificial intelligence (AI) and large language models (LLMs) have fueled the development of many educational technology innovations that aim to automate the often time-consuming and laborious tasks of generating and analysing textual content (e.g., generating open-ended questions and analysing student feedback survey) (Kasneci et al., 2023; Wollny et al., 2021; Leiker et al., 2023). LLMs are generative artificial intelligence models that have been trained on an extensive amount of text data, capable of generating human-like text content based on natural language inputs. Specifically, these LLMs, such as Bidirectional Encoder Representations from Transformers (BERT) (Devlin et al., 2018) and Generative Pre-trained Transformer (GPT) (Brown et al., 2020), utilise deep learning and self-attention mechanisms (Vaswani et al., 2017) to selectively attend to the different parts of input texts, depending on the focus of the current tasks, allowing the model to learn complex patterns and relationships among textual contents, such as their semantic, contextual, and syntactic relationships (Min et al., 2021; Liu et al., 2023). As several LLMs (e.g., GPT-3 and Codex) have been pre-trained on massive amounts of data across multiple disciplines, they are capable of completing natural language processing tasks with little (few-shot learning) or no additional training (zero-shot learning) (Brown et al., 2020; Wu et al., 2023). This could lower the technological barriers to LLMs-based innovations as researchers and practitioners can develop new educational technologies by fine-tuning LLMs on specific educational tasks without starting from scratch (Caines et al., 2023; Sridhar et al., 2023). The recent release of ChatGPT, an LLMs-based generative AI chatbot that requires only natural language prompts without additional model training or fine-tuning (OpenAI, 2023), has further lowered the barrier for individuals without technological background to leverage the generative powers of LLMs. Although educational research that leverages LLMs to develop technological innovations for automating educational tasks is yet to achieve its full potential (i.e., most works have focused on improving model performances (Kurdi et al., 2020; Ramesh and Sanampudi, 2022)), a growing body of literature hints at how different stakeholders could potentially benefit from such innovations.


How to Design and Deliver Courses for Higher Education in the AI Era: Insights from Exam Data Analysis

arXiv.org Artificial Intelligence

In this position paper, we advocate for the idea that courses and exams in the AI era have to be designed based on two factors: (1) the strengths and limitations of AI, and (2) the pedagogical educational objectives. Based on insights from the Delors report on education [1], we first address the role of education and recall the main objectives that educational institutes must strive to achieve independently of any technology. We then explore the strengths and limitations of AI, based on current advances in AI. We explain how courses and exams can be designed based on these strengths and limitations of AI, providing different examples in the IT, English, and Art domains. We show how we adopted a pedagogical approach that is inspired from the Socratic teaching method from January 2023 to May 2023. Then, we present the data analysis results of seven ChatGPT-authorized exams conducted between December 2022 and March 2023. Our exam data results show that there is no correlation between students' grades and whether or not they use ChatGPT to answer their exam questions. Finally, we present a new exam system that allows us to apply our pedagogical approach in the AI era.