Instructional Material
Leveraging Lecture Content for Improved Feedback: Explorations with GPT-4 and Retrieval Augmented Generation
Jacobs, Sven, Jaschke, Steffen
This paper presents the use of Retrieval Augmented Generation (RAG) to improve the feedback generated by Large Language Models for programming tasks. For this purpose, corresponding lecture recordings were transcribed and made available to the Large Language Model GPT-4 as external knowledge source together with timestamps as metainformation by using RAG. The purpose of this is to prevent hallucinations and to enforce the use of the technical terms and phrases from the lecture. In an exercise platform developed to solve programming problems for an introductory programming lecture, students can request feedback on their solutions generated by GPT-4. For this task GPT-4 receives the students' code solution, the compiler output, the result of unit tests and the relevant passages from the lecture notes available through the use of RAG as additional context. The feedback generated by GPT-4 should guide students to solve problems independently and link to the lecture content, using the time stamps of the transcript as meta-information. In this way, the corresponding lecture videos can be viewed immediately at the corresponding positions. For the evaluation, students worked with the tool in a workshop and decided for each feedback whether it should be extended by RAG or not. First results based on a questionnaire and the collected usage data show that the use of RAG can improve feedback generation and is preferred by students in some situations. Due to the slower speed of feedback generation, the benefits are situation dependent.
Modern Information Technologies in Scientific Research and Educational Activities
Malakhov, Kyrylo, Kaverinskiy, Vadislav, Ivanova, Liliia, Romanyuk, Oleksandr, Romaniuk, Oksana, Voinova, Svitlana, Kotlyk, Sergii, Sokolova, Oksana
Nowadays, there is a rapid development of information technology, which entails the need to constantly improve and expand the capabilities of interactive artificial intelligence systems This monograph combines several current topics related to the field of information technology One of the key topics is the methodology for enhancing the capabilities of conversational systems, with a focus on ChatGPT, which represents the latest advance in the field of artificial intelligence The monograph also discusses text generation systems based on ontological representations, which open up wide opportunities for creating high-quality content A special place in the work is given to an automated computer system for diagnosing the competitiveness of specialists in the field of information technology This helps to effectively assess the professionalism of specialists and determine the need for advanced training Theoretical aspects of correct color rendering and informatization of educational and research work of graduate students are important in ensuring the quality of education and scientific research And finally, the use of technology for creating 3D models has become an integral part of the modern information environment, which makes it possible to bring the most daring ideas and projects to life Research and development in these areas contribute to the improvement of information technologies, finding application in various fields of activity The purpose of our monograph is to conduct analysis and research in these areas in order to promote the development of information technologies and increase their efficiency The monograph was compiled based on the results of the XVI international scientific and practical conference "Information technologies and automation -- 2023", which took place in October 2023 at Odessa National University of Technology
GAD: A Real-time Gait Anomaly Detection System with Online Adaptive Learning
Lee, Ming-Chang, Lin, Jia-Chun, Katsikas, Sokratis
Gait anomaly detection is a task that involves detecting deviations from a person's normal gait pattern. These deviations can indicate health issues and medical conditions in the healthcare domain, or fraudulent impersonation and unauthorized identity access in the security domain. A number of gait anomaly detection approaches have been introduced, but many of them require offline data preprocessing, offline model learning, setting parameters, and so on, which might restrict their effectiveness and applicability in real-world scenarios. To address these issues, this paper introduces GAD, a real-time gait anomaly detection system. GAD focuses on detecting anomalies within an individual's three-dimensional accelerometer readings based on dimensionality reduction and Long Short-Term Memory (LSTM). Upon being launched, GAD begins collecting a gait segment from the user and training an anomaly detector to learn the user's walking pattern on the fly. If the subsequent model verification is successful, which involves validating the trained detector using the user's subsequent steps, the detector is employed to identify abnormalities in the user's subsequent gait readings at the user's request. The anomaly detector will be retained online to adapt to minor pattern changes and will undergo retraining as long as it cannot provide adequate prediction. We explored two methods for capturing users' gait segments: a personalized method tailored to each individual's step length, and a uniform method utilizing a fixed step length. Experimental results using an open-source gait dataset show that GAD achieves a higher detection accuracy ratio when combined with the personalized method.
Towards An Online Incremental Approach to Predict Students Performance
Analytical models developed in offline settings with pre-prepared data are typically used to predict students' performance. However, when data are available over time, this learning method is not suitable anymore. Online learning is increasingly used to update the online models from stream data. A rehearsal technique is typically used, which entails re-training the model on a small training set that is updated each time new data is received. The main challenge in this regard is the construction of the training set with appropriate data samples to maintain good model performance. Typically, a random selection of samples is made, which can deteriorate the model's performance. In this paper, we propose a memory-based online incremental learning approach for updating an online classifier that predicts student performance using stream data. The approach is based on the use of the genetic algorithm heuristic while respecting the memory space constraints as well as the balance of class labels. In contrast to random selection, our approach improves the stability of the analytical model by promoting diversity when creating the training set. As a proof of concept, we applied it to the open dataset OULAD. Our approach achieves a notable improvement in model accuracy, with an enhancement of nearly 10% compared to the current state-of-the-art, while maintaining a relatively low standard deviation in accuracy, ranging from 1% to 2.1%.
Mathematics of statistical sequential decision-making: concentration, risk-awareness and modelling in stochastic bandits, with applications to bariatric surgery
This thesis aims to study some of the mathematical challenges that arise in the analysis of statistical sequential decision-making algorithms for postoperative patients follow-up. Stochastic bandits (multiarmed, contextual) model the learning of a sequence of actions (policy) by an agent in an uncertain environment in order to maximise observed rewards. To learn optimal policies, bandit algorithms have to balance the exploitation of current knowledge and the exploration of uncertain actions. Such algorithms have largely been studied and deployed in industrial applications with large datasets, low-risk decisions and clear modelling assumptions, such as clickthrough rate maximisation in online advertising. By contrast, digital health recommendations call for a whole new paradigm of small samples, risk-averse agents and complex, nonparametric modelling. To this end, we developed new safe, anytime-valid concentration bounds, (Bregman, empirical Chernoff), introduced a new framework for risk-aware contextual bandits (with elicitable risk measures) and analysed a novel class of nonparametric bandit algorithms under weak assumptions (Dirichlet sampling). In addition to the theoretical guarantees, these results are supported by in-depth empirical evidence. Finally, as a first step towards personalised postoperative follow-up recommendations, we developed with medical doctors and surgeons an interpretable machine learning model to predict the long-term weight trajectories of patients after bariatric surgery.
Semantic Objective Functions: A distribution-aware method for adding logical constraints in deep learning
Mendez-Lucero, Miguel Angel, Gallardo, Enrique Bojorquez, Belle, Vaishak
Issues of safety, explainability, and efficiency are of increasing concern in learning systems deployed with hard and soft constraints. Symbolic Constrained Learning and Knowledge Distillation techniques have shown promising results in this area, by embedding and extracting knowledge, as well as providing logical constraints during neural network training. Although many frameworks exist to date, through an integration of logic and information geometry, we provide a construction and theoretical framework for these tasks that generalize many approaches. We propose a loss-based method that embeds knowledge--enforces logical constraints--into a machine learning model that outputs probability distributions. This is done by constructing a distribution from the external knowledge/logic formula, and constructing a loss function as a linear combination of the original loss function with the Fisher-Rao distance or Kullback-Leibler divergence to the constraint distribution. This construction includes logical constraints in the form of propositional formulas (Boolean variables), formulas of a first-order language with finite variables over a model with compact domain (categorical and continuous variables), and in general,likely applicable to any statistical model that was pretrained with semantic information. We evaluate our method on a variety of learning tasks, including classification tasks with logic constraints, transferring knowledge from logic formulas, and knowledge distillation from general distributions.
Automating the Enterprise with Foundation Models
Wornow, Michael, Narayan, Avanika, Opsahl-Ong, Krista, McIntyre, Quinn, Shah, Nigam H., Re, Christopher
Automating enterprise workflows could unlock $4 trillion/year in productivity gains. Despite being of interest to the data management community for decades, the ultimate vision of end-to-end workflow automation has remained elusive. Current solutions rely on process mining and robotic process automation (RPA), in which a bot is hard-coded to follow a set of predefined rules for completing a workflow. Through case studies of a hospital and large B2B enterprise, we find that the adoption of RPA has been inhibited by high set-up costs (12-18 months), unreliable execution (60% initial accuracy), and burdensome maintenance (requiring multiple FTEs). Multimodal foundation models (FMs) such as GPT-4 offer a promising new approach for end-to-end workflow automation given their generalized reasoning and planning abilities. To study these capabilities we propose ECLAIR, a system to automate enterprise workflows with minimal human supervision. We conduct initial experiments showing that multimodal FMs can address the limitations of traditional RPA with (1) near-human-level understanding of workflows (93% accuracy on a workflow understanding task) and (2) instant set-up with minimal technical barrier (based solely on a natural language description of a workflow, ECLAIR achieves end-to-end completion rates of 40%). We identify human-AI collaboration, validation, and self-improvement as open challenges, and suggest ways they can be solved with data management techniques. Code is available at: https://github.com/HazyResearch/eclair-agents
Aloe: A Family of Fine-tuned Open Healthcare LLMs
Gururajan, Ashwin Kumar, Lopez-Cuena, Enrique, Bayarri-Planas, Jordi, Tormos, Adrian, Hinjos, Daniel, Bernabeu-Perez, Pablo, Arias-Duart, Anna, Martin-Torres, Pablo Agustin, Urcelay-Ganzabal, Lucia, Gonzalez-Mallo, Marta, Alvarez-Napagao, Sergio, Ayguadé-Parra, Eduard, Garcia-Gasulla, Ulises Cortés Dario
As the capabilities of Large Language Models (LLMs) in healthcare and medicine continue to advance, there is a growing need for competitive open-source models that can safeguard public interest. With the increasing availability of highly competitive open base models, the impact of continued pre-training is increasingly uncertain. In this work, we explore the role of instruct tuning, model merging, alignment, red teaming and advanced inference schemes, as means to improve current open models. To that end, we introduce the Aloe family, a set of open medical LLMs highly competitive within its scale range. Aloe models are trained on the current best base models (Mistral, LLaMA 3), using a new custom dataset which combines public data sources improved with synthetic Chain of Thought (CoT). Aloe models undergo an alignment phase, becoming one of the first few policy-aligned open healthcare LLM using Direct Preference Optimization, setting a new standard for ethical performance in healthcare LLMs. Model evaluation expands to include various bias and toxicity datasets, a dedicated red teaming effort, and a much-needed risk assessment for healthcare LLMs. Finally, to explore the limits of current LLMs in inference, we study several advanced prompt engineering strategies to boost performance across benchmarks, yielding state-of-the-art results for open healthcare 7B LLMs, unprecedented at this scale.
Interview with Salena Torres Ashton: causality and natural language
In a series of interviews, we're meeting some of the AAAI/SIGAI Doctoral Consortium participants to find out more about their research. The Doctoral Consortium provides an opportunity for a group of PhD students to discuss and explore their research interests and career objectives in an interdisciplinary workshop together with a panel of established researchers. In this latest interview, we met Salena Torres Ashton and found out about her work focusing on causality and natural language. I am a PhD student at the School of Information at the University of Arizona. Information Science can mean a lot of things, but the easiest way that I like to describe it would be "working with computer science with people in mind".
Layers of technology in pluriversal design. Decolonising language technology with the LiveLanguage initiative
Koch, Gertraud, Bella, Gábor, Helm, Paula, Giunchiglia, Fausto
Language technology has the potential to facilitate intercultural communication through meaningful translations. However, the current state of language technology is deeply entangled with colonial knowledge due to path dependencies and neo-colonial tendencies in the global governance of artificial intelligence (AI). Language technology is a complex and emerging field that presents challenges for co-design interventions due to enfolding in assemblages of global scale and diverse sites and its knowledge intensity. This paper uses LiveLanguage, a lexical database, a set of services with particular emphasis on modelling language diversity and integrating small and minority languages, as an example to discuss and close the gap from pluriversal design theory to practice. By diversifying the concept of emerging technology, we can better approach language technology in global contexts. The paper presents a model comprising of five layers of technological activity. Each layer consists of specific practices and stakeholders, thus provides distinctive spaces for co-design interventions as mode of inquiry for de-linking, re-thinking and re-building language technology towards pluriversality. In that way, the paper contributes to reflecting the position of co-design in decolonising emergent technologies, and to integrating complex theoretical knowledge towards decoloniality into language technology design.