Goto

Collaborating Authors

 Instructional Material


Lecture Notes on High Dimensional Linear Regression

arXiv.org Machine Learning

These lecture notes were developed for a Master's course in advanced machine learning at Erasmus University of Rotterdam. The course is designed for graduate students in mathematics, statistics and econometrics. The content follows a proposition-proof structure, making it suitable for students seeking a formal and rigorous understanding of the statistical theory underlying machine learning methods. At present, the notes focus on linear regression, with an in-depth exploration of the existence, uniqueness, relations, computation, and nonasymptotic properties of the most prominent estimators in this setting: least squares, ridgeless, ridge, and lasso. Background It is assumed that readers have a solid background in calculus, linear algebra, convex analysis, and probability theory.


Navigating AI to Unpack Youth Privacy Concerns: An In-Depth Exploration and Systematic Review

arXiv.org Artificial Intelligence

This systematic literature review investigates perceptions, concerns, and expectations of young digital citizens regarding privacy in artificial intelligence (AI) systems, focusing on social media platforms, educational technology, gaming systems, and recommendation algorithms. Using a rigorous methodology, the review started with 2,000 papers, narrowed down to 552 after initial screening, and finally refined to 108 for detailed analysis. Data extraction focused on privacy concerns, data-sharing practices, the balance between privacy and utility, trust factors in AI, transparency expectations, and strategies to enhance user control over personal data. Findings reveal significant privacy concerns among young users, including a perceived lack of control over personal information, potential misuse of data by AI, and fears of data breaches and unauthorized access. These issues are worsened by unclear data collection practices and insufficient transparency in AI applications. The intention to share data is closely associated with perceived benefits and data protection assurances. The study also highlights the role of parental mediation and the need for comprehensive education on data privacy. Balancing privacy and utility in AI applications is crucial, as young digital citizens value personalized services but remain wary of privacy risks. Trust in AI is significantly influenced by transparency, reliability, predictable behavior, and clear communication about data usage. Strategies to improve user control over personal data include access to and correction of data, clear consent mechanisms, and robust data protection assurances. The review identifies research gaps and suggests future directions, such as longitudinal studies, multicultural comparisons, and the development of ethical AI frameworks.


Towards an optimised evaluation of teachers' discourse: The case of engaging messages

arXiv.org Artificial Intelligence

High-quality professional development for teachers can facilitate the learning of best teaching practices, which in turn can lead to higher levels of student performance (Borko et al., 2010; Didion et al., 2020; Gore et al., 2021; Hubers et al., 2022; Schelling & Rubenstein, 2023). For instance, feedback on actual practices has proven effective in enhancing teaching methods and subsequently improving student outcomes (Allen et al., 2011; Gregory et al., 2017), even among students not directly taught by the teachers receiving the feedback (Opper, 2019). Thus, focusing on the evaluation of teaching practices to facilitate professional development is essential, as it can lead to improved teaching methods and ultimately to higher levels of student outcomes. Despite its acknowledged importance and the pressures from high-stakes accountability systems, most professional development opportunities remain fragmented and insufficient to meet teachers' needs (Borko, 2004; Hsu & Malkin, 2013). The reason for this may be that, although it is known that teaching practices such as cognitive activation, supportive climate, and classroom management, are relevant for enhancing teaching quality and student outcomes (Xie & Derakhshan, 2021), these dimensions may be too abstract or general, which can hinder the implementation of concrete actions to improve teaching quality. In this regard, evidence suggests that targeting more specific factors for intervention, rather than abstract ones, allows teachers to better understand and change their practices (Soderberg et al., 2015).


LLM-SEM: A Sentiment-Based Student Engagement Metric Using LLMS for E-Learning Platforms

arXiv.org Artificial Intelligence

Current methods for analyzing student engagement in e-learning platforms, including automated systems, often struggle with challenges such as handling fuzzy sentiment in text comments and relying on limited metadata. Traditional approaches, such as surveys and questionnaires, also face issues like small sample sizes and scalability. In this paper, we introduce LLM-SEM (Language Model-Based Student Engagement Metric), a novel approach that leverages video metadata and sentiment analysis of student comments to measure engagement. By utilizing recent Large Language Models (LLMs), we generate high-quality sentiment predictions to mitigate text fuzziness and normalize key features such as views and likes. Our holistic method combines comprehensive metadata with sentiment polarity scores to gauge engagement at both the course and lesson levels. Extensive experiments were conducted to evaluate various LLM models, demonstrating the effectiveness of LLM-SEM in providing a scalable and accurate measure of student engagement. We fine-tuned TXLM-RoBERTa using human-annotated sentiment datasets to enhance prediction accuracy and utilized LLama 3B, and Gemma 9B from Ollama.


Towards Friendly AI: A Comprehensive Review and New Perspectives on Human-AI Alignment

arXiv.org Artificial Intelligence

As Artificial Intelligence (AI) continues to advance rapidly, Friendly AI (FAI) has been proposed to advocate for more equitable and fair development of AI. Despite its importance, there is a lack of comprehensive reviews examining FAI from an ethical perspective, as well as limited discussion on its potential applications and future directions. This paper addresses these gaps by providing a thorough review of FAI, focusing on theoretical perspectives both for and against its development, and presenting a formal definition in a clear and accessible format. Key applications are discussed from the perspectives of eXplainable AI (XAI), privacy, fairness and affective computing (AC). Additionally, the paper identifies challenges in current technological advancements and explores future research avenues. The findings emphasise the significance of developing FAI and advocate for its continued advancement to ensure ethical and beneficial AI development.


Beyond the Hype: A Comprehensive Review of Current Trends in Generative AI Research, Teaching Practices, and Tools

arXiv.org Artificial Intelligence

Generative AI (GenAI) is advancing rapidly, and the literature in computing education is expanding almost as quickly. Initial responses to GenAI tools were mixed between panic and utopian optimism. Many were fast to point out the opportunities and challenges of GenAI. Researchers reported that these new tools are capable of solving most introductory programming tasks and are causing disruptions throughout the curriculum. These tools can write and explain code, enhance error messages, create resources for instructors, and even provide feedback and help for students like a traditional teaching assistant. In 2024, new research started to emerge on the effects of GenAI usage in the computing classroom. These new data involve the use of GenAI to support classroom instruction at scale and to teach students how to code with GenAI. In support of the former, a new class of tools is emerging that can provide personalized feedback to students on their programming assignments or teach both programming and prompting skills at the same time. With the literature expanding so rapidly, this report aims to summarize and explain what is happening on the ground in computing classrooms. We provide a systematic literature review; a survey of educators and industry professionals; and interviews with educators using GenAI in their courses, educators studying GenAI, and researchers who create GenAI tools to support computing education. The triangulation of these methods and data sources expands the understanding of GenAI usage and perceptions at this critical moment for our community.


Active Inference and Human--Computer Interaction

arXiv.org Artificial Intelligence

Active Inference is a closed-loop computational theoretical basis for understanding behaviour, based on agents with internal probabilistic generative models that encode their beliefs about how hidden states in their environment cause their sensations. We review Active Inference and how it could be applied to model the human-computer interaction loop. Active Inference provides a coherent framework for managing generative models of humans, their environments, sensors and interface components. It informs off-line design and supports real-time, online adaptation. It provides model-based explanations for behaviours observed in HCI, and new tools to measure important concepts such as agency and engagement. We discuss how Active Inference offers a new basis for a theory of interaction in HCI, tools for design of modern, complex sensor-based systems, and integration of artificial intelligence technologies, enabling it to cope with diversity in human users and contexts. We discuss the practical challenges in implementing such Active Inference-based systems.


Scalable and low-cost remote lab platforms: Teaching industrial robotics using open-source tools and understanding its social implications

arXiv.org Artificial Intelligence

With recent advancements in industrial robots, educating students in new technologies and preparing them for the future is imperative. However, access to industrial robots for teaching poses challenges, such as the high cost of acquiring these robots, the safety of the operator and the robot, and complicated training material. This paper proposes two low-cost platforms built using open-source tools like Robot Operating System (ROS) and its latest version ROS 2 to help students learn and test algorithms on remotely connected industrial robots. Universal Robotics (UR5) arm and a custom mobile rover were deployed in different life-size testbeds, a greenhouse, and a warehouse to create an Autonomous Agricultural Harvester System (AAHS) and an Autonomous Warehouse Management System (AWMS). These platforms were deployed for a period of 7 months and were tested for their efficacy with 1,433 and 1,312 students, respectively. The hardware used in AAHS and AWMS was controlled remotely for 160 and 355 hours, respectively, by students over a period of 3 months.


Making Transparency Advocates: An Educational Approach Towards Better Algorithmic Transparency in Practice

arXiv.org Artificial Intelligence

Concerns about the risks and harms posed by artificial intelligence (AI) have resulted in significant study into algorithmic transparency, giving rise to a sub-field known as Explainable AI (XAI). Unfortunately, despite a decade of development in XAI, an existential challenge remains: progress in research has not been fully translated into the actual implementation of algorithmic transparency by organizations. In this work, we test an approach for addressing the challenge by creating transparency advocates, or motivated individuals within organizations who drive a ground-up cultural shift towards improved algorithmic transparency. Over several years, we created an open-source educational workshop on algorithmic transparency and advocacy. We delivered the workshop to professionals across two separate domains to improve their algorithmic transparency literacy and willingness to advocate for change. In the weeks following the workshop, participants applied what they learned, such as speaking up for algorithmic transparency at an organization-wide AI strategy meeting. We also make two broader observations: first, advocacy is not a monolith and can be broken down into different levels. Second, individuals' willingness for advocacy is affected by their professional field. For example, news and media professionals may be more likely to advocate for algorithmic transparency than those working at technology start-ups.


SATA: A Paradigm for LLM Jailbreak via Simple Assistive Task Linkage

arXiv.org Artificial Intelligence

Large language models (LLMs) have made significant advancements across various tasks, but their safety alignment remain a major concern. Exploring jailbreak prompts can expose LLMs' vulnerabilities and guide efforts to secure them. Existing methods primarily design sophisticated instructions for the LLM to follow, or rely on multiple iterations, which could hinder the performance and efficiency of jailbreaks. In this work, we propose a novel jailbreak paradigm, Simple Assistive Task Linkage (SATA), which can effectively circumvent LLM safeguards and elicit harmful responses. Specifically, SATA first masks harmful keywords within a malicious query to generate a relatively benign query containing one or multiple [MASK] special tokens. It then employs a simple assistive task such as a masked language model task or an element lookup by position task to encode the semantics of the masked keywords. Finally, SATA links the assistive task with the masked query to jointly perform the jailbreak. Extensive experiments show that SATA achieves state-of-the-art performance and outperforms baselines by a large margin. Specifically, on AdvBench dataset, with mask language model (MLM) assistive task, SATA achieves an overall attack success rate (ASR) of 85% and harmful score (HS) of 4.57, and with element lookup by position (ELP) assistive task, SATA attains an overall ASR of 76% and HS of 4.43.