Instructional Material
MeetMap: Real-Time Collaborative Dialogue Mapping with LLMs in Online Meetings
Chen, Xinyue, Yap, Nathan, Lu, Xinyi, Gunal, Aylin, Wang, Xu
Video meeting platforms display conversations linearly through transcripts or summaries. However, ideas during a meeting do not emerge linearly. We leverage LLMs to create dialogue maps in real time to help people visually structure and connect ideas. To balance reducing users' cognitive load during the conversation against giving them sufficient control when using AI, we explore two system variants that offer different levels of AI assistance. In Human-Map, AI generates summaries of conversations as nodes, and users create dialogue maps with the nodes. In AI-Map, AI produces dialogue maps that users can edit. We ran a within-subject experiment with ten pairs of users, comparing the two MeetMap variants and a baseline. Users preferred MeetMap over traditional note-taking methods because it aligned better with their mental models of conversations. Users liked the ease of use of AI-Map due to its low effort demands and appreciated the hands-on opportunity for sense-making in Human-Map.
Sparks of Explainability: Recent Advancements in Explaining Large Vision Models
This thesis explores advanced approaches to improve explainability in computer vision by analyzing and modeling the features exploited by deep neural networks. Initially, it evaluates attribution methods, notably saliency maps, by introducing a metric based on algorithmic stability and an approach utilizing Sobol indices, which, through quasi-Monte Carlo sequences, allows a significant reduction in computation time. In addition, the EVA method offers a first formulation of attribution with formal guarantees via verified perturbation analysis. Experimental results indicate that in complex scenarios these methods do not provide sufficient understanding, particularly because they identify only "where" the model focuses without clarifying "what" it perceives. Two hypotheses are therefore examined: aligning models with human reasoning -- through the introduction of a training routine that integrates the imitation of human explanations and optimization within the space of 1-Lipschitz functions -- and adopting a conceptual explainability approach. The CRAFT method is proposed to automate the extraction of the concepts used by the model and to assess their importance, complemented by MACO, which enables their visualization. These works converge towards a unified framework, illustrated by an interactive demonstration applied to the 1000 ImageNet classes in a ResNet model.
"Would You Want an AI Tutor?" Understanding Stakeholder Perceptions of LLM-based Chatbots in the Classroom
Fuligni, Caterina, Figaredo, Daniel Dominguez, Stoyanovich, Julia
In recent years, Large Language Models (LLMs) have rapidly gained popularity across all parts of society, including education. After initial skepticism and bans, many schools have chosen to embrace this new technology by integrating it into their curricula in the form of virtual tutors and teaching assistants. However, neither the companies developing this technology nor the public institutions involved in its implementation have set up a formal system to collect feedback from the stakeholders impacted by it. In this paper, we argue that understanding the perceptions of those directly affected by LLMs in the classroom, such as students and teachers, as well as those indirectly impacted, like parents and school staff, is essential for ensuring responsible use of AI in this critical domain. Our contributions are two-fold. First, we present results of a literature review focusing on the perceptions of LLM-based chatbots in education. We highlight important gaps in the literature, such as the exclusion of key educational agents (e.g., parents or school administrators) when analyzing the role of stakeholders, and the frequent omission of the learning contexts in which the AI systems are implemented. Thus, we present a taxonomy that organizes existing literature on stakeholder perceptions. Second, we propose the Contextualized Perceptions for the Adoption of Chatbots in Education (Co-PACE) framework, which can be used to systematically elicit perceptions and inform whether and how LLM-based chatbots should be designed, developed, and deployed in the classroom.
PAC Learning is just Bipartite Matching (Sort of)
The main goal of this article is to convince you, the reader, that supervised learning in the Probably Approximately Correct (PAC) model is closely related to -- of all things -- bipartite matching! En route from PAC learning to bipartite matching, I will give an overview of a particular transductive model of learning, and the associated one-inclusion graphs, which can be viewed as a generalization of some of the hat puzzles that are popular in recreational mathematics. Although this transductive model is far from new, it has recently seen a resurgence of interest as a tool for tackling deep questions in learning theory. A secondary purpose of this article is to serve as a (biased) tutorial on the connections between the PAC and transductive models of learning.
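As a rough illustration of the one-inclusion graphs mentioned above, the sketch below builds one for a toy binary hypothesis class. It assumes the standard definition (vertices are the distinct labelings the class induces on a fixed sample, and two labelings are adjacent when they differ on exactly one sample point); the threshold class and the helper name `one_inclusion_edges` are hypothetical choices made for this example and are not taken from the article.

```python
from itertools import combinations


def one_inclusion_edges(labelings):
    """Return edges (u, v, i) where labelings u and v differ only at index i."""
    vertices = sorted(set(labelings))
    edges = []
    for u, v in combinations(vertices, 2):
        # Indices of the sample points on which the two labelings disagree.
        diff = [i for i, (a, b) in enumerate(zip(u, v)) if a != b]
        if len(diff) == 1:
            edges.append((u, v, diff[0]))
    return edges


# Toy class (an assumption for illustration): thresholds on 3 ordered points,
# where point i is labeled 1 exactly when i >= t.
n = 3
thresholds = [tuple(int(i >= t) for i in range(n)) for t in range(n + 1)]
edges = one_inclusion_edges(thresholds)

for u, v, i in edges:
    print(u, "--", v, "differ at point", i)
```

For this toy class the graph is a path on four labelings. Each edge corresponds to a single unlabeled point whose label a transductive learner would have to guess.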
Want to learn Python? Try these interactive lessons
So you want to learn German. Are you going to watch a bunch of YouTube videos and try to brute force your way to fluency? Learning a programming language like Python is a little different from trying to speak a new language, but learning it on your own is still a major challenge. That's why it's useful to check out resources like the Complete Python Certification Boot Camp. This 12-course meal of knowledge is a beginner-friendly way to learn Python, and it's on sale for 19.99.
Stream-Based Monitoring of Algorithmic Fairness
Baumeister, Jan, Finkbeiner, Bernd, Scheerer, Frederik, Siber, Julian, Wagenpfeil, Tobias
Automatic decision and prediction systems are increasingly deployed in applications where they significantly impact the livelihood of people, such as for predicting the creditworthiness of loan applicants or the recidivism risk of defendants. These applications have given rise to a new class of algorithmic-fairness specifications that require the systems to decide and predict without bias against social groups. Verifying these specifications statically is often out of reach for realistic systems, since the systems may, e.g., employ complex learning components, and reason over a large input space. In this paper, we therefore propose stream-based monitoring as a solution for verifying the algorithmic fairness of decision and prediction systems at runtime. Concretely, we present a principled way to formalize algorithmic fairness over temporal data streams in the specification language RTLola and demonstrate the efficacy of this approach on a number of benchmarks. Besides synthetic scenarios that particularly highlight its efficiency on streams with a scaling amount of data, we notably evaluate the monitor on real-world data from the recidivism prediction tool COMPAS.
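To make the idea of checking fairness at runtime concrete, here is a minimal sketch in plain Python, not RTLola. It assumes one particular fairness notion (demographic parity, measured as the gap between per-group positive-decision rates) and a hypothetical tolerance `threshold`; the abstract does not say which specifications or thresholds the authors use, and the class name `ParityMonitor` is invented for this example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ParityMonitor:
    """Running check of demographic parity over a stream of decisions."""

    threshold: float  # hypothetical tolerance on the gap between group rates
    totals: dict = field(default_factory=lambda: defaultdict(int))
    positives: dict = field(default_factory=lambda: defaultdict(int))

    def observe(self, group: str, decision: bool) -> bool:
        """Consume one event; return True while the gap is within threshold."""
        self.totals[group] += 1
        self.positives[group] += int(decision)
        return self.gap() <= self.threshold

    def gap(self) -> float:
        """Largest difference in positive-decision rate between two groups."""
        rates = [self.positives[g] / self.totals[g] for g in self.totals]
        return max(rates) - min(rates) if len(rates) > 1 else 0.0


# Hypothetical event stream of (group, decision) pairs.
monitor = ParityMonitor(threshold=0.25)
events = [("a", True), ("b", True), ("a", True), ("b", False)]
verdicts = [monitor.observe(group, decision) for group, decision in events]
print(verdicts, monitor.gap())
```

The monitor keeps only two counters per group, so memory use does not grow with the length of the stream. That constant-memory property is the kind of behavior a stream-based approach relies on.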
On-Line Learning for Planning and Control of Underactuated Robots with Uncertain Dynamics
Turrisi, Giulio, Capotondi, Marco, Gaz, Claudio, Modugno, Valerio, Oriolo, Giuseppe, De Luca, Alessandro
We present an iterative approach for planning and controlling motions of underactuated robots with uncertain dynamics. At its core, there is a learning process which estimates the perturbations induced by the model uncertainty on the active and passive degrees of freedom. The generic iteration of the algorithm makes use of the learned data in both the planning phase, which is based on optimization, and the control phase, where partial feedback linearization of the active dofs is performed on the model updated on-line. The performance of the proposed approach is shown by comparative simulations and experiments on a Pendubot executing various types of swing-up maneuvers. Very few iterations are typically needed to generate dynamically feasible trajectories and the tracking control that guarantees their accurate execution, even in the presence of large model uncertainties.
adabmDCA 2.0 -- a flexible but easy-to-use package for Direct Coupling Analysis
Rosset, Lorenzo, Netti, Roberto, Muntoni, Anna Paola, Weigt, Martin, Zamponi, Francesco
In this methods article, we provide a flexible but easy-to-use implementation of Direct Coupling Analysis (DCA) based on Boltzmann machine learning, together with a tutorial on how to use it. The package adabmDCA 2.0 is available in different programming languages (C++, Julia, Python) and is usable on different architectures (single-core and multi-core CPU, GPU) through a common front-end interface. In addition to several learning protocols for dense and sparse generative DCA models, it directly addresses common downstream tasks such as residue-residue contact prediction, mutational-effect prediction, scoring of sequence libraries, and generation of artificial sequences for sequence design. It is readily applicable to protein and RNA sequence data.
Autonomy and Safety Assurance in the Early Development of Robotics and Autonomous Systems
Abeywickrama, Dhaminda B., Fisher, Michael, Wheeler, Frederic, Dennis, Louise
This report provides an overview of the workshop titled Autonomy and Safety Assurance in the Early Development of Robotics and Autonomous Systems, hosted by the Centre for Robotic Autonomy in Demanding and Long-Lasting Environments (CRADLE) on September 2, 2024, at The University of Manchester, UK. The event brought together representatives from six regulatory and assurance bodies across diverse sectors to discuss challenges and evidence for ensuring the safety of autonomous and robotic systems, particularly autonomous inspection robots (AIR). The workshop featured six invited talks by the regulatory and assurance bodies. CRADLE aims to make assurance an integral part of engineering reliable, transparent, and trustworthy autonomous systems. Key discussions revolved around three research questions: (i) challenges in assuring safety for AIR; (ii) evidence for safety assurance; and (iii) how assurance cases need to differ for autonomous systems. Following the invited talks, the breakout groups further discussed the research questions using case studies from ground (rail), nuclear, underwater, and drone-based AIR. This workshop offered a valuable opportunity for representatives from industry, academia, and regulatory bodies to discuss challenges related to assured autonomy. Feedback from participants indicated a strong willingness to adopt a design-for-assurance process to ensure that robots are developed and verified to meet regulatory expectations.
International AI Safety Report
Bengio, Yoshua, Mindermann, Sören, Privitera, Daniel, Besiroglu, Tamay, Bommasani, Rishi, Casper, Stephen, Choi, Yejin, Fox, Philip, Garfinkel, Ben, Goldfarb, Danielle, Heidari, Hoda, Ho, Anson, Kapoor, Sayash, Khalatbari, Leila, Longpre, Shayne, Manning, Sam, Mavroudis, Vasilios, Mazeika, Mantas, Michael, Julian, Newman, Jessica, Ng, Kwan Yee, Okolo, Chinasa T., Raji, Deborah, Sastry, Girish, Seger, Elizabeth, Skeadas, Theodora, South, Tobin, Strubell, Emma, Tramèr, Florian, Velasco, Lucia, Wheeler, Nicole, Acemoglu, Daron, Adekanmbi, Olubayo, Dalrymple, David, Dietterich, Thomas G., Felten, Edward W., Fung, Pascale, Gourinchas, Pierre-Olivier, Heintz, Fredrik, Hinton, Geoffrey, Jennings, Nick, Krause, Andreas, Leavy, Susan, Liang, Percy, Ludermir, Teresa, Marda, Vidushi, Margetts, Helen, McDermid, John, Munga, Jane, Narayanan, Arvind, Nelson, Alondra, Neppel, Clara, Oh, Alice, Ramchurn, Gopal, Russell, Stuart, Schaake, Marietje, Schölkopf, Bernhard, Song, Dawn, Soto, Alvaro, Tiedrich, Lee, Varoquaux, Gaël, Yao, Andrew, Zhang, Ya-Qin, Albalawi, Fahad, Alserkal, Marwan, Ajala, Olubunmi, Avrin, Guillaume, Busch, Christian, de Carvalho, André Carlos Ponce de Leon Ferreira, Fox, Bronwyn, Gill, Amandeep Singh, Hatip, Ahmet Halit, Heikkilä, Juha, Jolly, Gill, Katzir, Ziv, Kitano, Hiroaki, Krüger, Antonio, Johnson, Chris, Khan, Saif M., Lee, Kyoung Mu, Ligot, Dominic Vincent, Molchanovskyi, Oleksii, Monti, Andrea, Mwamanzi, Nusu, Nemer, Mona, Oliver, Nuria, Portillo, José Ramón López, Ravindran, Balaraman, Rivera, Raquel Pezoa, Riza, Hammam, Rugege, Crystal, Seoighe, Ciarán, Sheehan, Jerry, Sheikh, Haroon, Wong, Denise, Zeng, Yi
I am honoured to present the International AI Safety Report. It is the work of 96 international AI experts who collaborated in an unprecedented effort to establish an internationally shared scientific understanding of risks from advanced AI and methods for managing them. We embarked on this journey just over a year ago, shortly after the countries present at the Bletchley Park AI Safety Summit agreed to support the creation of this report. Since then, we published an Interim Report in May 2024, which was presented at the AI Seoul Summit. We are now pleased to publish the present, full report ahead of the AI Action Summit in Paris in February 2025. Since the Bletchley Summit, the capabilities of general-purpose AI, the type of AI this report focuses on, have increased further. For example, new models have shown markedly better performance at tests of programming and scientific reasoning.