Goto

Collaborating Authors

 Generative AI


Teacher agency in the age of generative AI: towards a framework of hybrid intelligence for learning design

arXiv.org Artificial Intelligence

Generative AI (genAI) is being used in education for different purposes. From the teachers' perspective, genAI can support activities such as learning design. However, there is a need to study the impact of genAI on the teachers' agency. While GenAI can support certain processes of idea generation and co-creation, GenAI has the potential to negatively affect professional agency due to teachers' limited power to (i) act, (ii) affect matters, and (iii) make decisions or choices, as well as the possibility to (iv) take a stance. Agency is identified in the learning sciences studies as being one of the factors in teachers' ability to trust AI. This paper aims to introduce a dual perspective. First, educational technology, as opposed to other computer-mediated communication (CMC) tools, has two distinctly different user groups and different user needs, in the form of learners and teachers, to cater for. Second, the design of educational technology often prioritises learner agency and engagement, thereby limiting the opportunities for teachers to influence the technology and take action. This study aims to analyse the way GenAI is influencing teachers' agency. After identifying the current limits of GenAI, a solution based on the combination of human intelligence and artificial intelligence through a hybrid intelligence approach is proposed. This combination opens up the discussion of a collaboration between teacher and genAI being able to open up new practices in learning design in which they HI support the extension of the teachers' activity.


From sketches to a robot with artificial intelligence

AIHub

How do you develop a product with as little human involvement as possible? Linköping University students built a robot using generative artificial intelligence. From an initial idea with AI-generated images to the final stages of optimisation, generative AI (GAI) supported the students throughout the design process. "I mainly used ChatGPT to program the robot's navigation and control," says Arad Jafari, master's student in technical design. The robot is similar to a radio-controlled car, with a small arm and a grapple claw in the front.


T2VSafetyBench: Evaluating the Safety of Text-to-Video Generative Models

arXiv.org Artificial Intelligence

The recent development of Sora leads to a new era in text-to-video (T2V) generation. Along with this comes the rising concern about its security risks. The generated videos may contain illegal or unethical content, and there is a lack of comprehensive quantitative understanding of their safety, posing a challenge to their reliability and practical deployment. Previous evaluations primarily focus on the quality of video generation. While some evaluations of text-to-image models have considered safety, they cover fewer aspects and do not address the unique temporal risk inherent in video generation. To bridge this research gap, we introduce T2VSafetyBench, a new benchmark designed for conducting safety-critical assessments of text-to-video models. We define 12 critical aspects of video generation safety and construct a malicious prompt dataset using LLMs and jailbreaking prompt attacks. Based on our evaluation results, we draw several important findings, including: 1) no single model excels in all aspects, with different models showing various strengths; 2) the correlation between GPT-4 assessments and manual reviews is generally high; 3) there is a trade-off between the usability and safety of text-to-video generative models. This indicates that as the field of video generation rapidly advances, safety risks are set to surge, highlighting the urgency of prioritizing video safety. We hope that T2VSafetyBench can provide insights for better understanding the safety of video generation in the era of generative AI.


The Interplay of Learning, Analytics, and Artificial Intelligence in Education: A Vision for Hybrid Intelligence

arXiv.org Artificial Intelligence

This paper presents a multi-dimensional view of AI's role in learning and education, emphasizing the intricate interplay between AI, analytics, and the learning processes. Here, I challenge the prevalent narrow conceptualisation of AI as tools, as exemplified in generative AI tools, and argue for the importance of alternative conceptualisations of AI for achieving human-AI hybrid intelligence. I highlight the differences between human intelligence and artificial information processing, the importance of hybrid human-AI systems to extend human cognition, and posit that AI can also serve as an instrument for understanding human learning. Early learning sciences and AI in Education research (AIED), which saw AI as an analogy for human intelligence, have diverged from this perspective, prompting a need to rekindle this connection. The paper presents three unique conceptualisations of AI: the externalization of human cognition, the internalization of AI models to influence human mental models, and the extension of human cognition via tightly coupled human-AI hybrid intelligence systems. Examples from current research and practice are examined as instances of the three conceptualisations in education, highlighting the potential value and limitations of each conceptualisation for education, as well as the perils of overemphasis on externalising human cognition. The paper concludes with advocacy for a broader approach to AIED that goes beyond considerations on the design and development of AI, but also includes educating people about AI and innovating educational systems to remain relevant in an AI-ubiquitous world.


Replication in Visual Diffusion Models: A Survey and Outlook

arXiv.org Artificial Intelligence

Abstract--Visual diffusion models have revolutionized the field of creative AI, producing high-quality and diverse content. However, they inevitably memorize training images or videos, subsequently replicating their concepts, content, or styles during inference. In this survey, we provide the first comprehensive review of replication in visual diffusion models, marking a novel contribution to the field by systematically categorizing the existing studies into unveiling, understanding, and mitigating this phenomenon. Specifically, unveiling mainly refers to the methods used to detect replication instances. Understanding involves analyzing the underlying mechanisms and factors that contribute to this phenomenon. Mitigation focuses on developing strategies to reduce or eliminate replication. Beyond these aspects, we also review papers focusing on its real-world influence. For instance, in the context of healthcare, replication is critically worrying due to privacy concerns related to patient data. Finally, the paper concludes with a discussion of the ongoing challenges, such as the difficulty in detecting and benchmarking replication, and outlines future directions including the development of more robust mitigation techniques. By synthesizing insights from diverse studies, this paper aims to equip researchers and practitioners with a deeper understanding at the intersection between AI technology and social good. Compared to traditional Generative Adversarial Networks (GAN) [3] and Variational Autoencoders (VAE) [4], visual diffusion models excel in producing high-quality, diverse, and stable images.


The infrastructure powering IBM's Gen AI model development

arXiv.org Artificial Intelligence

AI Infrastructure plays a key role in the speed and cost-competitiveness of developing and deploying advanced AI models. The current demand for powerful AI infrastructure for model training is driven by the emergence of generative AI and foundational models, where on occasion thousands of GPUs must cooperate on a single training job for the model to be trained in a reasonable time. Delivering efficient and high-performing AI training requires an end-to-end solution that combines hardware, software and holistic telemetry to cater for multiple types of AI workloads. In this report, we describe IBM's hybrid cloud infrastructure that powers our generative AI model development. This infrastructure includes (1) Vela: an AI-optimized supercomputing capability directly integrated into the IBM Cloud, delivering scalable, dynamic, multi-tenant and geographically distributed infrastructure for large-scale model training and other AI workflow steps and (2) Blue Vela: a large-scale, purpose-built, on-premises hosting environment that is optimized to support our largest and most ambitious AI model training tasks. Vela provides IBM with the dual benefit of high performance for internal use along with the flexibility to adapt to an evolving commercial landscape. Blue Vela provides us with the benefits of rapid development of our largest and most ambitious models, as well as future-proofing against the evolving model landscape in the industry. Taken together, they provide IBM with the ability to rapidly innovate in the development of both AI models and commercial offerings.


A Blueprint for Auditing Generative AI

arXiv.org Artificial Intelligence

The widespread use of generative AI systems is coupled with significant ethical and social challenges. As a result, policymakers, academic researchers, and social advocacy groups have all called for such systems to be audited. However, existing auditing procedures fail to address the governance challenges posed by generative AI systems, which display emergent capabilities and are adaptable to a wide range of downstream tasks. In this chapter, we address that gap by outlining a novel blueprint for how to audit such systems. Specifically, we propose a three-layered approach, whereby governance audits (of technology providers that design and disseminate generative AI systems), model audits (of generative AI systems after pre-training but prior to their release), and application audits (of applications based on top of generative AI systems) complement and inform each other. We show how audits on these three levels, when conducted in a structured and coordinated manner, can be a feasible and effective mechanism for identifying and managing some of the ethical and social risks posed by generative AI systems. That said, it is important to remain realistic about what auditing can reasonably be expected to achieve. For this reason, the chapter also discusses the limitations not only of our three-layered approach but also of the prospect of auditing generative AI systems at all. Ultimately, this chapter seeks to expand the methodological toolkit available to technology providers and policymakers who wish to analyse and evaluate generative AI systems from technical, ethical, and legal perspectives.


Collective Innovation in Groups of Large Language Models

arXiv.org Artificial Intelligence

Human culture relies on collective innovation: our ability to continuously explore how existing elements in our environment can be combined to create new ones. Language is hypothesized to play a key role in human culture, driving individual cognitive capacities and shaping communication. Yet the majority of models of collective innovation assign no cognitive capacities or language abilities to agents. Here, we contribute a computational study of collective innovation where agents are Large Language Models (LLMs) that play Little Alchemy 2, a creative video game originally developed for humans that, as we argue, captures useful aspects of innovation landscapes not present in previous test-beds. We, first, study an LLM in isolation and discover that it exhibits both useful skills and crucial limitations. We, then, study groups of LLMs that share information related to their behaviour and focus on the effect of social connectivity on collective performance. In agreement with previous human and computational studies, we observe that groups with dynamic connectivity out-compete fully-connected groups. Our work reveals opportunities and challenges for future studies of collective innovation that are becoming increasingly relevant as Generative Artificial Intelligence algorithms and humans innovate alongside each other.


AI Safety in Generative AI Large Language Models: A Survey

arXiv.org Artificial Intelligence

Large Language Model (LLMs) such as ChatGPT that exhibit generative AI capabilities are facing accelerated adoption and innovation. The increased presence of Generative AI (GAI) inevitably raises concerns about the risks and safety associated with these models. This article provides an up-to-date survey of recent trends in AI safety research of GAI-LLMs from a computer scientist's perspective: specific and technical. In this survey, we explore the background and motivation for the identified harms and risks in the context of LLMs being generative language models; our survey differentiates by emphasising the need for unified theories of the distinct safety challenges in the research development and applications of LLMs. We start our discussion with a concise introduction to the workings of LLMs, supported by relevant literature. Then we discuss earlier research that has pointed out the fundamental constraints of generative models, or lack of understanding thereof (e.g., performance and safety trade-offs as LLMs scale in number of parameters). We provide a sufficient coverage of LLM alignment -- delving into various approaches, contending methods and present challenges associated with aligning LLMs with human preferences. By highlighting the gaps in the literature and possible implementation oversights, our aim is to create a comprehensive analysis that provides insights for addressing AI safety in LLMs and encourages the development of aligned and secure models. We conclude our survey by discussing future directions of LLMs for AI safety, offering insights into ongoing research in this critical area.


Large Language Model as an Assignment Evaluator: Insights, Feedback, and Challenges in a 1000+ Student Course

arXiv.org Artificial Intelligence

Using large language models (LLMs) for automatic evaluation has become an important evaluation method in NLP research. However, it is unclear whether these LLM-based evaluators can be applied in real-world classrooms to assess student assignments. This empirical report shares how we use GPT-4 as an automatic assignment evaluator in a university course with 1,028 students. Based on student responses, we find that LLM-based assignment evaluators are generally acceptable to students when students have free access to these LLM-based evaluators. However, students also noted that the LLM sometimes fails to adhere to the evaluation instructions. Additionally, we observe that students can easily manipulate the LLM-based evaluator to output specific strings, allowing them to achieve high scores without meeting the assignment rubric. Based on student feedback and our experience, we provide several recommendations for integrating LLM-based evaluators into future classrooms.