Instructional Material
Gen-IR @ SIGIR 2023: The First Workshop on Generative Information Retrieval
Bénédict, Gabriel, Zhang, Ruqing, Metzler, Donald
Generative information retrieval (IR) has experienced substantial growth across multiple research communities (e.g., information retrieval, computer vision, natural language processing, and machine learning), and has been highly visible in the popular press. Theoretical work, empirical studies, and actual user-facing products have appeared that retrieve documents (via generation) or directly generate answers given an input request. We would like to investigate whether end-to-end generative models are just another trend or, as some claim, a paradigm change for IR. This necessitates new metrics, theoretical grounding, evaluation methods, task definitions, models, user interfaces, etc. The goal of this workshop (https://coda.io/@sigir/gen-ir) is to focus on previously explored Generative IR techniques like document retrieval and direct Grounded Answer Generation, while also offering a venue for the discussion and exploration of how Generative IR can be applied to new domains like recommendation systems, summarization, etc. The format of the workshop is interactive, including roundtable and keynote sessions, and tends to avoid the one-sided dialogue of a mini-conference.
Catch-Up Distillation: You Only Need to Train Once for Accelerating Sampling
Shao, Shitong, Dai, Xu, Yin, Shouyi, Li, Lujun, Chen, Huanran, Hu, Yang
Diffusion Probability Models (DPMs) have made impressive advancements in various machine learning domains. However, achieving high-quality synthetic samples typically involves performing a large number of sampling steps, which impedes the possibility of real-time sample synthesis. Traditional accelerated sampling algorithms via knowledge distillation rely on pre-trained model weights and discrete time step scenarios, necessitating additional training sessions to achieve their goals. To address these issues, we propose Catch-Up Distillation (CUD), which encourages the current moment output of the velocity estimation model to "catch up" with its previous moment output. Specifically, CUD adjusts the original Ordinary Differential Equation (ODE) training objective to align the current moment output with both the ground truth label and the previous moment output, utilizing Runge-Kutta-based multi-step alignment distillation for precise ODE estimation while preventing asynchronous updates. Furthermore, we investigate the design space for CUDs under continuous time-step scenarios and analyze how to determine the suitable strategies. To demonstrate CUD's effectiveness, we conduct thorough ablation and comparison experiments on CIFAR-10, MNIST, and ImageNet-64. On CIFAR-10, we obtain a FID of 2.80 by sampling in 15 steps under one-session training and the new state-of-the-art FID of 3.37 by sampling in one step with additional training. This latter result necessitated only 620k iterations with a batch size of 128, in contrast to Consistency Distillation, which demanded 2100k iterations with a larger batch size of 256. Our code is released at https://anonymous.4open.science/r/Catch-Up-Distillation-E31F.
UIILD: A Unified Interpretable Intelligent Learning Diagnosis Framework for Intelligent Tutoring Systems
Wang, Zhifeng, Yan, Wenxing, Zeng, Chunyan, Dong, Shi
Intelligent learning diagnosis is a critical engine of intelligent tutoring systems, which aims to estimate learners' current knowledge mastery status and predict their future learning performance. The significant challenge with traditional learning diagnosis methods is the inability to balance diagnostic accuracy and interpretability. Although the existing psychometric-based learning diagnosis methods provide some domain interpretation through cognitive parameters, they have insufficient modeling capability with a shallow structure for large-scale learning data. While the deep learning-based learning diagnosis methods have improved the accuracy of learning performance prediction, their inherent black-box properties lead to a lack of interpretability, making their results untrustworthy for educational applications. To settle the above problem, the proposed unified interpretable intelligent learning diagnosis (UIILD) framework, which benefits from the powerful representation learning ability of deep learning and the interpretability of psychometrics, achieves a better performance of learning prediction and provides interpretability from three aspects: cognitive parameters, learner-resource response network, and weights of self-attention mechanism. Within the proposed framework, this paper presents a two-channel learning diagnosis mechanism LDM-ID as well as a three-channel learning diagnosis mechanism LDM-HMI. Experiments on two real-world datasets and a simulation dataset show that our method has higher accuracy in predicting learners' performances compared with the state-of-the-art models, and can provide valuable educational interpretability for applications such as precise learning resource recommendation and personalized learning tutoring in intelligent tutoring systems.
Assigning AI: Seven Approaches for Students, with Prompts
Mollick, Ethan, Mollick, Lilach
This paper examines the transformative role of Large Language Models (LLMs) in education and their potential as learning tools, despite their inherent risks and limitations. The authors propose seven approaches for utilizing AI in classrooms: AI-tutor, AI-coach, AI-mentor, AI-teammate, AI-tool, AI-simulator, and AI-student, each with distinct pedagogical benefits and risks. The aim is to help students learn with and about AI, with practical strategies designed to mitigate risks such as complacency about the AI's output, errors, and biases. These strategies promote active oversight, critical assessment of AI outputs, and complementation of AI's capabilities with the students' unique insights. By challenging students to remain the "human in the loop", the authors aim to enhance learning outcomes while ensuring that AI serves as a supportive tool rather than a replacement. The proposed framework offers a guide for educators navigating the integration of AI-assisted learning in ...
LIVABLE: Exploring Long-Tailed Classification of Software Vulnerability Types
Wen, Xin-Cheng, Gao, Cuiyun, Luo, Feng, Wang, Haoyu, Li, Ge, Liao, Qing
Prior studies generally focus on software vulnerability detection and have demonstrated the effectiveness of Graph Neural Network (GNN)-based approaches for the task. Considering the various types of software vulnerabilities and their different degrees of severity, it is also beneficial for developers to know the type of each piece of vulnerable code. In this paper, we observe that the distribution of vulnerability types is long-tailed in practice: a small portion of classes have massive numbers of samples (i.e., head classes), while the others contain only a few samples (i.e., tail classes). Directly adopting previous vulnerability detection approaches tends to result in poor detection performance, mainly for two reasons. First, it is difficult to effectively learn the vulnerability representation due to the over-smoothing issue of GNNs. Second, vulnerability types in the tail are hard to predict because they have extremely few associated samples. To alleviate these issues, we propose a Long-taIled software VulnerABiLity typE classification approach, called LIVABLE. LIVABLE mainly consists of two modules: (1) a vulnerability representation learning module, which improves the propagation steps in the GNN to distinguish node representations by a differentiated propagation method, and which also involves a sequence-to-sequence model to enhance the vulnerability representations; and (2) an adaptive re-weighting module, which adjusts the learning weights for different types according to the training epochs and numbers of associated samples by a novel training loss.
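The idea of re-weighting classes by training epoch and sample count can be illustrated with a small sketch. The abstract does not give LIVABLE's actual loss, so the scheme below is only an assumption for illustration: the function names `class_weights` and `weighted_nll`, the inverse-frequency form, and the linear ramp over epochs are all hypothetical.

```python
import math


def class_weights(counts, epoch, total_epochs):
    """Hypothetical epoch-dependent class weights (not the paper's loss).

    Early in training all classes are weighted equally; as training
    proceeds, the weights move toward inverse class frequency so that
    tail classes contribute more to the loss.
    """
    if total_epochs <= 0 or any(n <= 0 for n in counts):
        raise ValueError("total_epochs and all class counts must be positive")
    # Ramp from 0 (uniform weights) to 1 (full inverse-frequency weights).
    alpha = min(1.0, epoch / total_epochs)
    raw = [(1.0 / n) ** alpha for n in counts]
    # Rescale so the weights average to 1 and the loss scale stays comparable.
    scale = len(raw) / sum(raw)
    return [r * scale for r in raw]


def weighted_nll(probs, label, weights):
    """Negative log-likelihood of the true class, scaled by its class weight."""
    return -weights[label] * math.log(probs[label])


# One head class with 1000 samples and one tail class with 10 samples.
counts = [1000, 10]
early = class_weights(counts, epoch=0, total_epochs=10)
late = class_weights(counts, epoch=10, total_epochs=10)
print(early, late)
```

With this ramp, a misclassified tail sample is penalized no more than a head sample at the start of training and substantially more at the end.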
Saltation Matrices: The Essential Tool for Linearizing Hybrid Dynamical Systems
Kong, Nathan J., Payne, J. Joe, Zhu, James, Johnson, Aaron M.
Figure 1: An example 2 mode hybrid system where the domains are shown in black circles D, the dynamics are shown with gray arrows F, the guard for the current domain is shown in red dashed g, and the reset from the current mode to the next mode is shown in blue R.
The saltation matrix relies on differentiating the guards and resets so they must be differentiable. Excluding Zeno conditions ensures we avoid computing infinite saltation matrices in finite time, which would clearly be unsound for analysis. Transversality ensures that neighboring trajectories impact the same guard unless the impact point lies on any other guard surface, in which case the Bouligand derivative is the appropriate analysis tool [52, 114-117]. Transversality also ensures the denominator in (2) does not approach zero. In some cases, the saltation matrix for a hybrid transition can become an identity transformation.
B. Saltation matrix derivation
In this section, the derivation of the saltation matrix (2) is presented, following the geometric derivation from [10] with the addition of reset maps. There are many alternate ways to derive (2): a derivation using the chain rule is included in Appendix A and a derivation using a double limit can be found in [96]. Suppose the nominal trajectory of interest is x(t) as shown in Figure 1. The trajectory starts in mode I and goes through a hybrid transition to mode J at time t. The saltation matrix is a ...
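Equation (2) is not reproduced in this excerpt. The sketch below assumes it has the commonly cited form Xi = D_x R + (F_J - D_x R F_I - D_t R) D_x g / (D_t g + D_x g F_I), with all terms evaluated at the transition, and applies it to a bouncing ball with state [height, velocity]. The bouncing-ball example, the restitution coefficient `e`, and the function names are illustrative assumptions rather than material from the source.

```python
def saltation_matrix(DxR, DtR, Dxg, Dtg, F_I, F_J):
    """Assumed form: Xi = DxR + (F_J - DxR F_I - DtR) Dxg / (Dtg + Dxg F_I).

    DxR is an n x n list of lists; DtR, Dxg, F_I and F_J are length-n lists;
    Dtg is a scalar. F_I is evaluated just before the transition and F_J
    just after the reset.
    """
    n = len(F_I)
    # Denominator: rate at which the pre-impact flow crosses the guard.
    denom = Dtg + sum(Dxg[k] * F_I[k] for k in range(n))
    if abs(denom) < 1e-12:
        raise ValueError("transversality violated: denominator is near zero")
    DxR_FI = [sum(DxR[i][k] * F_I[k] for k in range(n)) for i in range(n)]
    jump = [F_J[i] - DxR_FI[i] - DtR[i] for i in range(n)]
    # Rank-one correction (outer product of jump and Dxg) added to DxR.
    return [[DxR[i][j] + jump[i] * Dxg[j] / denom for j in range(n)]
            for i in range(n)]


def bouncing_ball_saltation(v_minus, e=0.8, grav=9.81):
    """Saltation matrix for a ball hitting the ground with velocity v_minus < 0."""
    DxR = [[1.0, 0.0], [0.0, -e]]   # reset R(x) = [y, -e * v]
    DtR = [0.0, 0.0]                # reset does not depend on time
    Dxg = [1.0, 0.0]                # guard g(x) = y (height reaches zero)
    Dtg = 0.0                       # guard does not depend on time
    F_I = [v_minus, -grav]          # ballistic flow before impact
    F_J = [-e * v_minus, -grav]     # ballistic flow after impact
    return saltation_matrix(DxR, DtR, Dxg, Dtg, F_I, F_J)


Xi = bouncing_ball_saltation(v_minus=-2.0)
print(Xi)
```

In this example the denominator equals the impact velocity, so a grazing impact (velocity near zero) triggers the transversality check instead of returning an unbounded matrix.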
Towards Applying Powerful Large AI Models in Classroom Teaching: Opportunities, Challenges and Prospects
Tan, Kehui, Pang, Tianqi, Fan, Chenyou, Yu, Song
This perspective paper proposes a series of interactive scenarios that utilize Artificial Intelligence (AI) to enhance classroom teaching, such as dialogue auto-completion, knowledge and style transfer, and assessment of AI-generated content. By leveraging recent developments in Large Language Models (LLMs), we explore the potential of AI to augment and enrich teacher-student dialogues and improve the quality of teaching. Our goal is to produce innovative and meaningful conversations between teachers and students, create standards for evaluation, and improve the efficacy of AI-for-Education initiatives. In Section 3, we discuss the challenges of utilizing existing LLMs to effectively complete educational tasks and present a unified framework for handling diverse education datasets, processing lengthy conversations, and condensing information to better accomplish downstream tasks. In Section 4, we summarize the pivotal tasks, including Teacher-Student Dialogue Auto-Completion, Expert Teaching Knowledge and Style Transfer, and Assessment of AI-Generated Content (AIGC), providing a clear path for future research. In Section 5, we also explore the use of external and adjustable LLMs to improve the generated content through human-in-the-loop supervision and reinforcement learning. Ultimately, this paper seeks to highlight the potential for AI to aid the field of education and promote its further exploration.
HELP ME THINK: A Simple Prompting Strategy for Non-experts to Create Customized Content with Models
Controlling the text generated by language models and customizing the content has been a long-standing challenge. Existing prompting techniques proposed in pursuit of providing control are task-specific and lack generality, which leaves non-expert users with an overwhelming number of choices when looking for a suitable method for their task. The effort associated with those techniques, such as writing examples, explanations, instructions, etc., further limits their adoption among non-expert users. In this paper, we propose a simple prompting strategy, HELP ME THINK, where we encourage GPT3 to help non-expert users by asking a set of relevant questions and leveraging user answers to execute the task. We demonstrate the efficacy of our technique HELP ME THINK on a variety of tasks. Specifically, we focus on tasks that are hard for average humans and require significant thinking to perform. We hope our work will encourage the development of unconventional ways to harness the power of large language models.
Toward Terrain-based Navigation Using Side-scan Sonar
Davenport, Ellen, Jang, Junsu, Meyer, Florian
This paper introduces a statistical model and corresponding sequential Bayesian estimation method for terrain-based navigation using side-scan sonar (SSS) data. The presented approach relies on slant range measurements extracted from the received ping of an SSS. In particular, incorporating slant range measurements to landmarks for navigation constrains the location and altitude error of an autonomous platform in GPS-denied environments. The proposed navigation filter consists of a prediction step based on the unscented transform and an update step that relies on particle filtering. The SSS measurement model aims to capture the highly nonlinear nature of SSS data while maintaining reasonable computational requirements in the particle-based update step. For our numerical results, we assume a scenario with a surface vehicle that performs SSS and compass measurements. The simulated scenario is consistent with our current hardware platform. We also discuss how the proposed method can be extended to autonomous underwater vehicles (AUVs) in a straightforward way and why the combination of an SSS sensor and a compass is particularly suitable for small autonomous platforms.
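A particle-based update with a slant range measurement can be sketched as follows. This is a minimal illustration, not the paper's measurement model: it assumes a surface vehicle, a single landmark at a known horizontal position and depth, and a Gaussian range likelihood with standard deviation `sigma`. The unscented-transform prediction step is omitted, and all names and numbers are hypothetical.

```python
import math


def slant_range(px, py, landmark):
    """Straight-line distance from a surface vehicle at (px, py) to a landmark.

    The landmark is (x, y, depth) with depth measured below the surface.
    """
    lx, ly, depth = landmark
    return math.sqrt((px - lx) ** 2 + (py - ly) ** 2 + depth ** 2)


def particle_update(particles, weights, measured_range, landmark, sigma):
    """Reweight particles by an assumed Gaussian slant-range likelihood."""
    new_weights = []
    for (px, py), w in zip(particles, weights):
        err = measured_range - slant_range(px, py, landmark)
        new_weights.append(w * math.exp(-0.5 * (err / sigma) ** 2))
    total = sum(new_weights)
    if total == 0.0:
        raise ValueError("all particle weights vanished; increase sigma")
    return [w / total for w in new_weights]


def weighted_mean(particles, weights):
    """Position estimate as the weighted average of the particles."""
    x = sum(w * p[0] for p, w in zip(particles, weights))
    y = sum(w * p[1] for p, w in zip(particles, weights))
    return x, y


# Three candidate positions; the landmark lies 30 m below the origin.
landmark = (0.0, 0.0, 30.0)
particles = [(40.0, 0.0), (50.0, 0.0), (60.0, 0.0)]
prior = [1.0 / 3.0] * 3
# A vehicle at (40, 0) would measure a slant range of 50 m.
posterior = particle_update(particles, prior, measured_range=50.0,
                            landmark=landmark, sigma=1.0)
estimate = weighted_mean(particles, posterior)
print(posterior, estimate)
```

A single range only constrains the vehicle to a circle around the landmark, which is one reason a heading source such as a compass and a motion model are needed alongside it.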
Learnersourcing in the Age of AI: Student, Educator and Machine Partnerships for Content Creation
Khosravi, Hassan, Denny, Paul, Moore, Steven, Stamper, John
Our increasingly connected world is empowering learners and enabling exciting new pedagogies. In particular, educational tools that facilitate collaboration between students can help to foster a wide range of social and domain-specific skills (Jeong, Hmelo-Silver and Jo, 2019). The literature on computer-supported collaborative learning documents a diverse range of pedagogies that have been applied for decades in many subject domains and educational levels (Lehtinen, Hakkarainen, Lipponen, Rahikainen and Muukkonen, 1999; Roberts, 2005; Kaliisa, Rienties, Mørch and Kluge, 2022). One recent approach, derived from foundational work on contributing student pedagogies (Collis and Moonen, 2002; Hamer, Sheard, Purchase and Luxton-Reilly, 2012), involves students creating and sharing learning resources with one another. Such activities have gained popularity in recent years and are associated with two broad types of benefits. Firstly, creating learning content is a cognitively demanding task that requires students to engage deeply with course concepts and exhibit behaviours at the highest level of Bloom's taxonomy of educational objectives (Hilton, Goldwater, Hancock, Clemson, Huang and Denyer, 2022). Secondly, leveraging the creative power of many students can result in the rapid and cost-effective creation of large repositories of learning resources that can, in turn, be used for practice and to support personalized learning experiences (Singh, Brooks, Lin and Li, 2021). Learnersourcing is a commonly used term to describe the practice of having students work collaboratively to generate shared learning resources (Kim, 2015). It is related to the more general task of crowdsourcing, in which tasks are outsourced to a pool of participants, often drawn from large and undefined populations, each of whom makes a small contribution to some product.