Instructional Material
ProG: A Graph Prompt Learning Benchmark
Zi, Chenyi, Zhao, Haihong, Sun, Xiangguo, Lin, Yiqing, Cheng, Hong, Li, Jia
Artificial general intelligence on graphs has shown significant advancements across various applications, yet the traditional 'Pre-train & Fine-tune' paradigm faces inefficiencies and negative transfer issues, particularly in complex and few-shot settings. Graph prompt learning emerges as a promising alternative, leveraging lightweight prompts to manipulate data and fill the task gap by reformulating downstream tasks to the pretext. However, several critical challenges remain: how to unify diverse graph prompt models, how to evaluate the quality of graph prompts, and how to improve their usability for practical comparison and selection. In response to these challenges, we introduce the first comprehensive benchmark for graph prompt learning. Our benchmark integrates SIX pre-training methods and FIVE state-of-the-art graph prompt techniques, evaluated across FIFTEEN diverse datasets to assess performance, flexibility, and efficiency. We also present 'ProG', an easy-to-use open-source library that streamlines the execution of various graph prompt models, facilitating objective evaluations. Additionally, we propose a unified framework that categorizes existing graph prompt methods into two main approaches: prompts as graphs and prompts as tokens. This framework enhances the applicability and comparison of graph prompt techniques. The code is available at: https://github.com/sheldonresearch/ProG.
Lockpicking LLMs: A Logit-Based Jailbreak Using Token-level Manipulation
Li, Yuxi, Liu, Yi, Li, Yuekang, Shi, Ling, Deng, Gelei, Chen, Shengquan, Wang, Kailong
Large language models (LLMs) have transformed the field of natural language processing, but they remain susceptible to jailbreaking attacks that exploit their capabilities to generate unintended and potentially harmful content. Existing token-level jailbreaking techniques, while effective, face scalability and efficiency challenges, especially as models undergo frequent updates and incorporate advanced defensive measures. In this paper, we introduce JailMine, an innovative token-level manipulation approach that addresses these limitations effectively. JailMine employs an automated "mining" process to elicit malicious responses from LLMs by strategically selecting affirmative outputs and iteratively reducing the likelihood of rejection. Through rigorous testing across multiple well-known LLMs and datasets, we demonstrate JailMine's effectiveness and efficiency, achieving a significant average reduction of 86% in time consumed while maintaining high success rates averaging 95%, even in the face of evolving defensive strategies. Our work contributes to the ongoing effort to assess and mitigate the vulnerability of LLMs to jailbreaking attacks, underscoring the importance of continued vigilance and proactive measures to enhance the security and reliability of these powerful language models.
BPO: Supercharging Online Preference Learning by Adhering to the Proximity of Behavior LLM
Xu, Wenda, Li, Jiachen, Wang, William Yang, Li, Lei
Direct alignment from preferences (DAP) has emerged as a promising paradigm for aligning large language models (LLMs) to human desiderata from pre-collected, offline preference datasets. While recent studies indicate that existing offline DAP methods can directly benefit from online training samples, we highlight the need to develop specific online DAP algorithms to fully harness the power of online training. Specifically, we identify that the learned LLM should adhere to the proximity of the behavior LLM, which collects the training samples. To this end, we propose online Preference Optimization in proximity to the Behavior LLM (BPO), emphasizing the importance of constructing a proper trust region for LLM alignment. We conduct extensive experiments to validate the effectiveness and applicability of our approach by integrating it with various DAP methods, resulting in significant performance improvements across a wide range of tasks when training with the same amount of preference data. Even when only introducing one additional data collection phase, our online BPO improves its offline DAP baseline from 72.0% to 80.2% on TL;DR and from 82.2% to 89.1% on Anthropic Helpfulness in terms of win rate against human reference text.
Knowledge Tagging System on Math Questions via LLMs with Flexible Demonstration Retriever
Li, Hang, Xu, Tianlong, Tang, Jiliang, Wen, Qingsong
Knowledge tagging for questions plays a crucial role in contemporary intelligent educational applications, including learning progress diagnosis, practice question recommendations, and course content organization. Traditionally, these annotations are always conducted by pedagogical experts, as the task requires not only a strong semantic understanding of both question stems and knowledge definitions but also deep insights into connecting question-solving logic with corresponding knowledge concepts. With the recent emergence of advanced text encoding algorithms, such as pre-trained language models, many researchers have developed automatic knowledge tagging systems based on calculating the semantic similarity between the knowledge and question embeddings. In this paper, we explore automating the task using Large Language Models (LLMs), in response to the inability of prior encoding-based methods to deal with the hard cases which involve strong domain knowledge and complicated concept definitions. By showing the strong performance of zero- and few-shot results over math questions knowledge tagging tasks, we demonstrate LLMs' great potential in conquering the challenges faced by prior methods. Furthermore, by proposing a reinforcement learning-based demonstration retriever, we successfully exploit the great potential of different-sized LLMs in achieving better performance results while keeping the in-context demonstration usage efficiency high.
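The embedding-similarity baseline mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the system from the paper: `embed` is a toy bag-of-words stand-in for a pre-trained text encoder, and the knowledge names and definitions in `KNOWLEDGE` are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a text encoder: a bag-of-words count vector (assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def tag_question(question, knowledge_defs):
    """Return the knowledge tag whose definition is most similar to the question."""
    q = embed(question)
    scores = {name: cosine(q, embed(defn)) for name, defn in knowledge_defs.items()}
    return max(scores, key=scores.get), scores


# Hypothetical knowledge definitions, for illustration only.
KNOWLEDGE = {
    "fraction_addition": "add two fractions with a common denominator",
    "area_rectangle": "compute the area of a rectangle from length and width",
}

best, scores = tag_question(
    "What is the area of a rectangle with length 3 and width 4?", KNOWLEDGE
)
```

Because the score depends only on surface overlap between the question and each definition, a sketch like this cannot connect question-solving logic to a concept that is never named in the question, which is the kind of hard case the abstract says motivates using LLMs.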
Timely Communications for Remote Inference
Shisher, Md Kamran Chowdhury, Sun, Yin, Hou, I-Hong
In this paper, we analyze the impact of data freshness on remote inference systems, where a pre-trained neural network infers a time-varying target (e.g., the locations of vehicles and pedestrians) based on features (e.g., video frames) observed at a sensing node (e.g., a camera). One might expect that the performance of a remote inference system degrades monotonically as the feature becomes stale. Using an information-theoretic analysis, we show that this is true if the feature and target data sequence can be closely approximated as a Markov chain, whereas it is not true if the data sequence is far from being Markovian. Hence, the inference error is a function of Age of Information (AoI), where the function could be non-monotonic. To minimize the inference error in real-time, we propose a new "selection-from-buffer" model for sending the features, which is more general than the "generate-at-will" model used in earlier studies. In addition, we design low-complexity scheduling policies to improve inference performance. For single-source, single-channel systems, we provide an optimal scheduling policy. In multi-source, multi-channel systems, the scheduling problem becomes a multi-action restless multi-armed bandit problem. For this setting, we design a new scheduling policy by integrating Whittle index-based source selection and duality-based feature selection-from-buffer algorithms. This new scheduling policy is proven to be asymptotically optimal. These scheduling results hold for minimizing general AoI functions (monotonic or non-monotonic). Data-driven evaluations demonstrate the significant advantages of our proposed scheduling policies.
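To make the AoI terminology concrete, here is a small sketch under stated assumptions. AoI at time t is taken as t minus the generation time of the freshest feature delivered so far. The function names, the slotted-time model, and the error values are hypothetical, and `pick_from_buffer` is a one-step greedy toy, not the scheduling policy from the paper.

```python
def age_of_information(deliveries, horizon):
    """Compute AoI per time slot.

    deliveries: list of (delivery_time, generation_time) pairs in integer slots.
    Returns a list of length `horizon`; entries are None before the first delivery.
    """
    by_time = {}
    for delivered_at, generated_at in deliveries:
        previous = by_time.get(delivered_at, generated_at)
        by_time[delivered_at] = max(previous, generated_at)

    ages = []
    freshest = None  # generation time of the freshest delivered feature
    for t in range(horizon):
        if t in by_time and (freshest is None or by_time[t] > freshest):
            freshest = by_time[t]
        ages.append(None if freshest is None else t - freshest)
    return ages


def pick_from_buffer(error_by_age, buffer_size, delay):
    """Pick the buffer position b (0 = freshest feature) whose age at delivery,
    b + delay, has the lowest error. Raises ValueError if no position is valid."""
    candidates = [b for b in range(buffer_size) if b + delay < len(error_by_age)]
    return min(candidates, key=lambda b: error_by_age[b + delay])


# Hypothetical, non-monotonic inference error indexed by AoI.
ERROR_BY_AGE = [0.30, 0.22, 0.10, 0.18, 0.35]

ages = age_of_information([(2, 1), (5, 3)], horizon=7)
choice = pick_from_buffer(ERROR_BY_AGE, buffer_size=3, delay=1)
```

With this made-up error curve, the greedy choice is buffer position 1 rather than the freshest feature at position 0. A "generate-at-will" sender, which always transmits the freshest feature, would miss this option when the error is non-monotonic in AoI.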
Linear Bellman Completeness Suffices for Efficient Online Reinforcement Learning with Few Actions
One of the most natural approaches to reinforcement learning (RL) with function approximation is value iteration, which inductively generates approximations to the optimal value function by solving a sequence of regression problems. To ensure the success of value iteration, it is typically assumed that Bellman completeness holds, which ensures that these regression problems are well-specified. We study the problem of learning an optimal policy under Bellman completeness in the online model of RL with linear function approximation. In the linear setting, while statistically efficient algorithms are known under Bellman completeness (e.g., Jiang et al. (2017); Zanette et al. (2020)), these algorithms all rely on the principle of global optimism which requires solving a nonconvex optimization problem. In particular, it has remained open as to whether computationally efficient algorithms exist. In this paper we give the first polynomial-time algorithm for RL under linear Bellman completeness when the number of actions is any constant.
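Value iteration as a sequence of regression problems can be sketched as below. This is a generic finite-horizon fitted value iteration on a hypothetical two-state toy MDP, not the polynomial-time algorithm announced in the abstract. One-hot features over state-action pairs are assumed, a tabular special case in which linear Bellman completeness holds trivially; the dynamics, the horizon, and all function names are illustrative.

```python
STATES, ACTIONS, HORIZON = (0, 1), (0, 1), 3


def step(s, a):
    """Toy deterministic dynamics: action 0 stays, action 1 switches state.
    The reward is 1 for landing in state 1, else 0."""
    s_next = s if a == 0 else 1 - s
    return (1.0 if s_next == 1 else 0.0), s_next


def phi(s, a):
    """One-hot feature vector for the pair (s, a)."""
    v = [0.0] * (len(STATES) * len(ACTIONS))
    v[s * len(ACTIONS) + a] = 1.0
    return v


def q_value(theta, s, a):
    """Linear value estimate <theta, phi(s, a)>."""
    return sum(w * x for w, x in zip(theta, phi(s, a)))


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def least_squares(X, y, ridge=1e-9):
    """Least squares via the normal equations (X^T X + ridge * I) w = X^T y."""
    d = len(X[0])
    A = [[sum(x[i] * x[j] for x in X) + (ridge if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(x[i] * t for x, t in zip(X, y)) for i in range(d)]
    return solve(A, b)


def fitted_value_iteration():
    """Backward induction: at each step h, regress Bellman targets onto features."""
    data = [(s, a) + step(s, a) for s in STATES for a in ACTIONS]  # (s, a, r, s_next)
    d = len(phi(0, 0))
    thetas = [None] * HORIZON + [[0.0] * d]  # value after the final step is zero
    for h in reversed(range(HORIZON)):
        X = [phi(s, a) for s, a, _, _ in data]
        y = [r + max(q_value(thetas[h + 1], s2, a2) for a2 in ACTIONS)
             for _, _, r, s2 in data]
        thetas[h] = least_squares(X, y)
    return thetas


thetas = fitted_value_iteration()
greedy = {s: max(ACTIONS, key=lambda a: q_value(thetas[0], s, a)) for s in STATES}
```

Each regression here is well-specified because the Bellman backup of any linear function is again linear in the features, which is exactly what Bellman completeness guarantees. The sketch sidesteps exploration by regressing on every state-action pair, whereas the online setting studied in the abstract must collect its own data.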
Demystifying Higher-Order Graph Neural Networks
Besta, Maciej, Scheidl, Florian, Gianinazzi, Lukas, Klaiman, Shachar, Müller, Jürgen, Hoefler, Torsten
Higher-order graph neural networks (HOGNNs) are an important class of GNN models that harness polyadic relations between vertices beyond plain edges. They have been used to eliminate issues such as over-smoothing or over-squashing, to significantly enhance the accuracy of GNN predictions, to improve the expressiveness of GNN architectures, and for numerous other goals. A plethora of HOGNN models have been introduced, and they come with diverse neural architectures, and even with different notions of what "higher-order" means. This richness makes it very challenging to appropriately analyze and compare HOGNN models, and to decide in what scenario to use specific ones. To alleviate this, we first design an in-depth taxonomy and a blueprint for HOGNNs. This facilitates designing models that maximize performance. Then, we use our taxonomy to analyze and compare the available HOGNN models. The outcomes of our analysis are synthesized in a set of insights that help to select the most beneficial GNN model in a given scenario, and a comprehensive list of challenges and opportunities for further research into more powerful HOGNNs.
Generating Educational Materials with Different Levels of Readability using LLMs
Huang, Chieh-Yang, Wei, Jing, Huang, Ting-Hao 'Kenneth'
We assess the capability of GPT-3.5, LLaMA-2 70B, and Mixtral 8x7B, to generate content at various readability levels through zero-shot and few-shot prompting. Evaluating 100 processed educational materials reveals that few-shot prompting significantly improves performance in readability manipulation and information preservation. LLaMA-2 70B performs better in achieving the desired difficulty range, while GPT-3.5 maintains original meaning. However, manual inspection highlights concerns such as misinformation introduction and inconsistent edit distribution.
iterative editing to ensure that the revised texts meet the desired difficulty criteria. This readability assessment is based on various linguistic features, with sentence length and word frequency identified as key factors in previous studies [11]. Although this process appears straightforward, accurately adjusting these elements to achieve the target reading difficulty is challenging. This task becomes even more complex for young learners, where factors such as decodability [19], information load [15], and other elements
Unsupervised explainable activity prediction in competitive Nordic Walking from experimental data
García-Méndez, Silvia, de Arriba-Pérez, Francisco, González-Castaño, Francisco J., Vales-Alonso, Javier
Artificial Intelligence (AI) has found application in Human Activity Recognition (HAR) in competitive sports. To date, most Machine Learning (ML) approaches for HAR have relied on offline (batch) training, imposing higher computational and tagging burdens compared to online processing unsupervised approaches. Additionally, the decisions behind traditional ML predictors are opaque and require human interpretation. In this work, we apply an online processing unsupervised clustering approach based on low-cost wearable Inertial Measurement Units (IMUs). The outcomes generated by the system allow for the automatic expansion of the limited tagging available (e.g., by referees) within those clusters, producing pertinent information for the explainable classification stage. Specifically, our work focuses on achieving automatic explainability for predictions related to athletes' activities, distinguishing between correct, incorrect, and cheating practices in Nordic Walking. The proposed solution achieved performance metrics of close to 100% on average.
The significance of the configuration space Lie group for the constraint satisfaction in numerical time integration of multibody systems
Mueller, Andreas, Terze, Zdravko
The dynamics simulation of multibody systems (MBS) using spatial velocities (non-holonomic velocities) requires time integration of the dynamics equations together with the kinematic reconstruction equations (relating time derivatives of configuration variables to rigid body velocities). The latter are specific to the geometry of the rigid body motion underlying a particular formulation, and thus to the used configuration space (c-space). The proper c-space of a rigid body is the Lie group SE(3), and the geometry is that of the screw motions. The rigid bodies within a MBS are further subjected to geometric constraints, often due to lower kinematic pairs that define SE(3) subgroups. Traditionally, however, in MBS dynamics the translations and rotations are parameterized independently, which implies the use of the direct product group $SO(3) \times \mathbb{R}^{3}$ as rigid body c-space, although this does not account for rigid body motions. Hence, its appropriateness was recently put into perspective. In this paper the significance of the c-space for the constraint satisfaction in numerical time stepping schemes is analyzed for holonomically constrained MBS modeled with the 'absolute coordinate' approach, i.e. using the Newton-Euler equations for the individual bodies subjected to geometric constraints. It is shown that the geometric constraints a body is subjected to are exactly satisfied if they constrain the motion to a subgroup of its c-space. Since only the SE(3) subgroups have a practical significance it is regarded as the appropriate c-space for the constrained rigid body. Consequently the constraints imposed by lower pair joints are exactly satisfied if the joint connects a body to the ground. For a general MBS, where the motions are not constrained to a subgroup, the SE(3) and $SO(3) \times \mathbb{R}^{3}$ yield the same order of accuracy.
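The difference between the two c-spaces can be seen in their standard group multiplications. The notation below is an assumption for illustration and does not come from the abstract: a configuration is written as a pair $(R, r)$ with rotation matrix $R \in SO(3)$ and position vector $r \in \mathbb{R}^{3}$.

```latex
% Semidirect product SE(3): the rotation acts on the translation.
\[
  (R_1, r_1) \cdot (R_2, r_2) = (R_1 R_2,\; r_1 + R_1 r_2)
  \qquad \text{in } SE(3)
\]
% Direct product SO(3) x R^3: rotations and translations compose independently.
\[
  (R_1, r_1) \cdot (R_2, r_2) = (R_1 R_2,\; r_1 + r_2)
  \qquad \text{in } SO(3) \times \mathbb{R}^{3}
\]
```

Only the first rule couples rotation and translation, which is why it reproduces the composition of rigid body (screw) motions while the direct product does not.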