Instructional Material
COT: A Generative Approach for Hate Speech Counter-Narratives via Contrastive Optimal Transport
Zhang, Linhao, Jin, Li, Xu, Guangluan, Li, Xiaoyu, Sun, Xian
Counter-narratives, which are direct responses consisting of non-aggressive fact-based arguments, have emerged as a highly effective approach to combat the proliferation of hate speech. Previous methodologies have primarily focused on fine-tuning and post-editing techniques to ensure the fluency of generated content, while overlooking the critical aspects of individualization and relevance concerning the specific hatred targets, such as LGBT groups and immigrants. This paper introduces a novel framework based on contrastive optimal transport, which addresses the challenges of maintaining target interaction and promoting diversification in generating counter-narratives. First, an Optimal Transport Kernel (OTK) module is leveraged to incorporate hatred target information in the token representations, in which the comparison pairs are extracted between original and transported features. Second, a self-contrastive learning module is employed to address the issue of model degeneration by generating an anisotropic distribution of token representations. Finally, a target-oriented search method is integrated as an improved decoding strategy to explicitly promote domain relevance and diversification in the inference process. This strategy modifies the model's confidence score by considering both token similarity and target relevance. Quantitative and qualitative experiments on two benchmark datasets demonstrate that our proposed model significantly outperforms current methods on metrics covering multiple aspects.
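The decoding idea in the last step can be illustrated with a small sketch. The abstract does not give the scoring formula, so everything here is an assumption made for illustration: the function name `rescore`, the weights `alpha` and `beta`, the use of cosine similarity, and the additive form of the score. It only shows the general shape of adjusting a token's confidence by a similarity penalty and a target-relevance bonus.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def rescore(probs, cand_vecs, prev_vecs, target_vec, alpha=0.3, beta=0.3):
    """Adjust each candidate's model probability (hypothetical scoring rule).

    probs      : model confidence for each candidate token
    cand_vecs  : representation of each candidate token
    prev_vecs  : representations of tokens generated so far
    target_vec : representation of the hatred target
    """
    scores = []
    for p, v in zip(probs, cand_vecs):
        # Penalize candidates that closely resemble an earlier token.
        repetition = max(cosine(v, h) for h in prev_vecs)
        # Reward candidates whose representation is close to the target.
        relevance = cosine(v, target_vec)
        scores.append(p - alpha * repetition + beta * relevance)
    return np.array(scores)

# Two equally likely candidates: the first repeats the previous token,
# the second points toward the target representation.
probs = np.array([0.5, 0.5])
cand_vecs = np.array([[1.0, 0.0], [0.0, 1.0]])
prev_vecs = np.array([[1.0, 0.0]])
target_vec = np.array([0.0, 1.0])

scores = rescore(probs, cand_vecs, prev_vecs, target_vec)
chosen = int(np.argmax(scores))
```

In this toy case the second candidate is selected, because it is both dissimilar to the earlier token and aligned with the target vector.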
Foundation Models for Time Series Analysis: A Tutorial and Survey
Liang, Yuxuan, Wen, Haomin, Nie, Yuqi, Jiang, Yushan, Jin, Ming, Song, Dongjin, Pan, Shirui, Wen, Qingsong
Time series analysis stands as a focal point within the data mining community, serving as a cornerstone for extracting valuable insights crucial to a myriad of real-world applications. Recent advances in Foundation Models (FMs) have fundamentally reshaped the paradigm of model design for time series analysis, boosting various downstream tasks in practice. These innovative approaches often leverage pre-trained or fine-tuned FMs to harness generalized knowledge tailored for time series analysis. This survey aims to furnish a comprehensive and up-to-date overview of FMs for time series analysis. While prior surveys have predominantly focused on either application or pipeline aspects of FMs in time series analysis, they have often lacked an in-depth understanding of the underlying mechanisms that elucidate why and how FMs benefit time series analysis. To address this gap, our survey adopts a methodology-centric classification, delineating various pivotal elements of time-series FMs, including model architectures, pre-training techniques, adaptation methods, and data modalities. Overall, this survey serves to consolidate the latest advancements in FMs pertinent to time series analysis, accentuating their theoretical underpinnings, recent strides in development, and avenues for future exploration.
Windows 11's latest update is kind of insane, in a bad way
Microsoft has split the future of Windows between two user groups: those with AI-powered Copilot PCs and those without. Microsoft now says that the latest version of Windows 11--that's Windows 11 version 24H2--will only be offered to those with Copilot PCs. Microsoft said last Saturday that the company has now resumed rolling out Windows 11 version 24H2 to the Release Preview Channel with Build 26100.863. But in a support note, Microsoft adds: "Important: Windows 11, version 24H2 is only available for Copilot PCs devices." It's an odd choice for a company that has been mildly obsessed with migrating all of its users to a single code base.
How to regularize your regression
Can we learn how to set the regularization parameter from similar domain-specific data? Perhaps the simplest relation between a real dependent variable y and a vector of features x is a linear model, y = β·x. Given some training examples or datapoints consisting of pairs of features and dependent variables, we would like to learn the coefficient vector β that would give the best prediction given the features of an unseen example. This process of fitting a linear model to the datapoints is called linear regression. This simple yet effective model finds ubiquitous applications, ranging from biological, behavioral, and social sciences to environmental studies and financial forecasting, to make reliable predictions on future data.
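As a concrete illustration of what a regularization parameter does, here is a minimal sketch. It assumes ridge (L2) regularization and picks the penalty by held-out validation error; the excerpt names neither choice. The helper names `ridge_fit` and `pick_lambda`, the synthetic data, and the candidate penalties are all hypothetical.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression: minimizes ||X b - y||^2 + lam * ||b||^2."""
    n_features = X.shape[1]
    # Solve (X^T X + lam I) b = X^T y instead of forming an explicit inverse.
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

def pick_lambda(X_train, y_train, X_val, y_val, candidates):
    """Return the candidate penalty with the lowest validation mean squared error."""
    def val_mse(lam):
        beta = ridge_fit(X_train, y_train, lam)
        return float(np.mean((X_val @ beta - y_val) ** 2))
    return min(candidates, key=val_mse)

# Synthetic data: 80 datapoints, 3 features, small additive noise.
rng = np.random.default_rng(0)
true_beta = np.array([2.0, -1.0, 0.5])
X = rng.normal(size=(80, 3))
y = X @ true_beta + 0.1 * rng.normal(size=80)

candidates = [0.01, 0.1, 1.0, 10.0]
beta_hat = ridge_fit(X[:60], y[:60], lam=0.1)
best_lam = pick_lambda(X[:60], y[:60], X[60:], y[60:], candidates)
```

With `lam = 0` the fit reduces to ordinary least squares. Larger values shrink the coefficients toward zero, which is the trade-off the regularization parameter controls.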
A Personalised Learning Tool for Physics Undergraduate Students Built On a Large Language Model for Symbolic Regression
Zhu, Yufan, Khoo, Zi-Yu, Low, Jonathan Sze Choong, Bressan, Stephane
Interleaved practice enhances the memory and problem-solving ability of students in undergraduate courses. We introduce a personalized learning tool built on a Large Language Model (LLM) that can provide immediate and personalized attention to students as they complete homework containing problems interleaved from undergraduate physics courses. Our tool leverages the dimensional analysis method, enhancing students' qualitative thinking and problem-solving skills for complex phenomena. Our approach combines LLMs for symbolic regression with dimensional analysis via prompt engineering and offers students a unique perspective to comprehend relationships between physics variables. This fosters a broader and more versatile understanding of physics and mathematical principles and complements a conventional undergraduate physics education that relies on interpreting and applying established equations within specific contexts. We test our personalized learning tool on the equations from Feynman's lectures on physics. Our tool can correctly identify relationships between physics variables for most equations, underscoring its value as a complementary personalized learning tool for undergraduate physics students.
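The dimensional analysis step can be sketched on its own, without the LLM component. The example below is an illustration and not the authors' tool: the simple pendulum, the variable names, and the choice of mass, length, and time as base dimensions are assumptions. It solves a small linear system for the exponents that make a product of variables dimensionally consistent with the target quantity.

```python
import numpy as np

# Rows are the base dimensions (mass, length, time); columns are the
# candidate variables: pendulum length, bob mass, gravitational acceleration.
variables = ["length", "mass", "g"]
dimension_matrix = np.array([
    [0.0, 1.0, 0.0],   # mass exponent of each variable
    [1.0, 0.0, 1.0],   # length exponent of each variable
    [0.0, 0.0, -2.0],  # time exponent of each variable
])

# The target quantity, the period, has dimensions of time only.
period_dims = np.array([0.0, 0.0, 1.0])

# Solve for the exponents e such that the product of variable**e
# has the same dimensions as the period.
exponents = np.linalg.solve(dimension_matrix, period_dims)
relation = dict(zip(variables, exponents))
```

The solution assigns exponent 0.5 to length, 0 to mass, and -0.5 to g, so the period scales as sqrt(length / g). Dimensional analysis fixes only these exponents. The dimensionless prefactor has to come from elsewhere, for example from fitting data.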
GUICourse: From General Vision Language Models to Versatile GUI Agents
Chen, Wentong, Cui, Junbo, Hu, Jinyi, Qin, Yujia, Fang, Junjie, Zhao, Yue, Wang, Chongyi, Liu, Jun, Chen, Guirong, Huo, Yupeng, Yao, Yuan, Lin, Yankai, Liu, Zhiyuan, Sun, Maosong
Using a Graphical User Interface (GUI) for human-computer interaction is essential for accessing a wide range of digital tools. Recent advancements in Vision Language Models (VLMs) highlight the compelling potential to develop versatile agents to help humans finish GUI navigation tasks. However, current VLMs are challenged in terms of fundamental abilities (OCR and grounding) and GUI knowledge (the functions and control methods of GUI elements), preventing them from becoming practical GUI agents. To solve these challenges, we contribute GUICourse, a suite of datasets to train visual-based GUI agents from general VLMs. First, we introduce the GUIEnv dataset to strengthen the OCR and grounding capabilities of VLMs. Then, we introduce the GUIAct and GUIChat datasets to enrich their knowledge of GUI components and interactions. Experiments demonstrate that our GUI agents have better performance on common GUI tasks than their baseline VLMs. Even the small-size GUI agent (with 3.1B parameters) can still work well on single-step and multi-step GUI tasks. Finally, we analyze different variations of this agent's training stage through an ablation study. Our source code and datasets are released at https://github.com/yiye3/GUICourse.
Online Context Learning for Socially-compliant Navigation
Okunevich, Iaroslav, Lombard, Alexandre, Krajnik, Tomas, Ruichek, Yassine, Yan, Zhi
Robot social navigation needs to adapt to different human factors and environmental contexts. However, since these factors and contexts are difficult to predict and cannot be exhaustively enumerated, traditional learning-based methods have difficulty in ensuring the social attributes of robots in long-term and cross-environment deployments. This letter introduces an online context learning method that aims to empower robots to adapt to new social environments online. The proposed method adopts a two-layer structure. The bottom layer is built using a deep reinforcement learning-based method to ensure the output of basic robot navigation commands. The upper layer is implemented using an online robot learning-based method to socialize the control commands suggested by the bottom layer. Experiments using a community-wide simulator show that our method outperforms state-of-the-art methods. Experimental results in the most challenging scenarios show that our method improves the performance of the state-of-the-art by 8%. The source code of the proposed method, the data used, and the tools for the pre-training step will be publicly available at https://github.com/Nedzhaken/SOCSARL-OL.
GPT-Powered Elicitation Interview Script Generator for Requirements Engineering Training
Görer, Binnur, Aydemir, Fatma Başak
Elicitation interviews are the most common requirements elicitation technique, and proficiency in conducting these interviews is crucial for requirements elicitation. Traditional training methods, typically limited to textbook learning, may not sufficiently address the practical complexities of interviewing techniques. Practical training with various interview scenarios is important for understanding how to apply theoretical knowledge in real-world contexts. However, there is a shortage of educational interview material, as creating interview scripts requires both technical expertise and creativity. To address this issue, we develop a specialized GPT agent for auto-generating interview scripts. The GPT agent is equipped with a dedicated knowledge base tailored to the guidelines and best practices of requirements elicitation interview procedures. We employ a prompt chaining approach to work around the output length constraint of GPT and generate thorough and detailed interview scripts. This involves dividing the interview into sections and crafting distinct prompts for each, allowing for the generation of complete content for each section. The generated scripts are assessed through standard natural language generation evaluation metrics and an expert judgment study, confirming their applicability in requirements engineering training.
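The prompt chaining idea, one prompt per interview section, can be sketched as follows. This is a minimal illustration and not the authors' agent: the section names, the prompt wording, the `context_chars` parameter, and the `fake_llm` stand-in are all hypothetical. A real setup would replace `fake_llm` with a call to an actual model.

```python
from typing import Callable, List

def generate_script(sections: List[str],
                    llm: Callable[[str], str],
                    context_chars: int = 500) -> str:
    """Generate one section per prompt, passing the tail of the previous output as context."""
    parts: List[str] = []
    previous_tail = ""
    for name in sections:
        prompt = (
            f"Write the '{name}' section of a requirements elicitation interview script.\n"
            f"Continue from this earlier text (may be empty):\n{previous_tail}"
        )
        text = llm(prompt)
        parts.append(f"{name}:\n{text}")
        # Keep only the end of the output so each prompt stays short.
        previous_tail = text[-context_chars:]
    return "\n\n".join(parts)

def fake_llm(prompt: str) -> str:
    """Stand-in model: echoes the first prompt line so the sketch runs offline."""
    return "[generated for] " + prompt.splitlines()[0]

# Hypothetical section list for one interview.
SECTIONS = ["Opening", "Main questions", "Closing"]
script = generate_script(SECTIONS, fake_llm)
```

Because each call produces a single section, no individual response has to fit the whole script, which is how chaining sidesteps a per-response length limit.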
The Benefits of Power Regularization in Cooperative Reinforcement Learning
Cooperative Multi-Agent Reinforcement Learning (MARL) algorithms, trained only to optimize task reward, can lead to a concentration of power where the failure or adversarial intent of a single agent could decimate the reward of every agent in the system. In the context of teams of people, it is often useful to explicitly consider how power is distributed to ensure no person becomes a single point of failure. Here, we argue that explicitly regularizing the concentration of power in cooperative RL systems can result in systems that are more robust to single-agent failure, adversarial attacks, and incentive changes of co-players. To this end, we define a practical pairwise measure of power that captures the ability of any co-player to influence the ego agent's reward, and then propose a power-regularized objective which balances task reward and power concentration. Given this new objective, we show that there always exists an equilibrium where every agent is playing a power-regularized best-response balancing power and task reward. Moreover, we present two algorithms for training agents towards this power-regularized objective: Sample Based Power Regularization (SBPR), which injects adversarial data during training; and Power Regularization via Intrinsic Motivation (PRIM), which adds an intrinsic motivation to regulate power to the training objective. Our experiments demonstrate that both algorithms successfully balance task reward and power, leading to lower-power behavior than the baseline of task-only reward and avoiding catastrophic events in case an agent in the system goes off-policy.
ART: Automatic Red-teaming for Text-to-Image Models to Protect Benign Users
Li, Guanlin, Chen, Kangjie, Zhang, Shudong, Zhang, Jie, Zhang, Tianwei
Large-scale pre-trained generative models are taking the world by storm, due to their abilities in generating creative content. Meanwhile, safeguards for these generative models are being developed to protect users' rights and safety, most of which are designed for large language models. Existing methods primarily focus on jailbreak and adversarial attacks, which mainly evaluate the model's safety under malicious prompts. Recent work found that manually crafted safe prompts can unintentionally trigger unsafe generations. To further systematically evaluate the safety risks of text-to-image models, we propose a novel Automatic Red-Teaming framework, ART. Our method leverages both a vision language model and a large language model to establish a connection between unsafe generations and their prompts, thereby more efficiently identifying the model's vulnerabilities. With our comprehensive experiments, we reveal the toxicity of the popular open-source text-to-image models. The experiments also validate the effectiveness, adaptability, and great diversity of ART. Additionally, we introduce three large-scale red-teaming datasets for studying the safety risks associated with text-to-image models. Datasets and models can be found at https://github.com/GuanlinLee/ART.