Banff
Pointing out the Shortcomings of Relation Extraction Models with Semantically Motivated Adversarials
Nolano, Gennaro, Blum, Moritz, Ell, Basil, Cimiano, Philipp
In recent years, large language models have achieved state-of-the-art performance across various NLP tasks. However, investigations have shown that these models tend to rely on shortcut features, leading to inaccurate predictions and making the models unreliable when generalizing to out-of-distribution (OOD) samples. For instance, in the context of relation extraction (RE), we would expect a model to identify the same relation independently of the entities involved in it. For example, consider the sentence "Leonardo da Vinci painted the Mona Lisa" expressing the created(Leonardo da Vinci, Mona Lisa) relation. If we substitute "Leonardo da Vinci" with "Barack Obama", then the sentence still expresses the created relation. A robust model is supposed to detect the same relation in both cases. In this work, we describe several semantically motivated strategies to generate adversarial examples by replacing entity mentions and investigate how state-of-the-art RE models perform under pressure. Our analyses show that the performance of these models significantly deteriorates on the modified datasets (an average of -48.5% in F1), which indicates that these models rely to a great extent on shortcuts, such as surface forms (or patterns therein) of entities, without making full use of the information present in the sentences.
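The entity-replacement idea described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `RelationExample` class, the `replace_head` function, and the token-span representation are all assumptions introduced here, and the sketch covers only the simplest case of swapping the head mention while keeping the relation label fixed.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RelationExample:
    """Hypothetical container for one RE sample (not from the paper)."""
    tokens: Tuple[str, ...]
    head_span: Tuple[int, int]  # token indices, end exclusive
    tail_span: Tuple[int, int]  # token indices, end exclusive
    relation: str


def replace_head(example: RelationExample, new_mention: str) -> RelationExample:
    """Swap the head entity mention and keep the relation label unchanged."""
    start, end = example.head_span
    new_tokens = tuple(new_mention.split())
    tokens = example.tokens[:start] + new_tokens + example.tokens[end:]

    # Shift the tail span if it lies after the replaced head mention.
    shift = len(new_tokens) - (end - start)
    tail_start, tail_end = example.tail_span
    if tail_start >= end:
        tail_start, tail_end = tail_start + shift, tail_end + shift

    return RelationExample(
        tokens=tokens,
        head_span=(start, start + len(new_tokens)),
        tail_span=(tail_start, tail_end),
        relation=example.relation,
    )


original = RelationExample(
    tokens=tuple("Leonardo da Vinci painted the Mona Lisa".split()),
    head_span=(0, 3),
    tail_span=(5, 7),
    relation="created",
)
adversarial = replace_head(original, "Barack Obama")
```

A robust model would be expected to predict `created` for both `original` and `adversarial`, since only the surface form of the head entity changes.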
Optimal Zero-Shot Detector for Multi-Armed Attacks
Granese, Federica, Romanelli, Marco, Piantanida, Pablo
This paper explores a scenario in which a malicious actor employs a multi-armed attack strategy to manipulate data samples, offering them various avenues to introduce noise into the dataset. Our central objective is to protect the data by detecting any alterations to the input. We approach this defensive strategy with utmost caution, operating in an environment where the defender possesses significantly less information compared to the attacker. Specifically, the defender is unable to utilize any data samples for training a defense model or verifying the integrity of the channel. Instead, the defender relies exclusively on a set of pre-existing detectors
Defending signal communication from attackers is a fundamental problem in information theory (Karlof and Wagner, 2003; Perrig et al., 2004). Notably, some attacks are aimed at the physical layer of the communication channel, which is responsible for transmitting the signal. The goal of such attacks is to generate a denial of service (DoS), which involves disrupting legitimate communication by causing intentional malfunction of the communication channel (Grover et al., 2014). In a typical input perturbation scenario, a malicious actor is allowed to detect and alter the signal before it reaches the communication channel (Sadeghi and Larsson, 2019; Tian et al., 2022). The interest in such attacks has been exacerbated by the growing popularity of machine learning (ML) models, which are known to be vulnerable to adversarial attacks (Goodfellow et al., 2014).
A Robust Defense against Adversarial Attacks on Deep Learning-based Malware Detectors via (De)Randomized Smoothing
Gibert, Daniel, Zizzo, Giulio, Le, Quan, Planes, Jordi
Deep learning-based malware detectors have been shown to be susceptible to adversarial malware examples, i.e., malware examples that have been deliberately manipulated in order to avoid detection. In light of the vulnerability of deep learning detectors to subtle input file modifications, we propose a practical defense against adversarial malware examples inspired by (de)randomized smoothing. In this work, we reduce the chances of sampling adversarial content injected by malware authors by selecting correlated subsets of bytes, rather than using Gaussian noise to randomize inputs as in the Computer Vision (CV) domain. During training, our ablation-based smoothing scheme trains a base classifier to make classifications on a subset of contiguous bytes, or chunk of bytes. At test time, a large number of chunks are classified by the base classifier, and the consensus among these classifications is reported as the final prediction. We propose two strategies to determine the location of the chunks used for classification: (1) randomly selecting the locations of the chunks and (2) selecting contiguous adjacent chunks. To showcase the effectiveness of our approach, we have trained two classifiers with our chunk-based ablation schemes on the BODMAS dataset. Our findings reveal that the chunk-based smoothing classifiers exhibit greater resilience against adversarial malware examples generated with state-of-the-art evasion attacks, outperforming a non-smoothed classifier and a randomized smoothing-based classifier by a great margin.
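The test-time procedure described above (classify many chunks, then report the consensus) can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function names are hypothetical, the consensus is assumed to be a simple majority vote, and `toy_classifier` is a stand-in for a trained base classifier.

```python
import random
from collections import Counter
from typing import Callable, List


def random_chunks(data: bytes, chunk_size: int, num_chunks: int,
                  rng: random.Random) -> List[bytes]:
    """Strategy (1): chunks taken at randomly selected locations."""
    if len(data) <= chunk_size:
        return [data]
    last_start = len(data) - chunk_size
    starts = [rng.randint(0, last_start) for _ in range(num_chunks)]
    return [data[s:s + chunk_size] for s in starts]


def adjacent_chunks(data: bytes, chunk_size: int) -> List[bytes]:
    """Strategy (2): contiguous, adjacent chunks covering the whole file."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def smoothed_predict(chunks: List[bytes],
                     base_classifier: Callable[[bytes], int]) -> int:
    """Classify every chunk and return the majority label (assumed consensus rule)."""
    if not chunks:
        raise ValueError("at least one chunk is required")
    votes = Counter(base_classifier(chunk) for chunk in chunks)
    return votes.most_common(1)[0][0]


def toy_classifier(chunk: bytes) -> int:
    """Stand-in base classifier: label 1 if a made-up marker is present, else 0."""
    return 1 if b"EVIL" in chunk else 0


# Eight marked chunks followed by two benign-looking padding chunks.
sample = b"EVIL" * 8 + b"pad!" * 2
label_adjacent = smoothed_predict(adjacent_chunks(sample, 4), toy_classifier)
label_random = smoothed_predict(
    random_chunks(b"EVIL" * 8, 8, 5, random.Random(0)), toy_classifier
)
```

Because each vote depends only on one chunk, content injected into a small region of the file can influence only the chunks that overlap it, which is the intuition behind the robustness claim.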
Link Prediction under Heterophily: A Physics-Inspired Graph Neural Network Approach
Di Francesco, Andrea Giuseppe, Caso, Francesco, Bucarelli, Maria Sofia, Silvestri, Fabrizio
In recent years, Graph Neural Networks (GNNs) have become the 'de facto' standard in various deep learning domains, thanks to their flexibility in modeling real-world phenomena represented as graphs. However, the message-passing mechanism of GNNs faces challenges in learnability and expressivity, hindering high performance on heterophilic graphs, where adjacent nodes frequently have different labels. Most existing solutions addressing these challenges are primarily confined to specific benchmarks focused on node classification tasks. This narrow focus restricts the potential impact that link prediction under heterophily could offer in several applications, including recommender systems. For example, in social networks, two users may be connected for some latent reason, making it challenging to predict such connections in advance. Physics-Inspired GNNs such as GRAFF have contributed significantly to enhancing node classification performance under heterophily, thanks to the adoption of physics biases in the message-passing. Drawing inspiration from these findings, we advocate that the methodology employed by GRAFF can improve link prediction performance as well. To further explore this hypothesis, we introduce GRAFF-LP, an extension of GRAFF to link prediction. We evaluate its efficacy within a recent collection of heterophilic graphs, establishing a new benchmark for link prediction under heterophily. Our approach surpasses previous methods on most of the datasets, showcasing strong flexibility in different contexts and achieving relative AUROC improvements of up to 26.7%.
Coercing LLMs to do and reveal (almost) anything
Geiping, Jonas, Stein, Alex, Shu, Manli, Saifullah, Khalid, Wen, Yuxin, Goldstein, Tom
It has recently been shown that adversarial attacks on large language models (LLMs) can 'jailbreak' the model into making harmful statements. In this work, we argue that the spectrum of adversarial attacks on LLMs is much larger than merely jailbreaking. We provide a broad overview of possible attack surfaces and attack goals. Based on a series of concrete examples, we discuss, categorize and systematize attacks that coerce varied unintended behaviors, such as misdirection, model control, denial-of-service, or data extraction. We analyze these attacks in controlled experiments, and find that many of them stem from the practice of pre-training LLMs with coding capabilities, as well as the continued existence of strange 'glitch' tokens in common LLM vocabularies that should be removed for security reasons. We conclude that the spectrum of adversarial attacks on LLMs is much broader than previously thought, and that the security of these models must be addressed through a comprehensive understanding of their capabilities and limitations. Some figures and tables below contain profanity or offensive text.
Autonomous Reality Modelling for Cultural Heritage Sites employing cooperative quadrupedal robots and unmanned aerial vehicles
Giakoumidis, Nikolaos, Anagnostopoulos, Christos-Nikolaos
Problem statement: In recent years, Reality Modeling (RM) technologies, including cutting-edge sensor technologies like Terrestrial and Aerial (using drones) Laser Scanning, have found a wide and prominent purpose in the field of Cultural Heritage (CH) modeling, recording and management. However, RM of CH is a constant challenge for surveyors, since it is a manually driven, laborious and time-consuming process. The scanning path and the sensor's positioning mostly depend on the surveyor's experience, intuition and perception, since an automatic and systematic procedure does not exist. Taking into consideration the natural environment that surrounds CH sites, the challenges become even more complex. Specifically, for the acquisition of a complete 3D Reality model of a large-scale cultural space, multiple manual terrestrial laser scans (TLS) and aerial scans with UAVs (drones) must be performed. In this manual procedure, the scanning path/strategy and the identification of the scanner position, known in the literature as the Next Best View (NBV) task, mostly depend on the operator's experience and perception. As a result, optimizing the NBV problem to efficiently capture large-scale complex sites or monuments in dynamic environments (e.g., due to growing or changing vegetation) is important for minimizing surveying time and scanning cost. Although the NBV problem is crucial, efficiency and optimality have not been considered qualitatively and explicitly so far in the literature. Thus, in some cases, the surveying process takes longer than necessary, since some regions are overlapped unnecessarily and extra positionings are planned just to be on the safe side.
0378c7692da36807bdec87ab043cdadc-Supplemental-Datasets_and_Benchmarks.pdf
While deep learning has enabled tremendous progress on text and image datasets, its superiority on tabular data is not clear. We contribute extensive benchmarks of standard and novel deep learning methods as well as tree-based models such as XGBoost and Random Forests, across a large number of datasets and hyperparameter combinations. We define a standard set of 45 datasets from varied domains with clear characteristics of tabular data and a benchmarking methodology accounting for both fitting models and finding good hyperparameters. Results show that tree-based models remain state-of-the-art on medium-sized data (~10K samples) even without accounting for their superior speed. To understand this gap, we conduct an empirical investigation into the differing inductive biases of tree-based models and neural networks. This leads to a series of challenges which should guide researchers aiming to build tabular-specific neural networks: 1. be robust to uninformative features, 2. preserve the orientation of the data, and 3. be able to easily learn irregular functions. To stimulate research on tabular architectures, we contribute a standard benchmark and raw data for baselines: every point of a 20,000 compute-hour hyperparameter search for each learner.
Quantized Embedding Vectors for Controllable Diffusion Language Models
Kang, Cheng, Chen, Xinye, Hu, Yong, Novak, Daniel
Improving the controllability, portability, and inference speed of diffusion language models (DLMs) is a key challenge in natural language generation. While recent research has shown significant success in complex text generation with language models, their memory and computational requirements remain very demanding, which naturally results in low portability and instability for the models. To mitigate these issues, numerous well-established methods have been proposed for neural network quantization. To further enhance the portability of these models for independent deployment and to improve their stability, as measured by language perplexity, we propose a novel approach called the Quantized Embedding Controllable Diffusion Language Model (QE-CDLM). QE-CDLM builds upon recent successful controllable DLMs by remodeling the task-specific embedding space via quantization. This leads to a gradient-based controller for the generation tasks and yields more stable intermediate latent variables, which naturally brings accelerated convergence as well as better controllability. Additionally, the adaption fine-tuning method is employed to reduce the number of tunable weights. Experimental results on five challenging fine-grained control tasks demonstrate that QE-CDLM compares favorably to existing methods in terms of quality and feasibility, achieving better perplexity and lightweight fine-tuning.
EntailE: Introducing Textual Entailment in Commonsense Knowledge Graph Completion
Su, Ying, Fang, Tianqing, Xiao, Huiru, Wang, Weiqi, Song, Yangqiu, Zhang, Tong, Chen, Lei
Commonsense knowledge graph completion is a new challenge for commonsense knowledge graph construction and application. In contrast to factual knowledge graphs such as Freebase and YAGO, commonsense knowledge graphs (CSKGs; e.g., ConceptNet) utilize free-form text to represent named entities, short phrases, and events as their nodes. Such a loose structure results in large and sparse CSKGs, which makes the semantic understanding of these nodes more critical for learning rich commonsense knowledge graph embeddings. While current methods leverage semantic similarities to increase the graph density, the semantic plausibility of the nodes and their relations is under-explored. Previous works adopt conceptual abstraction to improve the consistency of modeling (event) plausibility, but they are not scalable enough and still suffer from data sparsity. In this paper, we propose to adopt textual entailment to find implicit entailment relations between CSKG nodes, to effectively densify the subgraph connecting nodes within the same conceptual class, which indicates a similar level of plausibility. Each node in the CSKG finds its top entailed nodes using a transformer fine-tuned on natural language inference (NLI) tasks, which sufficiently captures textual entailment signals. The entailment relations between these nodes are further utilized to: 1) build new connections between source triplets and entailed nodes to densify the sparse CSKGs; 2) enrich the generalization ability of node representations by comparing the node embeddings with a contrastive loss. Experiments on two standard CSKGs demonstrate that our proposed framework EntailE can improve the performance of CSKG completion tasks under both transductive and inductive settings.