Wójcik, Bartosz
Maybe I Should Not Answer That, but... Do LLMs Understand The Safety of Their Inputs?
Chrabąszcz, Maciej, Szatkowski, Filip, Wójcik, Bartosz, Dubiński, Jan, Trzciński, Tomasz
Ensuring the safety of Large Language Models (LLMs) is critical, but currently used methods in most cases sacrifice model performance to obtain increased safety, or perform poorly on data outside of their adaptation distribution. We investigate existing methods for such generalization and find them insufficient. Surprisingly, while even plain LLMs recognize unsafe prompts, they may still generate unsafe responses. To avoid this performance degradation, we advocate a two-step framework: we first identify unsafe prompts with a lightweight classifier, and apply a "safe" model only to those prompts. In particular, we explore the design of the safety detector in more detail, investigating the use of different classifier architectures and prompting techniques. Interestingly, we find that the final hidden state of the last token is enough to provide robust performance, minimizing false positives on benign data while performing well on malicious prompt detection. Additionally, we show that classifiers trained on representations from earlier model layers perform comparably to those trained on the final layers, indicating that a safety representation is present in the LLMs' hidden states at most stages of the model. Our work is a step towards efficient, representation-based safety mechanisms for LLMs.
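A minimal sketch of the two-step framework, under illustrative assumptions: a lightweight linear probe scores the final hidden state of the last token as unsafe or not, and only flagged prompts are routed to a "safe" model. The names (LastTokenSafetyProbe, route), the probe architecture, and the fixed-threshold routing rule are my assumptions, not the paper's exact setup.

```python
import torch
import torch.nn as nn

class LastTokenSafetyProbe(nn.Module):
    """Lightweight classifier over the final hidden state of the last token."""
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, hidden_states: torch.Tensor, lengths: torch.Tensor):
        # hidden_states: (batch, seq_len, hidden_dim) from a chosen LLM layer
        last = hidden_states[torch.arange(hidden_states.size(0)), lengths - 1]
        return torch.sigmoid(self.head(last)).squeeze(-1)  # P(unsafe) per prompt

def route(prompt_hidden, lengths, probe, base_model, safe_model, threshold=0.5):
    # Two-step framework: only prompts flagged as unsafe go to the "safe" model,
    # so benign inputs keep the base model's unmodified performance.
    p_unsafe = probe(prompt_hidden, lengths)
    return [safe_model if p > threshold else base_model for p in p_unsafe]
```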
Adaptive Computation Modules: Granular Conditional Computation For Efficient Inference
Wójcik, Bartosz, Devoto, Alessio, Pustelnik, Karol, Minervini, Pasquale, Scardapane, Simone
The computational cost of transformer models makes them inefficient in low-latency or low-power applications. While techniques such as quantization or linear attention can reduce the computational load, they may incur a reduction in accuracy. In addition, globally reducing the cost for all inputs may be sub-optimal. We observe that for each layer, the full width of the layer may be needed only for a small subset of tokens inside a batch, and that the "effective" width needed to process a token can vary from layer to layer. Motivated by this observation, we introduce the Adaptive Computation Module (ACM), a generic module that dynamically adapts its computational load to match the estimated difficulty of the input on a per-token basis. An ACM consists of a sequence of learners that progressively refine the output of their preceding counterparts. An additional gating mechanism determines the optimal number of learners to execute for each token. We also describe a distillation technique to replace any pre-trained model with an "ACMized" variant. The distillation phase is designed to be highly parallelizable across layers while remaining simple to plug into existing networks. Our evaluation of transformer models in computer vision and speech recognition demonstrates that substituting layers with ACMs significantly reduces inference costs without degrading downstream accuracy for a wide interval of user-defined budgets.
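A hedged sketch of the ACM idea, under several assumptions: each learner is a small two-layer MLP, the gate picks the number of learners per token via an argmax over its logits, and for clarity all learners are computed densely here (a real implementation would batch tokens by their learner count to actually save compute). None of these choices are claimed to match the paper's exact design.

```python
import torch
import torch.nn as nn

class ACM(nn.Module):
    def __init__(self, dim: int, num_learners: int = 4, hidden: int = 128):
        super().__init__()
        self.learners = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))
            for _ in range(num_learners)
        ])
        self.gate = nn.Linear(dim, num_learners)  # logits over "how many learners"

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, dim); the gate picks k in {1..num_learners} per token
        k = self.gate(x).argmax(dim=-1) + 1
        out = torch.zeros_like(x)
        for i, learner in enumerate(self.learners):
            active = (k > i).float().unsqueeze(-1)  # tokens that still refine
            out = out + active * learner(x + out)   # progressive refinement
        return out
```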
Exploiting Transformer Activation Sparsity with Dynamic Inference
Piórczyński, Mikołaj, Szatkowski, Filip, Bałazy, Klaudia, Wójcik, Bartosz
Transformer models, despite their impressive performance, often face practical limitations due to their high computational requirements. At the same time, previous studies have revealed significant activation sparsity in these models, indicating the presence of redundant computations. In this paper, we propose Dynamic Sparsified Transformer Inference (DSTI), a method that radically reduces the inference cost of Transformer models by enforcing activation sparsity and subsequently transforming a dense model into its sparse Mixture of Experts (MoE) version. We demonstrate that it is possible to train small gating networks that successfully predict the relative contribution of each expert during inference. Furthermore, we introduce a mechanism that dynamically determines the number of executed experts individually for each token. DSTI can be applied to any Transformer-based architecture and has negligible impact on accuracy. For the BERT-base classification model, we reduce inference cost by almost 60%.
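To make the per-token mechanism concrete, here is a hedged sketch of a sparse-MoE feed-forward layer: a small gate predicts each expert's relative contribution, and each token executes only the smallest set of experts covering a fixed fraction tau of the predicted contribution mass. The cumulative-mass rule, the sizes, and the name DynamicMoEFFN are assumptions for illustration; the paper's exact gating and selection criteria may differ.

```python
import torch
import torch.nn as nn

class DynamicMoEFFN(nn.Module):
    def __init__(self, dim, num_experts=8, expert_hidden=256, tau=0.9):
        super().__init__()
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, expert_hidden), nn.ReLU(),
                          nn.Linear(expert_hidden, dim))
            for _ in range(num_experts)
        ])
        self.gate = nn.Linear(dim, num_experts)  # predicts expert contributions
        self.tau = tau

    def forward(self, x):  # x: (num_tokens, dim)
        scores = torch.softmax(self.gate(x), dim=-1)
        order = scores.argsort(dim=-1, descending=True)
        csum = scores.gather(-1, order).cumsum(dim=-1)
        # keep the smallest prefix of experts covering tau of the predicted mass
        keep = (csum - scores.gather(-1, order)) < self.tau
        mask = torch.zeros_like(scores).scatter(-1, order, keep.float())
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            sel = mask[:, i].bool()
            if sel.any():  # run each expert only on the tokens that selected it
                out[sel] += scores[sel, i:i + 1] * expert(x[sel])
        return out
```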
Face Identity-Aware Disentanglement in StyleGAN
Suwała, Adrian, Wójcik, Bartosz, Proszewska, Magdalena, Tabor, Jacek, Spurek, Przemysław, Śmieja, Marek
Conditional GANs are frequently used for manipulating the attributes of face images, such as expression, hairstyle, pose, or age. Even though state-of-the-art models successfully modify the requested attributes, they simultaneously modify other important characteristics of the image, such as a person's identity. In this paper, we focus on solving this problem by introducing PluGeN4Faces, a plugin to StyleGAN, which explicitly disentangles face attributes from a person's identity. Our key idea is to perform training on images retrieved from movie frames, where a given person appears in various poses and with different attributes. By applying a type of contrastive loss, we encourage the model to group images of the same person in similar regions of the latent space. Our experiments demonstrate that the modifications of face attributes performed by PluGeN4Faces are significantly less invasive to the remaining characteristics of the image than those of existing state-of-the-art models.
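The contrastive idea, loosely sketched: latent codes of the same person (across frames, poses, and attributes) are pulled together, while codes of different identities are pushed at least a margin apart. The hinge form below and the margin value are common-default assumptions, not necessarily the exact loss used by PluGeN4Faces.

```python
import torch
import torch.nn.functional as F

def identity_contrastive_loss(latents, person_ids, margin=1.0):
    # latents: (batch, latent_dim); person_ids: (batch,) integer identity labels.
    # Assumes each batch contains at least one same-person pair.
    dist = torch.cdist(latents, latents)                  # pairwise distances
    same = person_ids.unsqueeze(0) == person_ids.unsqueeze(1)
    eye = torch.eye(len(latents), dtype=torch.bool, device=latents.device)
    pos = dist[same & ~eye].pow(2).mean()                 # same identity: close
    neg = F.relu(margin - dist[~same]).pow(2).mean()      # different: far apart
    return pos + neg
```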
Computer Vision based inspection on post-earthquake with UAV synthetic dataset
Żarski, Mateusz, Wójcik, Bartosz, Miszczak, Jarosław A., Blachowski, Bartłomiej, Ostrowski, Mariusz
Earthquakes are sudden and violent disasters that affect huge areas of land in a very short period of time. They have been known to mankind since ancient times and invariably pose one of the most serious threats to the lives of people concentrated in large cities. The scale of their destructive power can be seen in the nearly two million earthquake victims in the 20th century alone [1], or in the most devastating single events, which claimed nearly a million lives [2]. At the same time, the map of seismically active areas largely overlaps with densely populated areas, particularly in North America, Europe, and Asia [3], which focuses researchers' attention on this type of hazard and on methods of its mitigation. Studies conducted to date have assessed the effects of earthquakes both in terms of the impact on housing and infrastructure and in terms of the performance of public services in repairing damage or improving traffic flow in the affected area [4, 5]. These works have led to concepts of cities in which such events would no longer have a critical impact on the lives of residents, but at the cost of monitoring the condition of structures even after seemingly harmless, small earthquakes, so that corrective action can be taken immediately after damage occurs [6]. This, however, requires modern methods of structural monitoring that reduce the labor intensity of the entire process, without which the end goal is impossible to achieve. In this paper, we present our step towards building autonomous systems that can bring this goal closer.
Hard hat wearing detection based on head keypoint localization
Wójcik, Bartosz, Żarski, Mateusz, Książek, Kamil, Miszczak, Jarosław Adam, Skibniewski, Mirosław Jan
In recent years, much attention has been paid to deep learning methods in the context of vision-based construction site safety systems, especially regarding personal protective equipment. Despite this attention, however, there is still no reliable way to establish the relationship between workers and their hard hats. To address this problem, this article proposes a combination of deep learning methods, object detection and head keypoint localization, with simple rule-based reasoning. In tests, this solution surpassed previous methods based on the relative bounding box positions of different instances, as well as on the direct detection of hard hat wearers and non-wearers. The results show that combining novel deep learning methods with human-interpretable rule-based systems can yield a solution that is both reliable and able to successfully mimic manual, on-site supervision. This work is the next step in the development of fully autonomous construction site safety systems and shows that there is still room for improvement in this area.
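The rule-based part can be as simple as checking whether a worker's detected head keypoint falls inside (or just below) some detected hard-hat bounding box. The sketch below is an illustrative reconstruction of that kind of rule; the threshold and the exact geometry are assumptions, not the rules from the paper.

```python
def wears_hard_hat(head_keypoint, hat_boxes, max_offset=0.5):
    """Return True if the head keypoint lies inside (or slightly below) any
    detected hard-hat box. Illustrative rule-based check, not the paper's
    exact rule set; max_offset is a hypothetical tolerance."""
    x, y = head_keypoint
    for (x1, y1, x2, y2) in hat_boxes:
        # allow the keypoint to sit a bit below the box, since the head
        # keypoint may be localized under the hat itself
        margin = max_offset * (y2 - y1)
        if x1 <= x <= x2 and y1 <= y <= y2 + margin:
            return True
    return False
```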
Adversarial Examples Detection and Analysis with Layer-wise Autoencoders
Wójcik, Bartosz, Morawiecki, Paweł, Śmieja, Marek, Krzyżek, Tomasz, Spurek, Przemysław, Tabor, Jacek
We present a mechanism for detecting adversarial examples based on data representations taken from the hidden layers of the target network. For this purpose, we train individual autoencoders at intermediate layers of the target network. This allows us to describe the manifold of true data and, in consequence, decide whether a given example has the same characteristics as true data. It also gives us insight into the behavior of adversarial examples and their flow through the layers of a deep neural network. Experimental results show that our method outperforms the state of the art in supervised and unsupervised settings.
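A minimal sketch of the mechanism, with illustrative sizes: one small autoencoder is trained per monitored layer on clean activations, and at test time an input is flagged as adversarial if any layer's reconstruction error exceeds a threshold calibrated on clean data. The any-layer OR aggregation shown here is one plausible choice; the paper's detector may combine layers differently.

```python
import torch
import torch.nn as nn

class LayerAE(nn.Module):
    """One autoencoder per monitored hidden layer (sizes are illustrative)."""
    def __init__(self, dim, bottleneck=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, bottleneck), nn.ReLU())
        self.dec = nn.Linear(bottleneck, dim)

    def reconstruction_error(self, h):  # h: (batch, dim) layer activations
        return (self.dec(self.enc(h)) - h).pow(2).mean(dim=-1)

def is_adversarial(layer_activations, autoencoders, thresholds):
    # Flag an input if any layer's activations fall off the manifold of true
    # data, i.e. the per-layer reconstruction error is too high. Thresholds
    # would be calibrated on clean validation activations.
    flags = [ae.reconstruction_error(h) > t
             for h, ae, t in zip(layer_activations, autoencoders, thresholds)]
    return torch.stack(flags, dim=0).any(dim=0)  # (batch,) boolean verdicts
```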
One-element Batch Training by Moving Window
Spurek, Przemysław, Knop, Szymon, Tabor, Jacek, Podolak, Igor, Wójcik, Bartosz
Several deep models, especially generative ones, compare samples from two distributions in their cost functions (e.g., WAE-like autoencoder models, set-processing deep networks, etc.). With all these methods, one cannot train the model directly on small (in the extreme, one-element) batches, because batches of samples have to be compared against each other. We propose a generic approach to training such models using one-element mini-batches. The idea is based on splitting the batch in the latent space into two parts: previous, i.e. historical, elements, used for matching the latent space distribution, and the current ones, used both for the latent distribution computation and for the minimization process. Due to the smaller memory requirements, this allows training networks on higher-resolution images than the classical approach.
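A rough sketch of the moving-window mechanics, under the assumption that the window stores detached latent codes from previous steps: the distribution-matching term sees the concatenation of history and the current element (so it behaves like a large batch), while gradients flow only through the current element. The buffer size and the name LatentWindow are illustrative.

```python
import torch

class LatentWindow:
    """Moving window over latent codes for one-element-batch training."""
    def __init__(self, size=256):
        self.size, self.buffer = size, []

    def extend(self, z_current):
        # current codes keep gradients; history is stored without them
        zs = self.buffer + [z_current]
        self.buffer = (self.buffer + [z_current.detach()])[-self.size:]
        return torch.cat(zs, dim=0)  # used for the distribution-matching term
```

In a WAE-style training loop, the latent regularizer would be evaluated on window.extend(z), while the reconstruction term uses only the current element.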
LOSSGRAD: automatic learning rate in gradient descent
Wójcik, Bartosz, Maziarka, Łukasz, Tabor, Jacek
In this paper, we propose a simple, fast, and easy-to-implement algorithm, LOSSGRAD (locally optimal step-size in gradient descent), which automatically adapts the step size in gradient descent during neural network training. Given a function $f$, a point $x$, and the gradient $\nabla_x f$ of $f$, we aim to find the step size $h$ which is (locally) optimal, i.e. satisfies $$ h = \arg\min_{t \geq 0} f(x - t \nabla_x f). $$ Making use of a quadratic approximation, we show that the algorithm satisfies the above condition. We experimentally show that our method is insensitive to the choice of the initial learning rate while achieving results comparable to other methods.
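The condition above suggests a direct reconstruction (this is my reading of the abstract, not the authors' published pseudocode): fit a parabola to $\phi(t) = f(x - t\nabla_x f)$ from $f(x)$, a trial value $f(x - h\nabla_x f)$, and the directional derivative $\phi'(0) = -\|\nabla_x f\|^2$, then take the parabola's minimizer as the next step size. A NumPy sketch:

```python
import numpy as np

def lossgrad_step(f, grad, x, h):
    """One LOSSGRAD-style update (illustrative reconstruction, with a guard
    for non-positive curvature that is my assumption, not the paper's)."""
    g = grad(x)
    gnorm2 = np.dot(g, g)              # equals -phi'(0)
    f0, fh = f(x), f(x - h * g)
    a = (fh - f0 + h * gnorm2) / h**2  # curvature of the fitted parabola
    if a > 0:
        h = gnorm2 / (2 * a)           # arg min of the quadratic model
    else:
        h = 2 * h                      # model non-convex along the ray: grow step
    return x - h * g, h

# usage: minimize f(x) = ||x||^2; the quadratic fit recovers the exact
# line-search minimizer in a single step here
x, h = np.array([3.0, -4.0]), 0.01
x, h = lossgrad_step(lambda v: v @ v, lambda v: 2 * v, x, h)
```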