Steering Large Language Models using Conceptors: Improving Addition-Based Activation Engineering

Postmus, Joris, Abreu, Steven

arXiv.org Artificial Intelligence

Large language models have transformed AI, yet reliably controlling their outputs remains a challenge. This paper explores activation engineering, where outputs of pre-trained LLMs are controlled by manipulating their activations at inference time. Unlike traditional methods using a single steering vector, we introduce conceptors - mathematical constructs that represent sets of activation vectors as ellipsoidal regions. Conceptors act as soft projection matrices and offer more precise control over complex activation patterns. Our experiments demonstrate that conceptors outperform traditional methods across multiple steering tasks. We further use Boolean operations on conceptors for combined steering goals that empirically outperform additively combining steering vectors on a set of tasks. These results highlight conceptors as a promising tool for more effective steering of LLMs. Our code is available on github.com/jorispos/conceptorsteering.
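The conceptor construction at the heart of this approach can be sketched in a few lines. Below is a minimal sketch, assuming cached activation vectors are available as rows of a matrix; the function name `compute_conceptor`, the aperture value, and the toy data are illustrative assumptions, not the paper's exact implementation:

```python
import numpy as np

def compute_conceptor(X, aperture=10.0):
    """Compute a conceptor matrix from a set of activation samples.

    X: (n_samples, d) matrix of cached activation vectors.
    Following Jaeger (2014), C = R (R + aperture^-2 I)^-1, where R is
    the correlation matrix of the activations. C acts as a soft
    projection: eigenvalues lie in (0, 1), damping each direction in
    proportion to how weakly it is represented in the samples.
    """
    R = X.T @ X / X.shape[0]                      # (d, d) correlation matrix
    d = R.shape[0]
    return R @ np.linalg.inv(R + aperture**-2 * np.eye(d))

# Steering: instead of adding a single vector, softly project the
# current hidden state onto the ellipsoidal region captured by C.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))        # stand-in for cached LLM activations
C = compute_conceptor(X)
h = rng.normal(size=16)               # a hidden state at inference time
h_steered = C @ h                     # soft projection toward the concept region
```

Because the eigenvalues of C never reach 1, the projection is "soft": directions strongly present in the cached activations pass nearly unchanged, while weak directions are attenuated rather than zeroed out.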


Adaptive control of recurrent neural networks using conceptors

Pourcel, Guillaume, Goldmann, Mirko, Fischer, Ingo, Soriano, Miguel C.

arXiv.org Artificial Intelligence

Recurrent Neural Networks excel at predicting and generating complex high-dimensional temporal patterns. Due to their inherent nonlinear dynamics and memory, they can learn unbounded temporal dependencies from data. In a Machine Learning setting, the network's parameters are adapted during a training phase to match the requirements of a given task/problem, increasing its computational capabilities. After training, the network parameters are kept fixed to exploit the learned computations. The static parameters thereby render the network unadaptive to changing conditions, such as external or internal perturbations. In this manuscript, we demonstrate how keeping parts of the network adaptive even after training enhances its functionality and robustness. Here, we utilize the conceptor framework and conceptualize an adaptive control loop that continuously analyzes the network's behavior and adjusts its time-varying internal representation to follow a desired target. We demonstrate how the added adaptivity of the network supports the computational functionality in three distinct tasks: interpolation of temporal patterns, stabilization against partial network degradation, and robustness against input distortion. Our results highlight the potential of adaptive networks in machine learning beyond training, enabling them to not only learn complex patterns but also dynamically adjust to changing environments, ultimately broadening their applicability.


Conceptor Learning for Class Activation Mapping

Qian, Guangwu, Yang, Zhen-Qun, Zhang, Xu-Lu, Wang, Yaowei, Li, Qing, Wei, Xiao-Yong

arXiv.org Artificial Intelligence

Class Activation Mapping (CAM) has been widely adopted to generate saliency maps that provide visual explanations for deep neural networks (DNNs). The saliency maps are conventionally generated by fusing the channels of the target feature map using a weighted average scheme. This is a weak model of the inter-channel relation, in the sense that it only models the relation among channels in a contrastive way (i.e., channels that play key roles in the prediction are given higher weights so that they stand out in the fusion). The collaborative relation, which makes the channels work together to provide cross reference, has been ignored, and the intra-channel relation has been neglected entirely. In this paper, we address this problem by introducing Conceptor learning into CAM generation. Conceptor learning was originally proposed to model the patterns of state changes in recurrent neural networks (RNNs). By relaxing the dependency of Conceptor learning on RNNs, we make Conceptor-CAM not only generalizable to more DNN architectures but also able to learn both the inter- and intra-channel relations for better saliency map generation. Moreover, we have enabled the use of Boolean operations to combine the positive and pseudo-negative evidence, which makes the CAM inference more robust and comprehensive. The effectiveness of Conceptor-CAM has been validated with both formal verifications and experiments on the largest-scale dataset in the literature. The experimental results show that Conceptor-CAM is compatible with, and brings significant improvement to, all well-recognized CAM-based methods, and outperforms the state-of-the-art by 43.14%~72.79% (88.39%~168.15%) on ILSVRC2012, 15.42%~42.55% (47.09%~372.09%) on VOC, and 17.43%~31.32% (47.54%~206.45%) on COCO in Average Increase (Drop), respectively.


"Thy algorithm shalt not bear false witness": An Evaluation of Multiclass Debiasing Methods on Word Embeddings

Schlender, Thalea, Spanakis, Gerasimos

arXiv.org Artificial Intelligence

With the vast development and deployment of artificial intelligence applications, research into the fairness of these algorithms has increased. In the natural language processing domain specifically, it has been shown that social biases persist in word embeddings, which are thus in danger of amplifying these biases when used. As an example of social bias, religious biases are shown to persist in word embeddings, and the need for their removal is highlighted. This paper investigates the state-of-the-art multiclass debiasing techniques: Hard debiasing, SoftWEAT debiasing and Conceptor debiasing. It evaluates their performance when removing religious bias on a common basis by quantifying bias removal via the Word Embedding Association Test (WEAT), Mean Average Cosine Similarity (MAC) and the Relative Negative Sentiment Bias (RNSB). By investigating religious bias removal on three widely used word embeddings, namely Word2Vec, GloVe, and ConceptNet, it is shown that the preferred method is Conceptor debiasing. Specifically, this technique decreases the measured religious bias on average by 82.42%, 96.78% and 54.76% for the three word embedding sets, respectively.


Transfer between long-term and short-term memory using Conceptors

Strock, Anthony, Rougier, Nicolas, Hinaut, Xavier

arXiv.org Machine Learning

The reservoir computing (RC) paradigm [9] is a peculiar and economical way to train a recurrent neural network (RNN), because only the output layer is modified while the input and recurrent layers are kept fixed. Such RNNs are called reservoirs because they provide a pool of nonlinear computations based on inputs. Many variants (such as Echo State Networks [8] and Liquid State Machines [15]), along with specific extensions of the RC paradigm, have been proposed since its introduction by [8] (for a review see [14]), including implementations in various hardware such as DNA- or laser-based ones (see [25] for a recent review on physical reservoirs). A recent and major enhancement of the RC paradigm, called Conceptors, has been proposed by Jaeger [10] (see Figure 1, which introduces the main concepts). Intuitively, a conceptor represents a subspace of internal states of an RNN, e.g. the trajectory of a reservoir when fed by some input.
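Jaeger's Boolean operations on conceptors, which several of the papers above rely on, have closed-form matrix expressions. The following is a minimal sketch, assuming full-rank correlation matrices so that plain matrix inverses exist; the helper names `NOT`, `AND`, `OR` and the toy data are illustrative assumptions:

```python
import numpy as np

def conceptor(X, aperture=10.0):
    """C = R (R + aperture^-2 I)^-1 for the correlation matrix R of X."""
    R = X.T @ X / X.shape[0]
    return R @ np.linalg.inv(R + aperture**-2 * np.eye(R.shape[0]))

def NOT(C):
    """Negation: the complement of the captured subspace."""
    return np.eye(C.shape[0]) - C

def AND(C, B):
    """Conjunction (assumes C and B are invertible)."""
    I = np.eye(C.shape[0])
    return np.linalg.inv(np.linalg.inv(C) + np.linalg.inv(B) - I)

def OR(C, B):
    """Disjunction, defined via de Morgan's law."""
    return NOT(AND(NOT(C), NOT(B)))

rng = np.random.default_rng(1)
C = conceptor(rng.normal(size=(100, 8)))
B = conceptor(rng.normal(size=(100, 8)))
D = AND(C, B)   # again symmetric, with eigenvalues in (0, 1)
E = OR(C, B)
```

A useful sanity check is that double negation recovers the original conceptor, and that conjunction and disjunction each yield valid conceptors (symmetric matrices with eigenvalues strictly between 0 and 1).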


Recognizing Human Internal States: A Conceptor-Based Approach

Bartlett, Madeleine, Garcia, Daniel Hernandez, Thill, Serge, Belpaeme, Tony

arXiv.org Machine Learning

The past few decades have seen increased interest in the application of social robots as behavioural coaches in interventions for Autism Spectrum Disorder (ASD) [4]. We consider that robots embedded in therapies could also provide quantitative diagnostic information by observing patient behaviours. The social nature of ASD symptoms means that, to achieve this, robots need to be able to recognize the internal states their human interaction partners are experiencing. In this paper we discuss these questions in depth and propose a novel, conceptor-based classifier. We report the initial results of this system in a proof-of-concept study and outline plans for future work. The development of socially interactive robots has inspired research into various applications for these tools.


Continual Learning for Sentence Representations Using Conceptors

Liu, Tianlin, Ungar, Lyle, Sedoc, João

arXiv.org Machine Learning

Distributed representations of sentences have become ubiquitous in natural language processing tasks. In this paper, we consider a continual learning scenario for sentence representations: Given a sequence of corpora, we aim to optimize the sentence encoder with respect to the new corpus while maintaining its accuracy on the old corpora. To address this problem, we propose to initialize sentence encoders with the help of corpus-independent features, and then sequentially update sentence encoders using Boolean operations of conceptor matrices to learn corpus-dependent features. We evaluate our approach on semantic textual similarity tasks and show that our proposed sentence encoder can continually learn features from new corpora while retaining its competence on previously encountered corpora.
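The sequential-update idea can be illustrated with a small sketch: keep a running "memory" conceptor of everything seen so far, fold each new corpus in with Boolean disjunction, and read off the still-unused directions with negation. The loop structure, helper names, and toy corpora below are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

def conceptor(X, aperture=10.0):
    """C = R (R + aperture^-2 I)^-1 for the correlation matrix R of X."""
    R = X.T @ X / X.shape[0]
    return R @ np.linalg.inv(R + aperture**-2 * np.eye(R.shape[0]))

def OR(C, B):
    """C v B = NOT(AND(NOT C, NOT B)), Jaeger's Boolean disjunction."""
    I = np.eye(C.shape[0])
    notC, notB = I - C, I - B
    return I - np.linalg.inv(np.linalg.inv(notC) + np.linalg.inv(notB) - I)

rng = np.random.default_rng(2)
d = 12
A = np.zeros((d, d))                  # conceptor of corpora seen so far
for _ in range(3):                    # a stream of three toy "corpora"
    X = rng.normal(size=(80, d))
    A = OR(A, conceptor(X))           # fold the new corpus into the memory
free = np.eye(d) - A                  # NOT A: directions not yet claimed,
                                      # available for new corpus-dependent features
```

Updating through `free` rather than the full space is what lets new features be learned without overwriting the directions that earlier corpora already occupy.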


Unsupervised Post-processing of Word Vectors via Conceptor Negation

Liu, Tianlin, Ungar, Lyle, Sedoc, João

arXiv.org Machine Learning

Word vectors are at the core of many natural language processing tasks. Recently, there has been interest in post-processing word vectors to enrich their semantic information. In this paper, we introduce a novel word vector post-processing technique based on matrix conceptors (Jaeger, 2014), a family of regularized identity maps. More concretely, we propose to use conceptors to suppress those latent features of word vectors having high variances. The proposed method is purely unsupervised: it does not rely on any corpus or external linguistic database. We evaluate the post-processed word vectors on a battery of intrinsic lexical evaluation tasks, showing that the proposed method consistently outperforms existing state-of-the-art alternatives. We also show that post-processed word vectors can be used for the downstream natural language processing task of dialogue state tracking, yielding improved results in different dialogue domains.
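The negation step can be sketched concretely: build a conceptor from the whole embedding matrix, then apply its negation (I - C) to every word vector, which damps high-variance directions most strongly. The aperture value and the synthetic "vocabulary" below are illustrative assumptions, not the paper's tuned settings:

```python
import numpy as np

def conceptor(X, aperture=2.0):
    """C = R (R + aperture^-2 I)^-1, a regularized identity map on X's space."""
    R = X.T @ X / X.shape[0]
    return R @ np.linalg.inv(R + aperture**-2 * np.eye(R.shape[0]))

rng = np.random.default_rng(3)
d, vocab = 10, 500
W = rng.normal(size=(vocab, d))       # rows = word vectors
W[:, 0] *= 8.0                        # one dominant, high-variance direction
                                      # (stands in for corpus-wide common components)
C = conceptor(W)
W_post = W @ (np.eye(d) - C).T        # negation: x' = (I - C) x for every word

# A direction with correlation eigenvalue s is shrunk by the factor
# aperture^-2 / (s + aperture^-2), which tends to 0 as s grows: the
# highest-variance latent features are suppressed the hardest.
var_before = W[:, 0].var()
var_after = W_post[:, 0].var()
```

Low-variance directions pass through almost unchanged, which is why the method can remove dominant "common" components without an external corpus or lexical resource.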


Correcting the Common Discourse Bias in Linear Representation of Sentences using Conceptors

Liu, Tianlin, Sedoc, João, Ungar, Lyle

arXiv.org Machine Learning

Distributed representations of words, better known as word embeddings, have become important building blocks for natural language processing tasks. Numerous studies are devoted to transferring the success of unsupervised word embeddings to sentence embeddings. In this paper, we introduce a simple representation of sentences in which a sentence embedding is represented as a weighted average of word vectors followed by a soft projection. We demonstrate the effectiveness of this proposed method on the clinical semantic textual similarity task of the BioCreative/OHNLP Challenge 2018.