Goto

Collaborating Authors

 vector dimension


Wave-Based Semantic Memory with Resonance-Based Retrieval: A Phase-Aware Alternative to Vector Embedding Stores

Listopad, Aleksandr

arXiv.org Artificial Intelligence

Conventional vector-based memory systems rely on cosine or inner product similarity within real-valued embedding spaces. While computationally efficient, such approaches are inherently phase-insensitive and limited in their ability to capture resonance phenomena crucial for meaning representation. We propose Wave-Based Semantic Memory, a novel framework that models knowledge as wave patterns $ψ(x) = A(x) e^{iϕ(x)}$ and retrieves it through resonance-based interference. This approach preserves both amplitude and phase information, enabling more expressive and robust semantic similarity. We demonstrate that resonance-based retrieval achieves higher discriminative power in cases where vector methods fail, including phase shifts, negations, and compositional queries. Our implementation, ResonanceDB, shows scalability to millions of patterns with millisecond latency, positioning wave-based memory as a viable alternative to vector stores for AGI-oriented reasoning and knowledge representation.


Towards Vector Optimization on Low-Dimensional Vector Symbolic Architecture

Duan, Shijin, Liu, Yejia, Liu, Gaowen, Kompella, Ramana Rao, Ren, Shaolei, Xu, Xiaolin

arXiv.org Artificial Intelligence

Vector Symbolic Architecture (VSA) is emerging in machine learning due to its efficiency, but they are hindered by issues of hyperdimensionality and accuracy. As a promising mitigation, the Low-Dimensional Computing (LDC) method significantly reduces the vector dimension by ~100 times while maintaining accuracy, by employing a gradient-based optimization. Despite its potential, LDC optimization for VSA is still underexplored. Our investigation into vector updates underscores the importance of stable, adaptive dynamics in LDC training. We also reveal the overlooked yet critical roles of batch normalization (BN) and knowledge distillation (KD) in standard approaches. Besides the accuracy boost, BN does not add computational overhead during inference, and KD significantly enhances inference confidence. Through extensive experiments and ablation studies across multiple benchmarks, we provide a thorough evaluation of our approach and extend the interpretability of binary neural network optimization similar to LDC, previously unaddressed in BNN literature.


A First Look At Efficient And Secure On-Device LLM Inference Against KV Leakage

Yang, Huan, Zhang, Deyu, Zhao, Yudong, Li, Yuanchun, Liu, Yunxin

arXiv.org Artificial Intelligence

Running LLMs on end devices has garnered significant attention recently due to their advantages in privacy preservation. With the advent of lightweight LLM models and specially designed GPUs, on-device LLM inference has achieved the necessary accuracy and performance metrics. However, we have identified that LLM inference on GPUs can leak privacy-sensitive intermediate information, specifically the KV pairs. An attacker could exploit these KV pairs to reconstruct the entire user conversation, leading to significant vulnerabilities. Existing solutions, such as Fully Homomorphic Encryption (FHE) and Trusted Execution Environments (TEE), are either too computation-intensive or resource-limited. To address these issues, we designed KV-Shield, which operates in two phases. In the initialization phase, it permutes the weight matrices so that all KV pairs are correspondingly permuted. During the runtime phase, the attention vector is inversely permuted to ensure the correctness of the layer output. All permutation-related operations are executed within the TEE, ensuring that insecure GPUs cannot access the original KV pairs, thus preventing conversation reconstruction. Finally, we theoretically analyze the correctness of KV-Shield, along with its advantages and overhead.


ANALOGICAL -- A Novel Benchmark for Long Text Analogy Evaluation in Large Language Models

Wijesiriwardene, Thilini, Wickramarachchi, Ruwan, Gajera, Bimal G., Gowaikar, Shreeyash Mukul, Gupta, Chandan, Chadha, Aman, Reganti, Aishwarya Naresh, Sheth, Amit, Das, Amitava

arXiv.org Artificial Intelligence

Over the past decade, analogies, in the form of word-level analogies, have played a significant role as an intrinsic measure of evaluating the quality of word embedding methods such as word2vec. Modern large language models (LLMs), however, are primarily evaluated on extrinsic measures based on benchmarks such as GLUE and SuperGLUE, and there are only a few investigations on whether LLMs can draw analogies between long texts. In this paper, we present ANALOGICAL, a new benchmark to intrinsically evaluate LLMs across a taxonomy of analogies of long text with six levels of complexity -- (i) word, (ii) word vs. sentence, (iii) syntactic, (iv) negation, (v) entailment, and (vi) metaphor. Using thirteen datasets and three different distance measures, we evaluate the abilities of eight LLMs in identifying analogical pairs in the semantic vector space. Our evaluation finds that it is increasingly challenging for LLMs to identify analogies when going up the analogy taxonomy.


Alleviating Over-smoothing for Unsupervised Sentence Representation

Chen, Nuo, Shou, Linjun, Gong, Ming, Pei, Jian, Cao, Bowen, Chang, Jianhui, Jiang, Daxin, Li, Jia

arXiv.org Artificial Intelligence

Currently, learning better unsupervised sentence representations is the pursuit of many natural language processing communities. Lots of approaches based on pre-trained language models (PLMs) and contrastive learning have achieved promising results on this task. Experimentally, we observe that the over-smoothing problem reduces the capacity of these powerful PLMs, leading to sub-optimal sentence representations. In this paper, we present a Simple method named Self-Contrastive Learning (SSCL) to alleviate this issue, which samples negatives from PLMs intermediate layers, improving the quality of the sentence representation. Our proposed method is quite simple and can be easily extended to various state-of-the-art models for performance boosting, which can be seen as a plug-and-play contrastive framework for learning unsupervised sentence representation. Extensive results prove that SSCL brings the superior performance improvements of different strong baselines (e.g., BERT and SimCSE) on Semantic Textual Similarity and Transfer datasets. Our codes are available at https://github.com/nuochenpku/SSCL.


The Sensitivity of Word Embeddings-based Author Detection Models to Semantic-preserving Adversarial Perturbations

Duncan, Jeremiah, Fallas, Fabian, Gropp, Chris, Herron, Emily, Mahbub, Maria, Olaya, Paula, Ponce, Eduardo, Samuel, Tabitha K., Schultz, Daniel, Srinivasan, Sudarshan, Tang, Maofeng, Zenkov, Viktor, Zhou, Quan, Begoli, Edmon

arXiv.org Artificial Intelligence

Authorship analysis is an important subject in the field of natural language processing. It allows the detection of the most likely writer of articles, news, books, or messages. This technique has multiple uses in tasks related to authorship attribution, detection of plagiarism, style analysis, sources of misinformation, etc. The focus of this paper is to explore the limitations and sensitiveness of established approaches to adversarial manipulations of inputs. To this end, and using those established techniques, we first developed an experimental frame-work for author detection and input perturbations. Next, we experimentally evaluated the performance of the authorship detection model to a collection of semantic-preserving adversarial perturbations of input narratives. Finally, we compare and analyze the effects of different perturbation strategies, input and model configurations, and the effects of these on the author detection model.


Deep Collective Learning: Learning Optimal Inputs and Weights Jointly in Deep Neural Networks

Deng, Xiang, Zhongfei, null, Zhang, null

arXiv.org Machine Learning

It is well observed that in deep learning and computer vision literature, visual data are always represented in a manually designed coding scheme (eg., RGB images are represented as integers ranging from 0 to 255 for each channel) when they are input to an end-to-end deep neural network (DNN) for any learning task. We boldly question whether the manually designed inputs are good for DNN training for different tasks and study whether the input to a DNN can be optimally learned end-to-end together with learning the weights of the DNN. In this paper, we propose the paradigm of {\em deep collective learning} which aims to learn the weights of DNNs and the inputs to DNNs simultaneously for given tasks. We note that collective learning has been implicitly but widely used in natural language processing while it has almost never been studied in computer vision. Consequently, we propose the lookup vision networks (Lookup-VNets) as a solution to deep collective learning in computer vision. This is achieved by associating each color in each channel with a vector in lookup tables. As learning inputs in computer vision has almost never been studied in the existing literature, we explore several aspects of this question through varieties of experiments on image classification tasks. Experimental results on four benchmark datasets, i.e., CIFAR-10, CIFAR-100, Tiny ImageNet, and ImageNet (ILSVRC2012) have shown several surprising characteristics of Lookup-VNets and have demonstrated the advantages and promise of Lookup-VNets and deep collective learning.


SENSE: Semantically Enhanced Node Sequence Embedding

Rallapalli, Swati, Ma, Liang, Srivatsa, Mudhakar, Swami, Ananthram, Kwon, Heesung, Bent, Graham, Simpkin, Christopher

arXiv.org Machine Learning

Effectively capturing graph node sequences in the form of vector embeddings is critical to many applications. We achieve this by (i) first learning vector embeddings of single graph nodes and (ii) then composing them to compactly represent node sequences. Specifically, we propose SENSE-S (Semantically Enhanced Node Sequence Embedding - for Single nodes), a skip-gram based novel embedding mechanism, for single graph nodes that co-learns graph structure as well as their textual descriptions. We demonstrate that SENSE-S vectors increase the accuracy of multi-label classification tasks by up to 50% and link-prediction tasks by up to 78% under a variety of scenarios using real datasets. Based on SENSE-S, we next propose generic SENSE to compute composite vectors that represent a sequence of nodes, where preserving the node order is important. We prove that this approach is efficient in embedding node sequences, and our experiments on real data confirm its high accuracy in node order decoding.


Representing Sets as Summed Semantic Vectors

Summers-Stay, Douglas, Sutor, Peter, Li, Dandan

arXiv.org Artificial Intelligence

Representing meaning in the form of high dimensional vectors is a common and powerful tool in biologically inspired architectures. While the meaning of a set of concepts can be summarized by taking a (possibly weighted) sum of their associated vectors, this has generally been treated as a one-way operation. In this paper we show how a technique built to aid sparse vector decomposition allows in many cases the exact recovery of the inputs and weights to such a sum, allowing a single vector to represent an entire set of vectors from a dictionary. We characterize the number of vectors that can be recovered under various conditions, and explore several ways such a tool can be used for vector-based reasoning.