Goto

Collaborating Authors

 siamese architecture


An Efficient deep learning model to Predict Stock Price Movement Based on Limit Order Book

arXiv.org Artificial Intelligence

In high-frequency trading (HFT), leveraging limit order books (LOB) to model stock price movements is crucial for achieving profitable outcomes. However, this task is challenging due to the high-dimensional and volatile nature of the original data. Even recent deep learning models often struggle to capture price movement patterns effectively, particularly without well-designed features. We observed that raw LOB data exhibits inherent symmetry between the ask and bid sides, and the bid-ask differences demonstrate greater stability and lower complexity compared to the original data. Building on this insight, we propose a novel approach in which leverages the Siamese architecture to enhance the performance of existing deep learning models. The core idea involves processing the ask and bid sides separately using the same module with shared parameters. We applied our Siamese-based methods to several widely used strong baselines and validated their effectiveness using data from 14 military industry stocks in the Chinese A-share market. Furthermore, we integrated multi-head attention (MHA) mechanisms with the Long Short-Term Memory (LSTM) module to investigate its role in modeling stock price movements. Our experiments used raw data and widely used Order Flow Imbalance (OFI) features as input with some strong baseline models. The results show that our method improves the performance of strong baselines in over 75$% of cases, excluding the Multi-Layer Perception (MLP) baseline, which performed poorly and is not considered practical. Furthermore, we found that Multi-Head Attention can enhance model performance, particularly over shorter forecasting horizons.


Reviews: PRUNE: Preserving Proximity and Global Ranking for Network Embedding

Neural Information Processing Systems

The paper presents a NN model for learning graph embeddings that preserves the local graph structure and a global node ranking similar to PageRank. The model is based on a Siamese network, which takes as inputs two node embeddings and compute a new (output) representation for each node using the Siamese architecture. Learning is unsupervised in the sense that it makes use only of the graph structure. Some links with a community detection criterion are also discussed. The model is evaluated on a series of tasks: node ranking, classification and regression, link prediction, and compared to other families of unsupervised embedding learning methods.


TDANet: Target-Directed Attention Network For Object-Goal Visual Navigation With Zero-Shot Ability

arXiv.org Artificial Intelligence

The generalization of the end-to-end deep reinforcement learning (DRL) for object-goal visual navigation is a long-standing challenge since object classes and placements vary in new test environments. Learning domain-independent visual representation is critical for enabling the trained DRL agent with the ability to generalize to unseen scenes and objects. In this letter, a target-directed attention network (TDANet) is proposed to learn the end-to-end object-goal visual navigation policy with zero-shot ability. TDANet features a novel target attention (TA) module that learns both the spatial and semantic relationships among objects to help TDANet focus on the most relevant observed objects to the target. With the Siamese architecture (SA) design, TDANet distinguishes the difference between the current and target states and generates the domain-independent visual representation. To evaluate the navigation performance of TDANet, extensive experiments are conducted in the AI2-THOR embodied AI environment. The simulation results demonstrate a strong generalization ability of TDANet to unseen scenes and target objects, with higher navigation success rate (SR) and success weighted by length (SPL) than other state-of-the-art models.


LACoS-BLOOM: Low-rank Adaptation with Contrastive objective on 8 bits Siamese-BLOOM

arXiv.org Artificial Intelligence

Text embeddings are useful features for several NLP applications, such as sentence similarity, text clustering, and semantic search. In this paper, we present a Low-rank Adaptation with a Contrastive objective on top of 8-bit Siamese-BLOOM, a multilingual large language model optimized to produce semantically meaningful word embeddings. The innovation is threefold. First, we cast BLOOM weights to 8-bit values. Second, we fine-tune BLOOM with a scalable adapter (LoRA) and 8-bit Adam optimizer for sentence similarity classification. Third, we apply a Siamese architecture on BLOOM model with a contrastive objective to ease the multi-lingual labeled data scarcity. The experiment results show the quality of learned embeddings from LACoS-BLOOM is proportional to the number of model parameters and the amount of unlabeled training data. With the parameter efficient fine-tuning design, we are able to run BLOOM 7.1 billion parameters end-to-end on a single GPU machine with 32GB memory. Compared to previous solution Sentence-BERT, we achieve significant improvement on both English and multi-lingual STS tasks.


Zero-shot Medical Entity Retrieval without Annotation: Learning From Rich Knowledge Graph Semantics

arXiv.org Artificial Intelligence

Medical entity retrieval is an integral component for understanding and communicating information across various health systems. Current approaches tend to work well on specific medical domains but generalize poorly to unseen sub-specialties. This is of increasing concern under a public health crisis as new medical conditions and drug treatments come to light frequently. Zero-shot retrieval is challenging due to the high degree of ambiguity and variability in medical corpora, making it difficult to build an accurate similarity measure between mentions and concepts. Medical knowledge graphs (KG), however, contain rich semantics including large numbers of synonyms as well as its curated graphical structures. To take advantage of this valuable information, we propose a suite of learning tasks designed for training efficient zero-shot entity retrieval models. Without requiring any human annotation, our knowledge graph enriched architecture significantly outperforms common zero-shot benchmarks including BM25 and Clinical BERT with 7% to 30% higher recall across multiple major medical ontologies, such as UMLS, SNOMED, and ICD-10.


Predicting Drug-Drug Interactions from Molecular Structure Images

arXiv.org Machine Learning

Adverse drug events (ADEs) are "injuries resulting from medical intervention related to a drug" (Nebeker, Barach, and Samore 2004), and are distinct from medication errors (inappropriate prescription, dispensing, usage etc.) as they are caused by drugs at normal dosages. According to the National Center for Health Statistics (NCHS 2014), 48.9% of Americans took at least one prescription drug in the last 30 days, 23.1% took at least three, and 11.9% took at least


NeuralWarp: Time-Series Similarity with Warping Networks

arXiv.org Machine Learning

Research on time-series similarity measures has emphasized the need for elastic methods which align the indices of pairs of time series and a plethora of non-parametric have been proposed for the task. On the other hand, deep learning approaches are dominant in closely related domains, such as learning image and text sentence similarity. In this paper, we propose \textit{NeuralWarp}, a novel measure that models the alignment of time-series indices in a deep representation space, by modeling a warping function as an upper level neural network between deeply-encoded time series values. Experimental results demonstrate that \textit{NeuralWarp} outperforms both non-parametric and un-warped deep models on a range of diverse real-life datasets.


REGMAPR - Text Matching Made Easy

arXiv.org Artificial Intelligence

We propose a simple model for textual matching problems. Starting from a Siamese architecture, we augment word embeddings with two features based on exact and paraphrase match between words in the two sentences being considered. We train the model using four types of regularization on datasets for textual entailment, paraphrase detection and semantic relatedness. Our model performs comparably or better than more complex architectures; achieving state-of-the-art results for paraphrase detection on the SICK dataset and for textual entailment on the SNLI dataset.


Local Feature Descriptor Learning with Adaptive Siamese Network

arXiv.org Machine Learning

Although the recent progress in the deep neural network has led to the development of learnable local feature descriptors, there is no explicit answer for estimation of the necessary size of a neural network. Specifically, the local feature is represented in a low dimensional space, so the neural network should have more compact structure. The small networks required for local feature descriptor learning may be sensitive to initial conditions and learning parameters and more likely to become trapped in local minima. In order to address the above problem, we introduce an adaptive pruning Siamese Architecture based on neuron activation to learn local feature descriptors, making the network more computationally efficient with an improved recognition rate over more complex networks. Our experiments demonstrate that our learned local feature descriptors outperform the state-of-art methods in patch matching.