AITopics | siamese architecture




An Efficient deep learning model to Predict Stock Price Movement Based on Limit Order Book

Yang, Jiahao, Fang, Ran, Zhang, Ming, Zhou, Jun

arXiv.org Artificial Intelligence, May-30-2025

In high-frequency trading (HFT), leveraging limit order books (LOB) to model stock price movements is crucial for achieving profitable outcomes. However, this task is challenging due to the high-dimensional and volatile nature of the original data. Even recent deep learning models often struggle to capture price movement patterns effectively, particularly without well-designed features. We observed that raw LOB data exhibits inherent symmetry between the ask and bid sides, and the bid-ask differences demonstrate greater stability and lower complexity compared to the original data. Building on this insight, we propose a novel approach that leverages the Siamese architecture to enhance the performance of existing deep learning models. The core idea involves processing the ask and bid sides separately using the same module with shared parameters. We applied our Siamese-based methods to several widely used strong baselines and validated their effectiveness using data from 14 military industry stocks in the Chinese A-share market. Furthermore, we integrated multi-head attention (MHA) mechanisms with the Long Short-Term Memory (LSTM) module to investigate their role in modeling stock price movements. Our experiments used raw data and widely used Order Flow Imbalance (OFI) features as input with some strong baseline models. The results show that our method improves the performance of strong baselines in over 75% of cases, excluding the Multi-Layer Perceptron (MLP) baseline, which performed poorly and is not considered practical. Furthermore, we found that Multi-Head Attention can enhance model performance, particularly over shorter forecasting horizons.
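The shared-parameter idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' model: the single tanh layer, the function names, the input layout, and the way the two encodings are combined (concatenation plus their difference) are all assumptions made for illustration.

```python
import math
import random

def make_encoder(in_dim, out_dim, seed=0):
    """Create one weight matrix (out_dim x in_dim) for a tiny tanh layer."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(out_dim)]

def encode(weights, x):
    """Apply the layer: tanh(W x)."""
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]

def siamese_lob_features(weights, ask_side, bid_side):
    """Encode both sides with the SAME weights, then combine the encodings.

    The combination (ask, bid, ask - bid) is an assumption, not taken
    from the paper.
    """
    h_ask = encode(weights, ask_side)
    h_bid = encode(weights, bid_side)
    diff = [a - b for a, b in zip(h_ask, h_bid)]
    return h_ask + h_bid + diff

# Hypothetical normalized snapshot: two levels per side, each level given as
# (distance from mid-price, scaled volume).
ask = [0.01, 0.30, 0.02, 0.15]
bid = [0.01, 0.28, 0.02, 0.42]

shared = make_encoder(in_dim=4, out_dim=3)
features = siamese_lob_features(shared, ask, bid)
print(len(features))  # 3 ask values + 3 bid values + 3 differences
```

Because one set of weights serves both sides, feeding the same vector to both branches yields a zero difference, and swapping the sides only flips the sign of the difference part.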

artificial intelligence, lob data, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2505.22678

Country:

Europe (0.46)
Asia (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Banking & Finance > Trading (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Reviews: PRUNE: Preserving Proximity and Global Ranking for Network Embedding

Neural Information Processing Systems, Oct-8-2024, 09:27:12 GMT

The paper presents a NN model for learning graph embeddings that preserves the local graph structure and a global node ranking similar to PageRank. The model is based on a Siamese network, which takes two node embeddings as inputs and computes a new (output) representation for each node using the Siamese architecture. Learning is unsupervised in the sense that it makes use only of the graph structure. Some links with a community detection criterion are also discussed. The model is evaluated on a series of tasks (node ranking, classification and regression, and link prediction) and compared to other families of unsupervised embedding learning methods.

network embedding, preserving proximity and global ranking, representation, (8 more...)

Neural Information Processing Systems

Technology:

Information Technology > Data Science > Data Mining (0.61)
Information Technology > Information Management > Search (0.59)
Information Technology > Artificial Intelligence > Machine Learning (0.59)


TDANet: Target-Directed Attention Network For Object-Goal Visual Navigation With Zero-Shot Ability

Lian, Shiwei, Zhang, Feitian

arXiv.org Artificial Intelligence, Apr-12-2024

The generalization of end-to-end deep reinforcement learning (DRL) for object-goal visual navigation is a long-standing challenge since object classes and placements vary in new test environments. Learning domain-independent visual representation is critical for enabling the trained DRL agent to generalize to unseen scenes and objects. In this letter, a target-directed attention network (TDANet) is proposed to learn the end-to-end object-goal visual navigation policy with zero-shot ability. TDANet features a novel target attention (TA) module that learns both the spatial and semantic relationships among objects to help TDANet focus on the observed objects most relevant to the target. With the Siamese architecture (SA) design, TDANet distinguishes the difference between the current and target states and generates the domain-independent visual representation. To evaluate the navigation performance of TDANet, extensive experiments are conducted in the AI2-THOR embodied AI environment. The simulation results demonstrate a strong generalization ability of TDANet to unseen scenes and target objects, with a higher navigation success rate (SR) and success weighted by path length (SPL) than other state-of-the-art models.
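The two metrics named at the end of the abstract above can be sketched as follows. This assumes the commonly used definition of SPL, in which each successful episode contributes the ratio of the shortest path length to the length actually travelled; the abstract does not say which variant the authors use, and the episode data here is invented for illustration.

```python
def success_rate(episodes):
    """Fraction of episodes in which the agent reached the target."""
    return sum(1 for e in episodes if e["success"]) / len(episodes)

def spl(episodes):
    """Success weighted by path length, averaged over all episodes.

    Each successful episode contributes shortest / max(taken, shortest);
    failed episodes contribute 0. Path lengths are assumed to be positive.
    """
    total = 0.0
    for e in episodes:
        if e["success"]:
            total += e["shortest"] / max(e["taken"], e["shortest"])
    return total / len(episodes)

# Hypothetical episodes (lengths in arbitrary units).
episodes = [
    {"success": True, "shortest": 4.0, "taken": 8.0},
    {"success": True, "shortest": 5.0, "taken": 5.0},
    {"success": False, "shortest": 3.0, "taken": 10.0},
    {"success": True, "shortest": 2.0, "taken": 4.0},
]

print(success_rate(episodes))  # 0.75
print(spl(episodes))           # 0.5
```

SPL is never larger than SR under this definition, since it penalizes successful episodes that take a longer route than necessary.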

navigation, tdanet, visual navigation, (15 more...)

arXiv.org Artificial Intelligence

2404.08353

Country:

Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


LACoS-BLOOM: Low-rank Adaptation with Contrastive objective on 8 bits Siamese-BLOOM

Hua, Wen-Yu, Williams, Brian, Shamsi, Davood

arXiv.org Artificial Intelligence, May-10-2023

Text embeddings are useful features for several NLP applications, such as sentence similarity, text clustering, and semantic search. In this paper, we present a Low-rank Adaptation with a Contrastive objective on top of 8-bit Siamese-BLOOM, a multilingual large language model optimized to produce semantically meaningful word embeddings. The innovation is threefold. First, we cast BLOOM weights to 8-bit values. Second, we fine-tune BLOOM with a scalable adapter (LoRA) and 8-bit Adam optimizer for sentence similarity classification. Third, we apply a Siamese architecture on the BLOOM model with a contrastive objective to ease the scarcity of multilingual labeled data. The experimental results show that the quality of learned embeddings from LACoS-BLOOM is proportional to the number of model parameters and the amount of unlabeled training data. With the parameter-efficient fine-tuning design, we are able to run BLOOM with 7.1 billion parameters end-to-end on a single GPU machine with 32GB memory. Compared to the previous solution, Sentence-BERT, we achieve significant improvement on both English and multilingual STS tasks.
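The low-rank adapter (LoRA) mentioned in the abstract above can be sketched for a single linear layer. This is a minimal illustration of the general technique, y = W x + (alpha / r) B (A x) with W frozen, not the authors' configuration: the matrices, the rank, and the scaling value below are made up, and the 8-bit quantization and contrastive objective are not shown.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def lora_forward(W, A, B, x, alpha=1.0):
    """Frozen base projection plus a scaled low-rank update.

    W: d_out x d_in (frozen), A: r x d_in, B: d_out x r (A and B trainable).
    """
    r = len(A)
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]

# Hypothetical tiny layer: d_in = 3, d_out = 2, rank r = 1.
W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, -1.0]]
A = [[1.0, 1.0, 1.0]]
B_init = [[0.0], [0.0]]      # zero-initialized B leaves the base output unchanged
B_trained = [[0.5], [-0.5]]  # stand-in for values learned during fine-tuning
x = [1.0, 2.0, 3.0]

print(lora_forward(W, A, B_init, x))     # [7.0, -1.0]
print(lora_forward(W, A, B_trained, x))  # [10.0, -4.0]
```

Only A and B would be updated during fine-tuning, which is what keeps the number of trainable parameters small relative to the frozen base weights in a large model.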

arxiv preprint arxiv, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2305.06404

Country:

North America > United States (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
Europe > Iceland > Capital Region > Reykjavik (0.04)
Asia > Taiwan > Taiwan Province > Taipei (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)


Zero-shot Medical Entity Retrieval without Annotation: Learning From Rich Knowledge Graph Semantics

Kong, Luyang, Winestock, Christopher, Bhatia, Parminder

arXiv.org Artificial Intelligence, May-26-2021

Medical entity retrieval is an integral component for understanding and communicating information across various health systems. Current approaches tend to work well on specific medical domains but generalize poorly to unseen sub-specialties. This is of increasing concern under a public health crisis as new medical conditions and drug treatments come to light frequently. Zero-shot retrieval is challenging due to the high degree of ambiguity and variability in medical corpora, making it difficult to build an accurate similarity measure between mentions and concepts. Medical knowledge graphs (KG), however, contain rich semantics, including large numbers of synonyms as well as their curated graphical structures. To take advantage of this valuable information, we propose a suite of learning tasks designed for training efficient zero-shot entity retrieval models. Without requiring any human annotation, our knowledge graph enriched architecture significantly outperforms common zero-shot benchmarks including BM25 and Clinical BERT with 7% to 30% higher recall across multiple major medical ontologies, such as UMLS, SNOMED, and ICD-10.

annotation, entity retrieval, retrieval, (11 more...)

arXiv.org Artificial Intelligence

2105.12682

Country: Oceania > Australia > Victoria > Melbourne (0.04)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Health Care Providers & Services (0.68)
Health & Medicine > Pharmaceuticals & Biotechnology (0.55)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.81)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.74)


Predicting Drug-Drug Interactions from Molecular Structure Images

Dhami, Devendra Singh, Kunapuli, Gautam, Page, David, Natarajan, Sriraam

arXiv.org Machine Learning, Nov-14-2019

Adverse drug events (ADEs) are "injuries resulting from medical intervention related to a drug" (Nebeker, Barach, and Samore 2004), and are distinct from medication errors (inappropriate prescription, dispensing, usage etc.) as they are caused by drugs at normal dosages. According to the National Center for Health Statistics (NCHS 2014), 48.9% of Americans took at least one prescription drug in the last 30 days, 23.1% took at least three, and 11.9% took at least

drug-drug interaction, interaction, siamese architecture, (14 more...)

arXiv.org Machine Learning

1911.06356

Country:

North America > United States > Texas > Dallas County > Dallas (0.05)
Europe (0.04)

Genre: Research Report > Experimental Study (0.89)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Data Science > Data Mining (0.69)


NeuralWarp: Time-Series Similarity with Warping Networks

Grabocka, Josif, Schmidt-Thieme, Lars

arXiv.org Machine Learning, Dec-19-2018

Research on time-series similarity measures has emphasized the need for elastic methods which align the indices of pairs of time series, and a plethora of non-parametric measures have been proposed for the task. On the other hand, deep learning approaches are dominant in closely related domains, such as learning image and text sentence similarity. In this paper, we propose NeuralWarp, a novel measure that models the alignment of time-series indices in a deep representation space, by modeling a warping function as an upper-level neural network between deeply-encoded time series values. Experimental results demonstrate that NeuralWarp outperforms both non-parametric and un-warped deep models on a range of diverse real-life datasets.
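For context on what an elastic, non-parametric measure looks like, here is a minimal sketch of dynamic time warping (DTW), a classic example of that family. It is not NeuralWarp and is not taken from the paper; the absolute-difference point cost and the lack of any window constraint are choices made for brevity.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    cost[i][j] holds the cheapest cumulative cost of aligning the first i
    points of a with the first j points of b.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the best of: step in a only, step in b only, step in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# The second series repeats one value; warping absorbs the repetition.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0
print(dtw_distance([0, 0], [1, 1]))           # 2.0
```

The alignment here is found by dynamic programming over raw values, whereas the abstract describes learning the warping function in a deep representation space.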

dataset, neuralwarp, similarity measure, (13 more...)

arXiv.org Machine Learning

1812.08306

Country:

Europe > Germany (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Alaska > Anchorage Municipality > Anchorage (0.05)
(5 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Media > Music (0.46)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


REGMAPR - Text Matching Made Easy

Brahma, Siddhartha

arXiv.org Artificial Intelligence, Aug-28-2018

We propose a simple model for textual matching problems. Starting from a Siamese architecture, we augment word embeddings with two features based on exact and paraphrase match between words in the two sentences being considered. We train the model using four types of regularization on datasets for textual entailment, paraphrase detection and semantic relatedness. Our model performs comparably to or better than more complex architectures, achieving state-of-the-art results for paraphrase detection on the SICK dataset and for textual entailment on the SNLI dataset.
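The feature augmentation described in the abstract above might look roughly like the following sketch. The abstract does not specify how the matches are computed, so the token-level exact match, the hand-written paraphrase pair set, and the function names are all hypothetical.

```python
def match_features(sent_a, sent_b, paraphrases):
    """For each token in sent_a, return (exact, paraphrase) flags w.r.t. sent_b.

    paraphrases is a set of (word, word) pairs, treated as unordered.
    """
    b_tokens = set(sent_b)
    feats = []
    for tok in sent_a:
        exact = 1.0 if tok in b_tokens else 0.0
        para = 1.0 if any(
            (tok, other) in paraphrases or (other, tok) in paraphrases
            for other in b_tokens
        ) else 0.0
        feats.append((exact, para))
    return feats

def augment(embeddings, feats):
    """Append the two match flags to each token's embedding vector."""
    return [vec + list(f) for vec, f in zip(embeddings, feats)]

# Hypothetical tokenized sentences and paraphrase pairs.
a = ["a", "man", "is", "running"]
b = ["a", "person", "runs"]
pairs = {("man", "person"), ("running", "runs")}

feats = match_features(a, b, pairs)
print(feats)  # [(1.0, 0.0), (0.0, 1.0), (0.0, 0.0), (0.0, 1.0)]

# Stand-in 2-dimensional embeddings, one per token of sentence a.
emb_a = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
print(augment(emb_a, feats)[1])  # [0.3, 0.4, 0.0, 1.0]
```

In a Siamese setup the same augmentation would be applied to the second sentence with the roles of the two sentences swapped, before both pass through the shared encoder.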

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

1808.04343

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.52)


Local Feature Descriptor Learning with Adaptive Siamese Network

Huang, Chong, Liu, Qiong, Chen, Yan-Ying, Cheng, Kwang-Ting

arXiv.org Machine Learning, Jun-16-2017

Although recent progress in deep neural networks has led to the development of learnable local feature descriptors, there is no explicit answer for estimating the necessary size of a neural network. Specifically, the local feature is represented in a low-dimensional space, so the neural network should have a more compact structure. The small networks required for local feature descriptor learning may be sensitive to initial conditions and learning parameters and more likely to become trapped in local minima. In order to address the above problem, we introduce an adaptive pruning Siamese Architecture based on neuron activation to learn local feature descriptors, making the network more computationally efficient with an improved recognition rate over more complex networks. Our experiments demonstrate that our learned local feature descriptors outperform the state-of-the-art methods in patch matching.

artificial intelligence, feature descriptor, machine learning, (12 more...)

arXiv.org Machine Learning

1706.05358

Country: North America > United States > California (0.29)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
