AITopics | anchor pair

Collaborating Authors

anchor pair

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

f8e55d98b0c2569bd0aa25b076e6b3f8-Paper-Conference.pdf

Neural Information Processing SystemsJun-23-2026, 03:37:48 GMT

State Space Models (SSMs) have emerged as promising alternatives to attention mechanisms, with the Mamba architecture demonstrating impressive performance and linear complexity for processing long sequences. However, the fundamental differences between Mamba and Transformer architectures remain incompletely understood. In this work, we use carefully designed synthetic tasks to reveal Mamba's inherent limitations. Through experiments, we identify that Mamba's nonlinear convolution introduces an asymmetry bias that significantly impairs its ability to recognize symmetrical patterns and relationships. Using composite function and inverse sequence matching tasks, we demonstrate that Mamba strongly favors compositional solutions over symmetrical ones and struggles with tasks requiring the matching of reversed sequences. We show these limitations stem not from the SSM module itself but from the nonlinear convolution preceding it, which fuses token information asymmetrically. These insights provide a new understanding of Mamba's constraints and suggest concrete architectural improvements for future sequence models.

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: Asia > China (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)

Add feedback

19c145aaad40927c51f4d10eaa339c20-Paper-Conference.pdf

Neural Information Processing SystemsFeb-8-2026, 17:24:40 GMT

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

Europe > Latvia > Lubāna Municipality > Lubāna (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

Add feedback

Initialization is Critical to Whether Transformers Fit Composite Functions by Reasoning or Memorizing

Neural Information Processing SystemsOct-9-2025, 19:54:35 GMT

Transformers have shown impressive capabilities across various tasks, but their performance on compositional problems remains a topic of debate. In this work, we investigate the mechanisms of how transformers behave on unseen compositional tasks. We discover that the parameter initialization scale plays a critical role in determining whether the model learns inferential (reasoning-based) solutions, which capture the underlying compositional primitives, or symmetric (memory-based) solutions, which simply memorize mappings without understanding the compositional structure. By analyzing the information flow and vector representations within the model, we reveal the distinct mechanisms underlying these solution types. We further find that inferential (reasoning-based) solutions exhibit low complexity bias, which we hypothesize is a key factor enabling them to learn individual mappings for single anchors.

anchor pair, initialization scale, mapping, (12 more...)

Neural Information Processing Systems

Country:

Asia > China > Shanghai > Shanghai (0.04)
Europe > Latvia > Lubāna Municipality > Lubāna (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)

Add feedback

Complexity Control Facilitates Reasoning-Based Compositional Generalization in Transformers

Zhang, Zhongwang, Lin, Pengxiao, Wang, Zhiwei, Zhang, Yaoyu, Xu, Zhi-Qin John

arXiv.org Artificial IntelligenceJan-14-2025

Transformers have demonstrated impressive capabilities across various tasks, yet their performance on compositional problems remains a subject of debate. In this study, we investigate the internal mechanisms underlying Transformers' behavior in compositional tasks. We find that complexity control strategies significantly influence whether the model learns primitive-level rules that generalize out-of-distribution (reasoning-based solutions) or relies solely on memorized mappings (memory-based solutions). By applying masking strategies to the model's information circuits and employing multiple complexity metrics, we reveal distinct internal working mechanisms associated with different solution types. Further analysis reveals that reasoning-based solutions exhibit a lower complexity bias, which aligns with the well-studied neuron condensation phenomenon. This lower complexity bias is hypothesized to be the key factor enabling these solutions to learn reasoning rules. We validate these conclusions across multiple real-world datasets, including image generation and natural language processing tasks, confirming the broad applicability of our findings.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2501.08537

Country: Asia > China (0.15)

Genre: Research Report > New Finding (0.68)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)

Add feedback

Initialization is Critical to Whether Transformers Fit Composite Functions by Inference or Memorizing

Zhang, Zhongwang, Lin, Pengxiao, Wang, Zhiwei, Zhang, Yaoyu, Xu, Zhi-Qin John

arXiv.org Artificial IntelligenceMay-24-2024

Transformers have shown impressive capabilities across various tasks, but their performance on compositional problems remains a topic of debate. In this work, we investigate the mechanisms of how transformers behave on unseen compositional tasks. We discover that the parameter initialization scale plays a critical role in determining whether the model learns inferential solutions, which capture the underlying compositional primitives, or symmetric solutions, which simply memorize mappings without understanding the compositional structure. By analyzing the information flow and vector representations within the model, we reveal the distinct mechanisms underlying these solution types. We further find that inferential solutions exhibit low complexity bias, which we hypothesize is a key factor enabling them to learn individual mappings for single anchors. Building upon the understanding of these mechanisms, we can predict the learning behavior of models with different initialization scales when faced with data of varying complexity. Our findings provide valuable insights into the role of initialization scale in shaping the type of solution learned by transformers and their ability to learn and generalize compositional tasks.

anchor pair, initialization scale, mapping, (13 more...)

arXiv.org Artificial Intelligence

2405.05409

Country:

Europe > Latvia > Lubāna Municipality > Lubāna (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Adaptive Anchor Pairs Selection in a TDOA-based System Through Robot Localization Error Minimization

Kolakowski, Marcin

arXiv.org Artificial IntelligenceApr-7-2024

The following paper presents an adaptive anchor pairs selection method for ultra-wideband (UWB) Time Difference of Arrival (TDOA) based positioning systems. The method divides the area covered by the system into several zones and assigns them anchor pair sets. The pair sets are determined during calibration based on localization root mean square error (RMSE). The calibration assumes driving a mobile platform equipped with a LiDAR sensor and a UWB tag through the specified zones. The robot is localized separately based on a large set of different TDOA pairs and using a LiDAR, which acts as the reference. For each zone, the TDOA pairs set for which the registered RMSE is lowest is selected and used for localization in the routine system work. The proposed method has been tested with simulations and experiments. The results for both simulated static and experimental dynamic scenarios have proven that the adaptive selection of the anchor nodes leads to an increase in localization accuracy. In the experiment, the median trajectory error for a moving person localization was at a level of 25 cm.

algorithm, anchor pair, tdoa pair, (10 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/SPSympo51155.2020.9593477

2404.05067

Country:

Europe > Poland > Masovia Province > Warsaw (0.05)
Europe > Poland > Łódź Province > Łódź (0.05)
Europe > Serbia > Central Serbia > Belgrade (0.04)
(2 more...)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Robots (0.89)

Add feedback

Deep Discriminative Latent Space for Clustering

Tzoreff, Elad, Kogan, Olga, Choukroun, Yoni

arXiv.org Artificial IntelligenceMay-28-2018

Clustering is one of the most fundamental tasks in data analysis and machine learning. It is central to many data-driven applications that aim to separate the data into groups with similar patterns. Moreover, clustering is a complex procedure that is affected significantly by the choice of the data representation method. Recent research has demonstrated encouraging clustering results by learning effectively these representations. In most of these works a deep auto-encoder is initially pre-trained to minimize a reconstruction loss, and then jointly optimized with clustering centroids in order to improve the clustering objective. Those works focus mainly on the clustering phase of the procedure, while not utilizing the potential benefit out of the initial phase. In this paper we propose to optimize an auto-encoder with respect to a discriminative pairwise loss function during the auto-encoder pre-training phase. We demonstrate the high accuracy obtained by the proposed method as well as its rapid convergence (e.g.

artificial intelligence, machine learning, similarity, (17 more...)

arXiv.org Artificial Intelligence

1805.10795

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback