Exact Sequence Classification with Hardmax Transformers
Alcalde, Albert, Fantuzzi, Giovanni, Zuazua, Enrique
We prove that hardmax attention transformers perfectly classify datasets of $N$ labeled sequences in $\mathbb{R}^d$, $d\geq 2$. Specifically, given $N$ sequences of arbitrary but finite length in $\mathbb{R}^d$, we construct a transformer with $\mathcal{O}(N)$ blocks and $\mathcal{O}(Nd)$ parameters that perfectly classifies this dataset. Our construction achieves the best complexity estimate to date, independent of the length of the sequences, by innovatively alternating feed-forward and self-attention layers and by capitalizing on the clustering effect inherent to the latter. Our novel constructive method also uses low-rank parameter matrices within the attention mechanism, a common practice in real-life transformer implementations. Consequently, our analysis has twofold significance: it substantially advances the mathematical theory of transformers and it rigorously justifies their exceptional real-world performance in sequence classification tasks.
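For orientation, hardmax attention replaces the usual softmax weights with a uniform average over the highest-scoring tokens. One common way to write a single hardmax self-attention sublayer acting on the tokens $x_1,\dots,x_n \in \mathbb{R}^d$ of an input sequence, with generic query, key, and value matrices $Q$, $K$, $V$ (illustrative notation, not necessarily the paper's exact construction), is
$$ x_i \;\longmapsto\; x_i + \frac{1}{|\mathcal{C}_i|} \sum_{j \in \mathcal{C}_i} V x_j, \qquad \mathcal{C}_i = \operatorname*{arg\,max}_{1 \le j \le n} \, \langle Q x_i, K x_j \rangle. $$
The low-rank parameter matrices mentioned in the abstract are the matrices entering this mechanism.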
Clustering in pure-attention hardmax transformers and its role in sentiment analysis
Alcalde, Albert, Fantuzzi, Giovanni, Zuazua, Enrique
Transformers are extremely successful machine learning models whose mathematical properties remain poorly understood. Here, we rigorously characterize the behavior of transformers with hardmax self-attention and normalization sublayers as the number of layers tends to infinity. By viewing such transformers as discrete-time dynamical systems describing the evolution of points in a Euclidean space, and thanks to a geometric interpretation of the self-attention mechanism based on hyperplane separation, we show that the transformer inputs asymptotically converge to a clustered equilibrium determined by special points called leaders. We then leverage this theoretical understanding to solve sentiment analysis problems from language processing using a fully interpretable transformer model, which effectively captures `context' by clustering meaningless words around leader words carrying the most meaning. Finally, we outline remaining challenges to bridge the gap between the mathematical analysis of transformers and their real-life implementation.
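As a rough, self-contained illustration of the clustering phenomenon described above, the following sketch iterates a simplified hardmax-style update on random points in the plane and shows them collapsing onto a few attractors. The simplifications are our own assumptions, not the paper's model: there is no normalization sublayer, the score matrix `A` is the identity, and the step size `alpha` and helper name `hardmax_step` are hypothetical.

```python
import numpy as np

def hardmax_step(X, A, alpha=0.5):
    """One simplified hardmax self-attention update on the rows of X.

    Each point x_i moves a fraction alpha of the way toward the mean of the
    points x_j attaining the maximal score <A x_i, x_j>; ties are averaged,
    which is the defining feature of the hardmax rule.
    """
    scores = X @ A.T @ X.T                    # scores[i, j] = <A x_i, x_j>
    X_new = np.empty_like(X)
    for i in range(X.shape[0]):
        top = np.flatnonzero(np.isclose(scores[i], scores[i].max()))
        X_new[i] = (1.0 - alpha) * X[i] + alpha * X[top].mean(axis=0)
    return X_new

# Toy experiment: random points in the plane tend to collapse onto a few
# cluster points, mirroring the clustering behavior described in the abstract.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
A = np.eye(2)                                 # identity score matrix, chosen for simplicity
for _ in range(200):
    X = hardmax_step(X, A)
print(np.unique(np.round(X, 3), axis=0))      # typically only a few distinct points remain
```

In this toy dynamics the points stay inside the convex hull of the initial data, and the surviving distinct points play the role of the "leaders" around which the remaining points cluster; the paper's analysis makes this picture precise for the full model with normalization sublayers.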