
 Bojkovic, Velibor


Number Representations in LLMs: A Computational Parallel to Human Perception

arXiv.org Artificial Intelligence

Humans are believed to perceive numbers on a logarithmic mental number line, where smaller values are represented with greater resolution than larger ones. This cognitive bias, supported by neuroscience and behavioral studies, suggests that numerical magnitudes are processed in a sublinear fashion rather than on a uniform linear scale. Inspired by this hypothesis, we investigate whether large language models (LLMs) exhibit a similar logarithmic-like structure in their internal numerical representations. To analyze how numerical values are encoded across the layers of LLMs, we apply dimensionality reduction techniques such as PCA and PLS, followed by geometric regression, to uncover latent structures in the learned embeddings. Our findings reveal that the models' numerical representations exhibit sublinear spacing, with distances between values aligning with a logarithmic scale. This suggests that LLMs, much like humans, may encode numbers in a compressed, non-uniform manner.
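As a rough illustration of this kind of analysis (my own sketch, not the authors' code), one can project number embeddings onto their first principal component and compare how well linear versus logarithmic spacing explains the projected positions. The `spacing_fit` helper and the synthetic embeddings below are assumptions for illustration only:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

def spacing_fit(number_embeddings, values):
    """number_embeddings: (N, d) array, one row per numeric token; values: (N,)."""
    proj = PCA(n_components=1).fit_transform(number_embeddings).ravel()
    scores = {}
    for name, x in [("linear", values), ("log", np.log(values))]:
        reg = LinearRegression().fit(x.reshape(-1, 1), proj)
        scores[name] = reg.score(x.reshape(-1, 1), proj)  # R^2 of each spacing model
    return scores

# Toy demo: synthetic embeddings whose dominant direction is log-spaced.
values = np.arange(1, 101, dtype=float)
rng = np.random.default_rng(0)
emb = np.outer(np.log(values), rng.normal(size=64)) + 0.05 * rng.normal(size=(100, 64))
print(spacing_fit(emb, values))  # the "log" R^2 should dominate here
```

On real model embeddings, a markedly higher R^2 for the logarithmic fit would be the signature of the sublinear spacing the paper reports.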


RECALL: Library-Like Behavior In Language Models is Enhanced by Self-Referencing Causal Cycles

arXiv.org Artificial Intelligence

We introduce the concept of the self-referencing causal cycle (abbreviated RECALL) - a mechanism that enables large language models (LLMs) to bypass the limitations of unidirectional causality, which underlies a phenomenon known as the reversal curse. When an LLM is prompted with sequential data, it often fails to recall preceding context. For example, when we ask an LLM to recall the line preceding "O say does that star-spangled banner yet wave" in the U.S. National Anthem, it often fails to correctly return "Gave proof through the night that our flag was still there" - this is due to the reversal curse. It occurs because language models such as ChatGPT and Llama generate text based on preceding tokens, requiring facts to be learned and reproduced in a consistent token order. While the reversal curse is often viewed as a limitation, we offer evidence of an alternative view: it is not always an obstacle in practice. We find that RECALL is driven by what we designate as cycle tokens - sequences that connect different parts of the training data, enabling recall of preceding tokens from succeeding ones. Through rigorous probabilistic formalization and controlled experiments, we demonstrate how the cycles these tokens induce influence a model's ability to reproduce information. To facilitate reproducibility, we provide our code and experimental details at https://anonymous.4open.science/r/remember-B0B8/.
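A toy illustration of the underlying intuition (my own, not from the paper): a bigram next-token model stores only forward transitions, so it has no direct way to answer "what precedes X"; but when a passage repeats in the data, its last token is followed by its first, and forward generation loops back through the preceding context. The `bigram_counts` helper below is hypothetical:

```python
from collections import defaultdict

def bigram_counts(tokens):
    """Count forward (a -> b) transitions, the only thing a causal model stores."""
    nxt = defaultdict(lambda: defaultdict(int))
    for a, b in zip(tokens, tokens[1:]):
        nxt[a][b] += 1
    return nxt

def most_likely_next(nxt, token):
    if token not in nxt:
        return None
    return max(nxt[token], key=nxt[token].get)

# Without repetition the model knows A -> B -> C; forward recall works,
# but "what precedes B?" would need an inverted table it never learns.
linear = bigram_counts("A B C".split())
print(most_likely_next(linear, "B"))   # 'C'

# With a repeated passage, 'C' is followed by 'A' again: generating forward
# from a *succeeding* token loops back through the preceding context.
cyclic = bigram_counts("A B C A B C".split())
print(most_likely_next(cyclic, "C"))   # 'A'  (the cycle enables recall)
```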


A Decade of Deep Learning: A Survey on The Magnificent Seven

arXiv.org Artificial Intelligence

At the core of this transformation is the development of multi-layered neural network architectures that facilitate automatic feature extraction from raw data, significantly improving efficiency on machine learning tasks. Given the rapid pace of these advancements, an accessible manual is necessary to distill the key advances of the past decade. With this in mind, we introduce a study that highlights the evolution of deep learning, largely attributed to powerful algorithms. Among the multitude of breakthroughs, certain algorithms, including Residual Networks (ResNets), Transformers, Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), Graph Neural Networks (GNNs), Contrastive Language-Image Pretraining (CLIP), and Diffusion models, have emerged as the cornerstones and driving forces behind the discipline. We selected these algorithms via a survey targeting a broad spectrum of academics and professionals, with the aim of encapsulating the essence of the most influential algorithms of the past decade. In this work, we detail the selection methodology and place the chosen architectures in the broader history of deep learning. We present an overview of the selected core architectures and their mathematical underpinnings, along with the algorithmic procedures that define their subsequent extensions and variants, their applications, and their challenges and potential future research directions. In addition, we explore the practical aspects of these algorithms, such as training and optimization methods, normalization techniques, and learning-rate scheduling strategies that are essential for their effective implementation. Our manuscript therefore serves as a practical survey for understanding and applying these crucial algorithms, and aims to be a manual for experienced researchers transitioning into deep learning from other domains as well as for beginners seeking to grasp the trending algorithms.
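As a concrete taste of one of these cornerstones, here is a minimal residual block in PyTorch (an illustrative sketch of my own, not code from the survey): the block learns a correction F(x) and outputs x + F(x), the identity-shortcut idea that keeps gradients flowing through very deep stacks.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic ResNet-style block: output = ReLU(x + F(x))."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return torch.relu(x + self.body(x))  # identity shortcut + learned residual

x = torch.randn(1, 16, 8, 8)
print(ResidualBlock(16)(x).shape)  # torch.Size([1, 16, 8, 8])
```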


FTBC: Forward Temporal Bias Correction for Optimizing ANN-SNN Conversion

arXiv.org Artificial Intelligence

Spiking Neural Networks (SNNs) offer a promising avenue for energy-efficient computing compared with Artificial Neural Networks (ANNs), closely mirroring biological neural processes. However, this potential comes with inherent challenges in directly training SNNs through spatio-temporal backpropagation -- stemming from the temporal dynamics of spiking neurons and their discrete signal processing -- which necessitates alternative training approaches, most notably ANN-SNN conversion. In this work, we introduce a lightweight Forward Temporal Bias Correction (FTBC) technique aimed at enhancing conversion accuracy without additional computational overhead. We ground our method in theoretical findings showing that, with proper temporal bias calibration, the expected error of ANN-SNN conversion can be reduced to zero after each time step. We further propose a heuristic algorithm that finds the temporal bias in the forward pass only, eliminating the computational burden of backpropagation, and we evaluate our method on CIFAR-10/100 and ImageNet, achieving a notable increase in accuracy on all datasets. Code is released in a GitHub repository.
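A hedged toy reading of the idea (my own sketch, not the authors' implementation): simulate an integrate-and-fire layer for T steps and, in a single forward calibration pass, nudge a per-time-step bias so the running firing rate tracks the target ANN activation. The soft-reset neuron, the (t+1) scaling, and the synthetic target below are all assumptions for illustration:

```python
import numpy as np

def if_forward(inp, bias, threshold=1.0, T=8):
    """Rate-coded integrate-and-fire layer for T steps,
    adding a per-time-step bias before thresholding."""
    v = np.zeros_like(inp)                     # membrane potential
    spikes = np.zeros_like(inp)
    for t in range(T):
        v += inp + bias[t]                     # integrate input plus temporal bias
        fired = (v >= threshold).astype(float)
        v -= fired * threshold                 # soft reset
        spikes += fired
    return spikes * threshold / T              # firing rate as the layer output

rng = np.random.default_rng(0)
ann_act = np.clip(rng.normal(0.5, 0.2, size=1000), 0.0, None)  # ReLU-like target
T = 8
bias = np.zeros(T)
for t in range(T):                             # forward-only calibration
    rate = if_forward(ann_act, bias, T=t + 1)  # rate after the first t+1 steps
    # One unit of bias injected at a single step shifts the (t+1)-step rate
    # by roughly 1/(t+1), so scale the mean residual accordingly.
    bias[t] += (t + 1) * np.mean(ann_act - rate)
print(np.abs(if_forward(ann_act, bias, T=T) - ann_act).mean())  # residual error
```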


Energy Efficient Training of SNN using Local Zeroth Order Method

arXiv.org Artificial Intelligence

Spiking neural networks (SNNs) are becoming increasingly popular for their low energy requirements on real-world tasks, with accuracy comparable to traditional ANNs. SNN training algorithms, however, face a loss of gradient information and non-differentiability due to the Heaviside spiking function when minimizing the model loss over the model parameters. To circumvent this problem, the surrogate method uses a differentiable approximation of the Heaviside function in the backward pass, while the forward pass continues to use the Heaviside as the spiking function. We propose to use a zeroth-order technique at the neuron level to resolve this dichotomy, and we use it within an automatic differentiation tool. As a result, we establish a theoretical connection between the proposed local zeroth-order technique and existing surrogate methods. The proposed method naturally lends itself to energy-efficient training of SNNs on GPUs. Experimental results on neuromorphic datasets show that such an implementation requires fewer than 1% of neurons to be active in the backward pass, resulting in a 100x speed-up in backward computation time. Our method offers better generalization than the state-of-the-art energy-efficient technique while maintaining similar efficiency.
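A minimal sketch of the dichotomy and one way to resolve it (my illustration, not the paper's code): the forward pass emits exact Heaviside spikes, while the backward pass uses a local two-point zeroth-order estimate of the step function, which in expectation is a rectangular window of width 2*delta around the threshold. Neurons outside the window receive exactly zero gradient, which is what makes the backward pass sparse:

```python
import torch

class ZOSpike(torch.autograd.Function):
    @staticmethod
    def forward(ctx, v, threshold=1.0, delta=0.05):
        ctx.save_for_backward(v)
        ctx.threshold, ctx.delta = threshold, delta
        return (v >= threshold).float()        # exact Heaviside spike

    @staticmethod
    def backward(ctx, grad_out):
        (v,) = ctx.saved_tensors
        # Two-point estimate (H(v+delta) - H(v-delta)) / (2*delta): nonzero
        # only for neurons within delta of threshold, zero everywhere else.
        window = (torch.abs(v - ctx.threshold) < ctx.delta).float() / (2 * ctx.delta)
        return grad_out * window, None, None   # no grads for threshold/delta

v = torch.randn(100_000, requires_grad=True)
spikes = ZOSpike.apply(v, 1.0, 0.05)
spikes.sum().backward()
print((v.grad != 0).float().mean().item())  # fraction of "active" neurons (a few %)
```

Shrinking `delta` shrinks the fraction of neurons with nonzero gradient, which is the lever behind the sparse, energy-efficient backward pass the abstract describes.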