Chertkov, Andrei
Faster Language Models with Better Multi-Token Prediction Using Tensor Decomposition
Basharin, Artem, Chertkov, Andrei, Oseledets, Ivan
We propose a new model for multi-token prediction in transformers, aiming to enhance sampling efficiency without compromising accuracy. Motivated by recent work that predicts the probabilities of subsequent tokens using multiple heads, we connect this approach to rank-$1$ canonical tensor decomposition. By generalizing it to a rank-$r$ canonical probability decomposition, we develop an improved model that predicts multiple tokens simultaneously. This model can also be interpreted as a mixture of experts, allowing us to leverage successful techniques from that domain for efficient and robust training. Importantly, the overall overhead for training and sampling remains low. Our method demonstrates significant improvements in inference speed for both text and code generation tasks, proving particularly beneficial within the self-speculative decoding paradigm. It maintains its effectiveness across various model sizes and training epochs, highlighting its robustness and scalability.
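As a rough illustration of the rank-$r$ construction (a sketch with assumed interfaces, not the authors' implementation), the joint distribution over the next $k$ tokens can be written as $\sum_{\alpha=1}^{r} w_\alpha \prod_{j=1}^{k} p_j^{(\alpha)}(x_{t+j})$, i.e., a mixture of $r$ experts whose per-position distributions factorize; sampling then draws one expert from the gate and each position independently:

```python
# Minimal sketch (assumed interfaces, not the authors' code): rank-r canonical
# decomposition of the joint distribution over the next k tokens, read as a
# mixture of r experts.  `gate` maps the hidden state to r mixture logits and
# `heads[alpha][j]` maps it to vocabulary logits for position j.
import torch

def sample_next_k(hidden, gate, heads):
    """hidden: (d,) last hidden state; gate: Linear(d, r); heads: r x k Linear(d, V)."""
    mix = torch.softmax(gate(hidden), dim=-1)        # (r,) expert weights
    alpha = torch.multinomial(mix, 1).item()         # draw one expert
    tokens = []
    for head in heads[alpha]:                        # positions factorize given the expert
        probs = torch.softmax(head(hidden), dim=-1)  # (V,) distribution for this position
        tokens.append(torch.multinomial(probs, 1).item())
    return tokens

def joint_probability(hidden, gate, heads, tokens):
    """Exact joint probability of a k-token continuation under the rank-r model:
    p(tokens) = sum_alpha w_alpha * prod_j p_j^{(alpha)}(tokens[j])."""
    mix = torch.softmax(gate(hidden), dim=-1)
    total = torch.zeros(())
    for alpha, head_group in enumerate(heads):
        p = mix[alpha]
        for head, t in zip(head_group, tokens):
            p = p * torch.softmax(head(hidden), dim=-1)[t]
        total = total + p
    return total
```

With $r = 1$ this reduces to independent per-position heads; the full $V^k$ joint tensor is never materialized, which keeps the sampling overhead low.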
Black-Box Approximation and Optimization with Hierarchical Tucker Decomposition
Ryzhakov, Gleb, Chertkov, Andrei, Basharin, Artem, Oseledets, Ivan
We develop a new method HTBB for the multidimensional black-box approximation and gradient-free optimization, which is based on the low-rank hierarchical Tucker decomposition with the use of the MaxVol indices selection procedure. Numerical experiments for 14 complex model problems demonstrate the robustness of the proposed method for dimensions up to 1000, while it shows significantly more accurate results than classical gradient-free optimization methods, as well as approximation and optimization methods based on the popular tensor train decomposition, which represents a simpler case of a tensor network.
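For context, below is a minimal sketch of the MaxVol row-selection step mentioned in the abstract (illustrative only; the actual HTBB implementation applies it to unfoldings inside the hierarchical Tucker cross-approximation):

```python
# Greedy MaxVol sketch: pick r rows of a tall (n x r) matrix A whose r x r
# submatrix has locally maximal |det|.  Assumes the initial block A[:r] is
# nonsingular; a pivoted initialization is used in practice.
import numpy as np

def maxvol(A, tol=1.05, max_iters=100):
    n, r = A.shape
    idx = np.arange(r)                        # naive initial row set
    for _ in range(max_iters):
        B = A @ np.linalg.inv(A[idx])         # coefficients of all rows in the chosen basis
        i, j = np.unravel_index(np.argmax(np.abs(B)), B.shape)
        if abs(B[i, j]) <= tol:               # no swap increases the volume enough
            break
        idx[j] = i                            # row i replaces the j-th selected row
    return idx
```

In cross-type low-rank methods, the selected indices determine which tensor entries are requested from the black box next, keeping the number of function evaluations small.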
Fast gradient-free activation maximization for neurons in spiking neural networks
Pospelov, Nikita, Chertkov, Andrei, Beketov, Maxim, Oseledets, Ivan, Anokhin, Konstantin
Neural networks (NNs), both living and artificial, work due to being complex systems of neurons, each having its own specialization. Revealing these specializations is important for understanding the inner working mechanisms of NNs. For a living system, whose neural response to a stimulus is not a known (let alone differentiable) function, the only way to do this is to build a feedback loop that exposes the system to stimuli whose properties are iteratively varied in the direction of maximal response. To test such a loop on a living network, one should first learn how to run it quickly and efficiently, reaching the most effective stimuli (those that maximize the activation of certain neurons) in the least possible number of iterations. We present a framework with an effective design of such a loop and successfully test it on an artificial spiking neural network (SNN), a model that mimics the behaviour of NNs in living brains. Our optimization method for activation maximization (AM) is based on a low-rank tensor decomposition (Tensor Train, TT) of the activation function's discretization over its domain: the latent parameter space of stimuli (CIFAR10-size color images, generated by either VQ-VAE or SN-GAN from their latent description vectors and fed to the SNN). To our knowledge, the present work is the first attempt to perform effective AM for SNNs. The source code of our framework, MANGO (Maximization of neural Activation via Non-Gradient Optimization), is available on GitHub.
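For intuition, here is a toy version of the activation-maximization loop over a discretized latent space; `generator` and `snn_activation` are hypothetical stand-ins for the image generator and the target neuron's response, and the TT-based optimizer of MANGO is replaced by a naive alternating coordinate sweep:

```python
# Illustrative sketch only: black-box activation maximization over a
# discretized latent space.  The paper's framework uses a Tensor-Train based
# gradient-free optimizer; a simple coordinate sweep stands in for it here.
import numpy as np

def activation_maximization(generator, snn_activation, d=64, n_grid=32, sweeps=3, rng=None):
    """Maximize a black-box neuron response over a d-dimensional latent space,
    each coordinate discretized into n_grid values on [-1, 1]."""
    rng = rng or np.random.default_rng(0)
    grid = np.linspace(-1.0, 1.0, n_grid)          # shared 1D grid per latent coordinate
    idx = rng.integers(n_grid, size=d)             # random starting multi-index

    def score(multi_index):
        z = grid[multi_index]                      # latent vector for this grid point
        return snn_activation(generator(z))        # black-box objective, no gradients

    best = score(idx)
    for _ in range(sweeps):                        # alternating one-coordinate updates
        for k in range(d):
            trial = idx.copy()
            vals = []
            for g in range(n_grid):                # try every grid value for coordinate k
                trial[k] = g
                vals.append(score(trial))
            idx[k] = int(np.argmax(vals))
            best = max(best, vals[idx[k]])
    return grid[idx], best
```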
Translate your gibberish: black-box adversarial attack on machine translation systems
Chertkov, Andrei, Tsymboi, Olga, Pautov, Mikhail, Oseledets, Ivan
Neural networks are widely deployed in natural language processing tasks at industrial scale, perhaps most often as components of automatic machine translation systems. In this work, we present a simple approach to fool state-of-the-art machine translation tools in the task of translation from Russian to English and vice versa. Using a novel black-box gradient-free tensor-based optimizer, we show that many online translation tools, such as Google, DeepL, and Yandex, may both produce wrong or offensive translations for nonsensical adversarial input queries and refuse to translate seemingly benign input phrases. This vulnerability may interfere with understanding a new language and simply worsens the user experience with machine translation systems; hence, additional improvements of these tools are required to establish better translation.
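A heavily simplified sketch of the black-box attack loop is given below; `translate` and `failure_score` are hypothetical stand-ins for the queried translation system and the attack objective, and the paper uses a tensor-based gradient-free optimizer rather than the naive hill climbing shown here:

```python
# Illustrative sketch only: search over short character sequences to maximize
# a scalar "failure" score of the resulting translation.
import random
import string

def attack(translate, failure_score, length=8, budget=500, seed=0):
    rng = random.Random(seed)
    alphabet = list(string.ascii_lowercase + " ")        # stand-in alphabet for the sketch
    query = [rng.choice(alphabet) for _ in range(length)]
    best_score = failure_score(translate("".join(query)))
    for _ in range(budget):
        pos = rng.randrange(length)                      # mutate one position at a time
        old = query[pos]
        query[pos] = rng.choice(alphabet)
        score = failure_score(translate("".join(query)))
        if score > best_score:                           # keep the mutation only if it helps
            best_score = score
        else:
            query[pos] = old
    return "".join(query), best_score
```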