AITopics | Oseledets, Ivan

Collaborating Authors

Oseledets, Ivan

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Dynamic Collaborative Filtering for Matrix- and Tensor-based Recommender Systems

Saiapin, Albert, Oseledets, Ivan, Frolov, Evgeny

arXiv.org Artificial IntelligenceDec-4-2023

In production applications of recommender systems, a continuous data flow is employed to update models in real-time. Many recommender models often require complete retraining to adapt to new data. In this work, we introduce a novel collaborative filtering model for sequential problems known as Tucker Integrator Recommender - TIRecA. TIRecA efficiently updates its parameters using only the new data segment, allowing incremental addition of new users and items to the recommender system. To demonstrate the effectiveness of the proposed model, we conducted experiments on four publicly available datasets: MovieLens 20M, Amazon Beauty, Amazon Toys and Games, and Steam. Our comparison with general matrix and tensor-based baselines in terms of prediction quality and computational time reveals that TIRecA achieves comparable quality to the baseline methods, while being 10-20 times faster in training time.

artificial intelligence, machine learning, tireca, (18 more...)

arXiv.org Artificial Intelligence

2312.10064

Country:

Europe (0.46)
North America > United States > New York (0.14)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)

Add feedback

The Shape of Learning: Anisotropy and Intrinsic Dimensions in Transformer-Based Models

Razzhigaev, Anton, Mikhalchuk, Matvey, Goncharova, Elizaveta, Oseledets, Ivan, Dimitrov, Denis, Kuznetsov, Andrey

arXiv.org Artificial IntelligenceNov-10-2023

In this study, we present an investigation into the anisotropy dynamics and intrinsic dimension of embeddings in transformer architectures, focusing on the dichotomy between encoders and decoders. Our findings reveal that the anisotropy profile in transformer decoders exhibits a distinct bell-shaped curve, with the highest anisotropy concentrations in the middle layers. This pattern diverges from the more uniformly distributed anisotropy observed in encoders. In addition, we found that the intrinsic dimension of embeddings increases in the initial phases of training, indicating an expansion into higher-dimensional space. Which is then followed by a compression phase towards the end of training with dimensionality decrease, suggesting a refinement into more compact representations. Our results provide fresh insights to the understanding of encoders and decoders embedding properties.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2311.05928

Country:

North America > Canada (0.29)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Spectal Harmonics: Bridging Spectral Embedding and Matrix Completion in Self-Supervised Learning

Munkhoeva, Marina, Oseledets, Ivan

arXiv.org Artificial IntelligenceOct-30-2023

Self-supervised methods received tremendous attention thanks to their seemingly heuristic approach to learning representations that respect the semantics of the data without any apparent supervision in the form of labels. A growing body of literature is already being published in an attempt to build a coherent and theoretically grounded understanding of the workings of a zoo of losses used in modern self-supervised representation learning methods. In this paper, we attempt to provide an understanding from the perspective of a Laplace operator and connect the inductive bias stemming from the augmentation process to a low-rank matrix completion problem. To this end, we leverage the results from low-rank matrix completion to provide theoretical analysis on the convergence of modern SSL methods and a key property that affects their downstream performance.

artificial intelligence, machine learning, representation, (15 more...)

arXiv.org Artificial Intelligence

2305.19818

Country: Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Add feedback

Memory-efficient particle filter recurrent neural network for object localization

Korkin, Roman, Oseledets, Ivan, Katrutsa, Aleksandr

arXiv.org Artificial IntelligenceOct-2-2023

This study proposes a novel memory-efficient recurrent neural network (RNN) architecture specified to solve the object localization problem. This problem is to recover the object states along with its movement in a noisy environment. We take the idea of the classical particle filter and combine it with GRU RNN architecture. The key feature of the resulting memory-efficient particle filter RNN model (mePFRNN) is that it requires the same number of parameters to process environments of different sizes. Thus, the proposed mePFRNN architecture consumes less memory to store parameters compared to the previously proposed PFRNN model. To demonstrate the performance of our model, we test it on symmetric and noisy environments that are incredibly challenging for filtering algorithms. In our experiments, the mePFRNN model provides more precise localization than the considered competitors and requires fewer trained parameters.

artificial intelligence, machine learning, particle, (17 more...)

arXiv.org Artificial Intelligence

2310.01595

Country: Europe > Austria > Vienna (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Add feedback

NAG-GS: Semi-Implicit, Accelerated and Robust Stochastic Optimizer

Leplat, Valentin, Merkulov, Daniil, Katrutsa, Aleksandr, Bershatsky, Daniel, Tsymboi, Olga, Oseledets, Ivan

arXiv.org Artificial IntelligenceSep-30-2023

Classical machine learning models such as deep neural networks are usually trained by using Stochastic Gradient Descent-based (SGD) algorithms. The classical SGD can be interpreted as a discretization of the stochastic gradient flow. In this paper we propose a novel, robust and accelerated stochastic optimizer that relies on two key elements: (1) an accelerated Nesterov-like Stochastic Differential Equation (SDE) and (2) its semi-implicit Gauss-Seidel type discretization. The convergence and stability of the obtained method, referred to as NAG-GS, are first studied extensively in the case of the minimization of a quadratic function. This analysis allows us to come up with an optimal learning rate in terms of the convergence rate while ensuring the stability of NAG-GS. This is achieved by the careful analysis of the spectral radius of the iteration matrix and the covariance matrix at stationarity with respect to all hyperparameters of our method. Further, we show that NAG- GS is competitive with state-of-the-art methods such as momentum SGD with weight decay and AdamW for the training of machine learning models such as the logistic regression model, the residual networks models on standard computer vision datasets, Transformers in the frame of the GLUE benchmark and the recent Vision Transformers.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2209.14937

Country:

Europe > Russia (0.14)
Asia (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.54)

Add feedback

Quantization Aware Factorization for Deep Neural Network Compression

Cherniuk, Daria, Abukhovich, Stanislav, Phan, Anh-Huy, Oseledets, Ivan, Cichocki, Andrzej, Gusak, Julia

arXiv.org Artificial IntelligenceAug-8-2023

Tensor decomposition of convolutional and fully-connected layers is an effective way to reduce parameters and FLOP in neural networks. Due to memory and power consumption limitations of mobile or embedded devices, the quantization step is usually necessary when pre-trained models are deployed. A conventional post-training quantization approach applied to networks with decomposed weights yields a drop in accuracy. This motivated us to develop an algorithm that finds tensor approximation directly with quantized factors and thus benefit from both compression techniques while keeping the prediction quality of the model. Namely, we propose to use Alternating Direction Method of Multipliers (ADMM) for Canonical Polyadic (CP) decomposition with factors whose elements lie on a specified quantization grid. We compress neural network weights with a devised algorithm and evaluate it's prediction quality and performance. We compare our approach to state-of-the-art post-training quantization methods and demonstrate competitive results and high flexibility in achiving a desirable quality-performance tradeoff.

artificial intelligence, machine learning, quantization, (17 more...)

arXiv.org Artificial Intelligence

2308.04595

Genre:

Research Report (0.50)
Overview (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.83)

Add feedback

Efficient GPT Model Pre-training using Tensor Train Matrix Representation

Chekalina, Viktoriia, Novikov, Georgii, Gusak, Julia, Oseledets, Ivan, Panchenko, Alexander

arXiv.org Artificial IntelligenceJun-5-2023

Large-scale transformer models have shown remarkable performance in language modelling tasks. However, such models feature billions of parameters, leading to difficulties in their deployment and prohibitive training costs from scratch. To reduce the number of the parameters in the GPT-2 architecture, we replace the matrices of fully-connected layers with the corresponding Tensor Train Matrix~(TTM) structure. Finally, we customize forward and backward operations through the TTM-based layer for simplicity and the stableness of further training. % The resulting GPT-2-based model stores up to 40% fewer parameters, showing the perplexity comparable to the original model. On the downstream tasks, including language understanding and text summarization, the model performs similarly to the original GPT-2 model. The proposed tensorized layers could be used to efficiently pre-training other Transformer models.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2306.02697

Country: Europe > Germany (0.14)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Add feedback

Translate your gibberish: black-box adversarial attack on machine translation systems

Chertkov, Andrei, Tsymboi, Olga, Pautov, Mikhail, Oseledets, Ivan

arXiv.org Artificial IntelligenceMay-23-2023

Neural networks are deployed widely in natural language processing tasks on the industrial scale, and perhaps the most often they are used as compounds of automatic machine translation systems. In this work, we present a simple approach to fool state-of-the-art machine translation tools in the task of translation from Russian to English and vice versa. Using a novel black-box gradient-free tensor-based optimizer, we show that many online translation tools, such as Google, DeepL, and Yandex, may both produce wrong or offensive translations for nonsensical adversarial input queries and refuse to translate seemingly benign input phrases. This vulnerability may interfere with understanding a new language and simply worsen the user's experience while using machine translation systems, and, hence, additional improvements of these tools are required to establish better translation.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2303.10974

Country:

Europe (0.69)
Asia (0.68)
North America > United States (0.68)

Genre: Research Report (0.40)

Industry:

Education (0.68)
Transportation > Air (0.62)
Information Technology > Security & Privacy (0.43)
Government > Military (0.43)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

TT-SDF2PC: Registration of Point Cloud and Compressed SDF Directly in the Memory-Efficient Tensor Train Domain

Boyko, Alexey I., Kornilova, Anastasiia, Tariverdizadeh, Rahim, Musavian, Mirfarid, Markeeva, Larisa, Oseledets, Ivan, Ferrer, Gonzalo

arXiv.org Artificial IntelligenceApr-11-2023

Abstract-- This paper addresses the following research question: "can one compress a detailed 3D representation and use it directly for point cloud registration?". Map compression of the scene can be achieved by the tensor train (TT) decomposition of the signed distance function (SDF) representation. It regulates the amount of data reduced by the so-called TT-ranks. Using this representation we have proposed an algorithm, the TT-SDF2PC, that is capable of directly registering a PC to the compressed SDF by making use of efficient calculations of its derivatives in the TT domain, saving computations and memory. We compare TT-SDF2PC with SOTA local and global registration methods in a synthetic dataset and a real dataset and show on par performance while requiring significantly less resources.

artificial intelligence, machine learning, registration, (16 more...)

arXiv.org Artificial Intelligence

2304.05342

Country: Europe > Austria (0.28)

Genre: Research Report > Experimental Study > Negative Result (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Multiparticle Kalman filter for object localization in symmetric environments

Korkin, Roman, Oseledets, Ivan, Katrutsa, Aleksandr

arXiv.org Artificial IntelligenceMar-14-2023

This study considers the object localization problem and proposes a novel multiparticle Kalman filter to solve it in complex and symmetric environments. Two well-known classes of filtering algorithms to solve the localization problem are Kalman filter-based methods and particle filter-based methods. We consider these classes, demonstrate their complementary properties, and propose a novel filtering algorithm that takes the best from two classes. We evaluate the multiparticle Kalman filter in symmetric and noisy environments. Such environments are especially challenging for both classes of classical methods. We compare the proposed approach with the particle filter since only this method is feasible if the initial state is unknown. In the considered challenging environments, our method outperforms the particle filter in terms of both localization error and runtime.

evolutionary algorithm, machine learning, particle filter, (18 more...)

arXiv.org Artificial Intelligence

2303.07897

Country:

Europe > Russia (0.29)
Asia > Russia (0.28)

Genre: Research Report (0.50)

Industry: Transportation (0.46)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)

Add feedback