AITopics | Oseledets, Ivan

Plotting

Oseledets, Ivan

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Federated Privacy-preserving Collaborative Filtering for On-Device Next App Prediction

Sayapin, Albert, Balitskiy, Gleb, Bershatsky, Daniel, Katrutsa, Aleksandr, Frolov, Evgeny, Frolov, Alexey, Oseledets, Ivan, Kharin, Vitaliy

arXiv.org Artificial IntelligenceFeb-5-2023

In this study, we propose a novel SeqMF model to solve the problem of predicting the next app launch during mobile device usage. Although this problem can be represented as a classical collaborative filtering problem, it requires proper modification since the data are sequential, the user feedback is distributed among devices and the transmission of users' data to aggregate common patterns must be protected against leakage. According to such requirements, we modify the structure of the classical matrix factorization model and update the training procedure to sequential learning. Since the data about user experience are distributed among devices, the federated learning setup is used to train the proposed sequential matrix factorization model. One more ingredient of the proposed approach is a new privacy mechanism that guarantees the protection of the sent data from the users to the remote server. To demonstrate the efficiency of the proposed model we use publicly available mobile user behavior data. We compare our model with sequential rules and models based on the frequency of app launches. The comparison is conducted in static and dynamic environments. The static environment evaluates how our model processes sequential data compared to competitors. Therefore, the standard train-validation-test evaluation procedure is used. The dynamic environment emulates the real-world scenario, where users generate new data by running apps on devices, and evaluates our model in this case. Our experiments show that the proposed model provides comparable quality with other methods in the static environment. However, more importantly, our method achieves a better privacy-utility trade-off than competitors in the dynamic environment, which provides more accurate simulations of real-world usage.

artificial intelligence, machine learning, mechanism, (16 more...)

arXiv.org Artificial Intelligence

2303.04744

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Machine learning methods for prediction of breakthrough curves in reactive porous media

Fokina, Daria, Toktaliev, Pavel, Iliev, Oleg, Oseledets, Ivan

arXiv.org Artificial IntelligenceJan-12-2023

Reactive flows in porous media play an important role in our life and are crucial for many industrial, environmental and biomedical applications. Very often the concentration of the species at the inlet is known, and the so-called breakthrough curves, measured at the outlet, are the quantities which could be measured or computed numerically. The measurements and the simulations could be time-consuming and expensive, and machine learning and Big Data approaches can help to predict breakthrough curves at lower costs. Machine learning (ML) methods, such as Gaussian processes and fully-connected neural networks, and a tensor method, cross approximation, are well suited for predicting breakthrough curves. In this paper, we demonstrate their performance in the case of pore scale reactive flow in catalytic filters.

artificial intelligence, machine learning, reaction, (15 more...)

arXiv.org Artificial Intelligence

2301.04998

Country: Europe > Germany (0.16)

Genre: Research Report (0.50)

Industry: Energy > Oil & Gas > Upstream (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.55)

Add feedback

Mitigating Human and Computer Opinion Fraud via Contrastive Learning

Tukmacheva, Yuliya, Oseledets, Ivan, Frolov, Evgeny

arXiv.org Artificial IntelligenceJan-8-2023

These platforms collect data about both users' and items' attributes, as well as accumulate the ratings and feedback of products and services, to develop algorithms for significant enhancement of users' experience on the marketplace. These algorithms are capable of influencing the purchasing behavior of users by (1) offering them the selection of the most relevant personalized positions, (2) reducing the individual searching costs, and (3) alleviating the information asymmetry on large commercial platforms with homogeneous sellers and products through feedback mechanisms. Since recommender systems have the power to affect the marketing decisions of users, they have become an attractive target for ratings and reviews manipulations, also known as attacks. Specifically, these attacks are aimed at inflating/deflating the ranks and text reviews of certain product positions or at simply sabotaging the efficiency and credibility of the the commercial platform in general. The current study focuses on solving the task of filtering out the deceptive opinions and detecting anomalous behavior on a platform with text reviews. The emphasis on text reviews can be explained by the fact that texts are a more informative and a more reliable source of product's and seller's quality, than a star-rating system, which is easy to manipulate (see [19], [14], [27], [28]).

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2301.03025

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback

FMM-Net: neural network architecture based on the Fast Multipole Method

Sushnikova, Daria, Kharyuk, Pavel, Oseledets, Ivan

arXiv.org Artificial IntelligenceDec-25-2022

In this paper, we propose a new neural network architecture based on the H2 matrix. Even though networks with H2-inspired architecture already exist, and our approach is designed to reduce memory costs and improve performance by taking into account the sparsity template of the H2 matrix. In numerical comparison with alternative neural networks, including the known H2-based ones, our architecture showed itself as beneficial in terms of performance, memory, and scalability.

artificial intelligence, machine learning, matrix, (18 more...)

arXiv.org Artificial Intelligence

2212.12899

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Tensor-based Sequential Learning via Hankel Matrix Representation for Next Item Recommendations

Frolov, Evgeny, Oseledets, Ivan

arXiv.org Artificial IntelligenceDec-12-2022

Self-attentive transformer models have recently been shown to solve the next item recommendation task very efficiently. The learned attention weights capture sequential dynamics in user behavior and generalize well. Motivated by the special structure of learned parameter space, we question if it is possible to mimic it with an alternative and more lightweight approach. We develop a new tensor factorization-based model that ingrains the structural knowledge about sequential data within the learning process. We demonstrate how certain properties of a self-attention network can be reproduced with our approach based on special Hankel matrix representation. The resulting model has a shallow linear architecture and compares competitively to its neural counterpart.

data mining, machine learning, natural language, (14 more...)

arXiv.org Artificial Intelligence

2212.0572

Genre: Research Report (1.00)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

A case study of spatiotemporal forecasting techniques for weather forecasting

Sofi, Shakir Showkat, Oseledets, Ivan

arXiv.org Artificial IntelligenceSep-29-2022

The majority of real-world processes are spatiotemporal, and the data generated by them exhibits both spatial and temporal evolution. Weather is one of the most important processes that fall under this domain, and forecasting it has become a crucial part of our daily routine. Weather data analysis is considered the most complex and challenging task. Although numerical weather prediction models are currently state-of-the-art, they are resource intensive and time-consuming. Numerous studies have proposed time-series-based models as a viable alternative to numerical forecasts. Recent research has primarily focused on forecasting weather at a specific location. Therefore, models can only capture temporal correlations. This self-contained paper explores various methods for regional data-driven weather forecasting, i.e., forecasting over multiple latitude-longitude points to capture spatiotemporal correlations. The results showed that spatiotemporal prediction models reduced computational cost while improving accuracy; in particular, the proposed tensor train dynamic mode decomposition-based forecasting model has comparable accuracy to ConvLSTM without the need for training. We use the NASA POWER meteorological dataset to evaluate the models and compare them with the current state of the art.

data mining, machine learning, prediction, (20 more...)

arXiv.org Artificial Intelligence

2209.14782

Country:

Asia (1.00)
Europe (0.93)
North America > United States (0.66)

Genre: Research Report > New Finding (0.68)

Industry:

Energy > Renewable > Solar (0.68)
Government > Regional Government > North America Government > United States Government (0.34)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback

Extension of Dynamic Mode Decomposition for dynamic systems with incomplete information based on t-model of optimal prediction

Katrutsa, Aleksandr, Utyuzhnikov, Sergey, Oseledets, Ivan

arXiv.org Artificial IntelligenceFeb-23-2022

The Dynamic Mode Decomposition has proved to be a very efficient technique to study dynamic data. This is entirely a data-driven approach that extracts all necessary information from data snapshots which are commonly supposed to be sampled from measurement. The application of this approach becomes problematic if the available data is incomplete because some dimensions of smaller scale either missing or unmeasured. Such setting occurs very often in modeling complex dynamical systems such as power grids, in particular with reduced-order modeling. To take into account the effect of unresolved variables the optimal prediction approach based on the Mori-Zwanzig formalism can be applied to obtain the most expected prediction under existing uncertainties. This effectively leads to the development of a time-predictive model accounting for the impact of missing data. In the present paper we provide a detailed derivation of the considered method from the Liouville equation and finalize it with the optimization problem that defines the optimal transition operator corresponding to the observed data. In contrast to the existing approach, we consider a first-order approximation of the Mori-Zwanzig decomposition, state the corresponding optimization problem and solve it with the gradient-based optimization method. The gradient of the obtained objective function is computed precisely through the automatic differentiation technique. The numerical experiments illustrate that the considered approach gives practically the same dynamics as the exact Mori-Zwanzig decomposition, but is less computationally intensive.

artificial intelligence, machine learning, optimization problem, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.jcp.2023.111913

2202.11432

Country: Europe > Russia (0.28)

Genre: Research Report (0.82)

Industry: Energy (0.34)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Smoothed Embeddings for Certified Few-Shot Learning

Pautov, Mikhail, Kuznetsova, Olesya, Tursynbek, Nurislam, Petiushko, Aleksandr, Oseledets, Ivan

arXiv.org Artificial IntelligenceFeb-2-2022

Randomized smoothing is considered to be the state-of-the-art provable defense against adversarial perturbations. However, it heavily exploits the fact that classifiers map input objects to class probabilities and do not focus on the ones that learn a metric space in which classification is performed by computing distances to embeddings of classes prototypes. In this work, we extend randomized smoothing to few-shot learning models that map inputs to normalized embeddings. We provide analysis of Lipschitz continuity of such models and derive robustness certificate against $\ell_2$-bounded perturbations that may be useful in few-shot learning scenarios. Our theoretical results are confirmed by experiments on different datasets.

algorithm 1, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2202.01186

Country:

Europe > Russia (0.14)
Asia (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)

Add feedback

Few-Bit Backward: Quantized Gradients of Activation Functions for Memory Footprint Reduction

Novikov, Georgii, Bershatsky, Daniel, Gusak, Julia, Shonenkov, Alex, Dimitrov, Denis, Oseledets, Ivan

arXiv.org Artificial IntelligenceFeb-2-2022

Memory footprint is one of the main limiting factors for large neural network training. In backpropagation, one needs to store the input to each operation in the computational graph. Every modern neural network model has quite a few pointwise nonlinearities in its architecture, and such operation induces additional memory costs which -- as we show -- can be significantly reduced by quantization of the gradients. We propose a systematic approach to compute optimal quantization of the retained gradients of the pointwise nonlinear functions with only a few bits per each element. We show that such approximation can be achieved by computing optimal piecewise-constant approximation of the derivative of the activation function, which can be done by dynamic programming. The drop-in replacements are implemented for all popular nonlinearities and can be used in any existing pipeline. We confirm the memory reduction and the same convergence on several open benchmarks.

approximation, artificial intelligence, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2202.00441

Country: North America > United States > California (0.14)

Genre: Research Report (0.41)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Memory-Efficient Backpropagation through Large Linear Layers

Bershatsky, Daniel, Mikhalev, Aleksandr, Katrutsa, Alexandr, Gusak, Julia, Merkulov, Daniil, Oseledets, Ivan

arXiv.org Machine LearningFeb-2-2022

In modern neural networks like Transformers, linear layers require significant memory to store activations during backward pass. This study proposes a memory reduction approach to perform backpropagation through linear layers. Since the gradients of linear layers are computed by matrix multiplications, we consider methods for randomized matrix multiplications and demonstrate that they require less memory with a moderate decrease of the test accuracy. Also, we investigate the variance of the gradient estimate induced by the randomized matrix multiplication. We compare this variance with the variance coming from gradient estimation based on the batch of samples. We demonstrate the benefits of the proposed method on the fine-tuning of the pre-trained RoBERTa model on GLUE tasks.

artificial intelligence, machine learning, variance, (17 more...)

arXiv.org Machine Learning

2201.13195

Country:

Europe > Russia (0.14)
Asia (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback