Collaborating Authors

Zaytsev, Alexey


When an LLM is apprehensive about its answers -- and when its uncertainty is justified

arXiv.org Artificial Intelligence

Uncertainty estimation is crucial for evaluating Large Language Models (LLMs), particularly in high-stakes domains where incorrect answers result in significant consequences. Numerous approaches address this problem, but each typically focuses on a specific type of uncertainty while ignoring others. We investigate which estimates, specifically token-wise entropy and model-as-judge (MASJ), work for multiple-choice question-answering tasks across different question topics. Our experiments consider three LLM families, Phi-4, Mistral, and Qwen, with sizes ranging from 1.5B to 72B parameters, and $14$ topics. While MASJ performs similarly to a random error predictor, the response entropy predicts model error in knowledge-dependent domains and serves as an effective indicator of question difficulty: for biology the ROC AUC is $0.73$. This correlation vanishes in the reasoning-dependent domain: for math questions the ROC AUC is $0.55$. More fundamentally, we found that the predictive power of the entropy measure depends on the amount of reasoning a question requires. Thus, entropy, which reflects data-related uncertainty, should be integrated into uncertainty-estimation frameworks, while MASJ requires refinement. Moreover, existing MMLU-Pro samples are biased: the amount of reasoning required should be balanced across subdomains to provide a fairer assessment of LLM performance.
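To make the token-wise entropy estimate concrete, here is a minimal sketch, assuming a Hugging Face causal LM; the model name and the averaging over answer tokens are illustrative choices, not the paper's exact protocol:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative model choice; the paper evaluates Phi-4, Mistral, and Qwen.
tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct")

def mean_token_entropy(prompt: str, answer: str) -> float:
    """Average entropy of the next-token distribution over the answer tokens."""
    full = tok(prompt + answer, return_tensors="pt")
    n_prompt = tok(prompt, return_tensors="pt")["input_ids"].shape[1]
    with torch.no_grad():
        logits = model(**full).logits  # (1, seq_len, vocab)
    # The distribution predicting token t sits at position t - 1.
    probs = torch.softmax(logits[0, n_prompt - 1 : -1], dim=-1)
    entropy = -(probs * torch.log(probs.clamp_min(1e-12))).sum(-1)
    return entropy.mean().item()
```

A higher average entropy flags questions on which the model is less certain, which the abstract reports as predictive of errors in knowledge-dependent domains such as biology.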


Concealed Adversarial attacks on neural networks for sequential data

arXiv.org Artificial Intelligence

The emergence of deep learning has led to the broad use of neural networks in the time series domain for various applications, including finance and medicine. While powerful, these models are prone to adversarial attacks: a seemingly benign, targeted perturbation of the input data leads to significant changes in a classifier's output. However, formally small attacks in the time series domain are easily detected by the human eye or by a simple detector model. We develop a concealed adversarial attack for different time-series models: it produces more realistic perturbations that are hard to detect by a human or a model discriminator. To achieve this goal, the proposed adversarial attack maximizes an aggregation of the classifier loss and the loss of a trained discriminator. To make the attack stronger, we also propose a training procedure for the discriminator that provides broader coverage of possible attacks. Extensive benchmarking on six UCR time series datasets across four diverse architectures - including recurrent, convolutional, state-space, and transformer-based models - demonstrates the superiority of our attack in the concealability-efficiency trade-off. Our findings highlight the growing challenge of designing robust time series models, emphasizing the need for improved defenses against realistic and effective attacks.
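The core optimization is easy to sketch. Below is a minimal, hypothetical PyTorch version, assuming a classifier `clf` and a trained discriminator `disc` that outputs a logit for "perturbed"; the loss weighting `alpha` and the projected-gradient loop are illustrative, not the paper's exact procedure:

```python
import torch
import torch.nn.functional as F

def concealed_attack(x, y, clf, disc, eps=0.1, alpha=1.0, lr=0.01, steps=50):
    """Perturb time series x so that clf misclassifies it while disc
    still judges it benign: jointly raise the classifier loss and the
    discriminator's detection loss."""
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        x_adv = x + delta
        # Classifier term: push the prediction away from the true label.
        attack_loss = -F.cross_entropy(clf(x_adv), y)
        # Concealment term: make the perturbed input look clean to the
        # discriminator (label 0 = "clean", by assumption).
        conceal_loss = F.binary_cross_entropy_with_logits(
            disc(x_adv).squeeze(-1), torch.zeros(x.shape[0], device=x.device))
        opt.zero_grad()
        (attack_loss + alpha * conceal_loss).backward()
        opt.step()
        delta.data.clamp_(-eps, eps)  # keep the perturbation formally small
    return (x + delta).detach()
```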


Looking around you: external information enhances representations for event sequences

arXiv.org Artificial Intelligence

Representation learning produces models in many domains, such as store purchases, client transactions, and general human behaviour. However, such models for sequential data usually process a single sequence, ignoring context from other relevant sequences; this is harmful in domains with rapidly changing external environments, like finance, and misguides predictions for users with no recent events. We are the first to propose a method that aggregates information from multiple user representations to augment the representation of a specific user in a scenario with multiple co-occurring event sequences. Our study considers diverse aggregation approaches, ranging from simple pooling techniques to trainable attention-based ones, in particular Kernel attention aggregation, which can capture more complex information flow from other users. The proposed method operates atop an existing encoder and supports its efficient fine-tuning. Across the considered datasets of financial transactions and downstream tasks, Kernel attention improves ROC AUC scores, both with and without fine-tuning, while mean pooling yields a smaller but still significant gain.
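As an illustration of the aggregation step, here is a minimal sketch of kernel-attention pooling over other users' embeddings; the Gaussian kernel and the residual combination are simplifying assumptions, not the paper's exact Kernel attention module:

```python
import torch

def kernel_attention_aggregate(target, others, bandwidth=1.0):
    """Augment a target user's embedding with a kernel-weighted
    average of other users' embeddings.

    target: (d,) embedding of the user of interest
    others: (n, d) embeddings of co-occurring users
    """
    # A Gaussian kernel on embedding distances plays the role of
    # attention scores: closer users contribute more.
    sq_dist = ((others - target) ** 2).sum(-1)
    weights = torch.softmax(-sq_dist / (2 * bandwidth ** 2), dim=0)
    context = (weights.unsqueeze(-1) * others).sum(0)
    # Residual combination keeps the user's own representation dominant.
    return target + context
```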


Foundation for unbiased cross-validation of spatio-temporal models for species distribution modeling

arXiv.org Artificial Intelligence

Species Distribution Models (SDMs) often suffer from spatial autocorrelation (SAC), leading to biased performance estimates. We tested cross-validation (CV) strategies - random splits, spatial blocking with varied distances, environmental (ENV) clustering, and a novel spatio-temporal method - under two proposed training schemes: LAST FOLD, widely used in spatial CV at the cost of data loss, and RETRAIN, which maximizes data usage but risks reintroducing SAC. LAST FOLD consistently yielded lower errors and stronger correlations. Spatial blocking at an optimal distance (SP 422) and ENV performed best, achieving Spearman and Pearson correlations of 0.485 and 0.548, respectively, although ENV may be unsuitable for long-term forecasts involving major environmental shifts. A spatio-temporal approach yielded modest benefits in our moderately variable dataset, but may excel with stronger temporal changes. These findings highlight the need to align CV approaches with the spatial and temporal structure of SDM data, ensuring rigorous validation and reliable predictive outcomes.
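For readers who want to reproduce the blocking idea, here is a minimal sketch of spatial-block cross-validation with scikit-learn; KMeans clustering of sample coordinates stands in for the paper's distance-based blocking, and the optimal SP 422 distance is not reproduced here:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.model_selection import GroupKFold

def spatial_block_cv(X, y, coords, model, n_blocks=10, n_splits=5):
    """Cross-validate with spatially blocked folds so that nearby,
    autocorrelated records never span the train/test boundary."""
    # Cluster sample locations into contiguous spatial blocks.
    blocks = KMeans(n_clusters=n_blocks, n_init=10,
                    random_state=0).fit_predict(coords)
    scores = []
    for train_idx, test_idx in GroupKFold(n_splits=n_splits).split(
            X, y, groups=blocks):
        model.fit(X[train_idx], y[train_idx])
        scores.append(model.score(X[test_idx], y[test_idx]))
    return np.mean(scores)
```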


Normalizing self-supervised learning for provably reliable Change Point Detection

arXiv.org Artificial Intelligence

Change point detection (CPD) methods aim to identify abrupt shifts in the distribution of input data streams. Accurate estimators for this task are crucial across various real-world scenarios. Yet traditional unsupervised CPD techniques face significant limitations, often relying on strong assumptions or suffering from low expressive power due to inherent model simplicity. In contrast, representation learning methods overcome these drawbacks by offering flexibility and the ability to capture the full complexity of the data without imposing restrictive assumptions. However, these approaches are still emerging in the CPD field and lack robust theoretical foundations to ensure their reliability. Our work addresses this gap by integrating the expressive power of representation learning with the groundedness of traditional CPD techniques. We adopt spectral normalization (SN) for deep representation learning in CPD tasks and prove that the embeddings after SN are highly informative for CPD. Our method significantly outperforms current state-of-the-art methods in a comprehensive evaluation on three standard CPD datasets.
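The SN step itself is simple to apply; here is a minimal sketch using PyTorch's built-in `spectral_norm` wrapper on a small encoder (the architecture is an illustrative placeholder, not the paper's model):

```python
import torch.nn as nn
from torch.nn.utils import spectral_norm

# Each linear map is constrained to spectral norm ~1, so the encoder is
# roughly 1-Lipschitz and distances between embeddings stay informative
# about distribution shifts in the input.
encoder = nn.Sequential(
    spectral_norm(nn.Linear(64, 128)),
    nn.ReLU(),
    spectral_norm(nn.Linear(128, 128)),
    nn.ReLU(),
    spectral_norm(nn.Linear(128, 32)),
)
```

With every layer so constrained, a CPD statistic can compare embeddings of adjacent windows and trust that a large embedding distance reflects a genuine change in the data rather than an artifact of the encoder.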


Collusion Detection with Graph Neural Networks

arXiv.org Machine Learning

Collusion is a complex phenomenon in which companies secretly collaborate to engage in fraudulent practices. This paper presents an innovative methodology for detecting and predicting collusion patterns in different national markets using neural networks (NNs) and graph neural networks (GNNs). GNNs are particularly well suited to this task because they can exploit the inherent network structures present in collusion and many other economic problems. Our approach consists of two phases: In Phase I, we develop and train models on individual market datasets from Japan, the United States, two regions in Switzerland, Italy, and Brazil, focusing on predicting collusion in single markets. In Phase II, we extend the models' applicability through zero-shot learning, employing a transfer learning approach that can detect collusion in markets in which training data is unavailable. This phase also incorporates out-of-distribution (OOD) generalization to evaluate the models' performance on unseen datasets from other countries and regions. In our empirical study, we show that GNNs outperform NNs in detecting complex collusive patterns. This research contributes to the ongoing discourse on preventing collusion and optimizing detection methodologies, providing valuable guidance on the use of NNs and GNNs in economic applications to enhance market fairness and economic welfare.
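A minimal sketch of the GNN side, assuming PyTorch Geometric, firms as nodes, market relations as edges, and a binary collusion label per node; the two-layer architecture and hidden size are illustrative choices, not the paper's exact model:

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class CollusionGNN(torch.nn.Module):
    """Two-layer GCN that classifies each firm (node) as colluding or not."""
    def __init__(self, n_features, hidden=64):
        super().__init__()
        self.conv1 = GCNConv(n_features, hidden)
        self.conv2 = GCNConv(hidden, 2)  # two classes: competitive / collusive

    def forward(self, x, edge_index):
        # Message passing lets a firm's prediction depend on its network
        # neighbourhood, which a plain feed-forward NN cannot exploit.
        h = F.relu(self.conv1(x, edge_index))
        return self.conv2(h, edge_index)
```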


Universal representations for financial transactional data: embracing local, global, and external contexts

arXiv.org Artificial Intelligence

Effective processing of financial transactions is essential for banking data analysis. However, most methods in this domain focus on specialized solutions to stand-alone problems instead of constructing universal representations suitable for many problems. We present a representation learning framework that addresses diverse business challenges. We also propose novel generative models that account for data specifics, and a way to integrate external information into a client's representation, leveraging insights from other customers' actions. Finally, we offer a benchmark that describes representation quality globally, with respect to the entire transaction history; locally, reflecting the client's current state; and dynamically, capturing the evolution of representations over time. Our generative approach demonstrates superior performance on local tasks, with an increase in ROC AUC of up to 14\% for the next-MCC prediction task and up to 46\% over existing contrastive baselines for downstream tasks. Incorporating external information improves the scores by an additional 20\%.
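As a concrete instance of the local next-MCC task mentioned above, here is a minimal sketch of an autoregressive sequence model; the GRU backbone and all sizes are illustrative assumptions, and the paper's generative models are more elaborate:

```python
import torch
import torch.nn as nn

class NextMCCModel(nn.Module):
    """Autoregressive model over a client's transaction sequence:
    embeds past MCC codes and predicts the next one at each step."""
    def __init__(self, n_mcc=400, emb=32, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(n_mcc, emb)
        self.rnn = nn.GRU(emb, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_mcc)

    def forward(self, mcc_seq):           # (batch, seq_len) of MCC ids
        h, _ = self.rnn(self.emb(mcc_seq))
        return self.head(h)               # next-MCC logits at each position
```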


Diversity-Aware Ensembling of Language Models Based on Topological Data Analysis

arXiv.org Artificial Intelligence

Ensembles are important tools for improving the performance of machine learning models. In natural language processing, ensembles boost performance because multiple large models are available in open source. However, existing approaches mostly rely on simple averaging of predictions with equal weights for each model, ignoring differences in the quality and conformity of the models. We propose to estimate weights for ensembles of NLP models using not only knowledge of their individual performance but also their similarity to each other. Adopting distance measures based on Topological Data Analysis (TDA) improves our ensemble in both text-classification accuracy and the quality of the associated uncertainty estimation.
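Here is a minimal sketch of the weighting idea, assuming per-model validation accuracies and a precomputed pairwise distance matrix; in the paper the distances come from TDA, while here the matrix is simply an input and the exponential reweighting is an illustrative choice:

```python
import numpy as np

def diversity_aware_weights(acc, dist, beta=1.0):
    """Weight each model by its own quality and by how dissimilar
    it is from the rest of the ensemble.

    acc:  (m,) validation accuracies
    dist: (m, m) pairwise model distances (e.g., TDA-based)
    """
    diversity = dist.sum(axis=1) / (len(acc) - 1)   # mean distance to others
    raw = acc * np.exp(beta * diversity)            # reward quality and diversity
    return raw / raw.sum()

def ensemble_predict(probs, weights):
    """probs: (m, n_samples, n_classes) per-model predicted probabilities."""
    return np.tensordot(weights, probs, axes=1)     # weighted average
```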


From Variability to Stability: Advancing RecSys Benchmarking Practices

arXiv.org Artificial Intelligence

In the rapidly evolving domain of Recommender Systems (RecSys), new algorithms frequently claim state-of-the-art performance based on evaluations over a limited set of arbitrarily selected datasets. However, this approach may fail to holistically reflect their effectiveness due to the significant impact of dataset characteristics on algorithm performance. Addressing this deficiency, this paper introduces a novel benchmarking methodology to facilitate a fair and robust comparison of RecSys algorithms, thereby advancing evaluation practices. By utilizing a diverse set of $30$ open datasets, including two introduced in this work, and evaluating $11$ collaborative filtering algorithms across $9$ metrics, we critically examine the influence of dataset characteristics on algorithm performance. We further investigate the feasibility of aggregating outcomes from multiple datasets into a unified ranking. Through rigorous experimental analysis, we validate the reliability of our methodology under the variability of datasets, offering a benchmarking strategy that balances quality and computational demands. This methodology enables a fair yet effective means of evaluating RecSys algorithms, providing valuable guidance for future research endeavors.
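One simple way to aggregate outcomes from multiple datasets into a unified ranking, as discussed above, is mean-rank aggregation; this is a minimal sketch, not necessarily the paper's final procedure:

```python
import pandas as pd

def aggregate_ranking(scores: pd.DataFrame) -> pd.Series:
    """scores: rows = datasets, columns = algorithms, values = metric
    (higher is better). Returns algorithms ordered by mean rank."""
    ranks = scores.rank(axis=1, ascending=False)  # rank within each dataset
    return ranks.mean(axis=0).sort_values()       # average across datasets
```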


Long-term drought prediction using deep neural networks based on geospatial weather data

arXiv.org Artificial Intelligence

The importance of monitoring and predicting droughts is underscored by their frequent occurrence in diverse geographical landscapes (Ghozat et al., 2023). Moreover, the likelihood of droughts is expected to increase in the context of global climate change (Xiujia et al., 2022). Their accurate forecasting, however, is a complex problem due to the inherent difficulty in predicting the onset, duration, and cessation of drought events (Mishra and Desai, 2005). This complexity necessitates the development of sophisticated forecasting models that can effectively navigate these challenges. To frame our problem, it is essential to define the prediction target and establish a suitable time horizon for forecasting (Zhang et al., 2019). Given our focus on long-term decision-making, we aim to generate forecasts that extend 12 months into the future. Selecting an appropriate target for drought prediction is further complicated by its dependence on multiple climatic factors, including temperature and precipitation. Among the various drought severity indices, the Standardized Precipitation Index (SPI) (McKee et al., 1993) and the Palmer Drought Severity Index (PDSI) (Alley, 1984) stand out as fundamental measures.
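The SPI mentioned above is straightforward to sketch: fit a gamma distribution to an aggregated precipitation series and map each value through the standard normal quantile function. The version below is minimal and ignores the zero-precipitation correction used in practice:

```python
import numpy as np
from scipy import stats

def spi(precip):
    """Standardized Precipitation Index for an aggregated precipitation
    series (e.g., 12-month rolling sums), ignoring zero-inflation."""
    # Fit a gamma distribution to the observed precipitation.
    a, loc, scale = stats.gamma.fit(precip, floc=0)
    # Map cumulative probabilities to standard normal quantiles.
    cdf = stats.gamma.cdf(precip, a, loc=loc, scale=scale)
    return stats.norm.ppf(np.clip(cdf, 1e-6, 1 - 1e-6))
```

Negative SPI values indicate drier-than-normal conditions, so a 12-month-ahead forecast of SPI directly supports the long-term decision-making horizon stated above.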