Kiselev, Igor
Yes, Q-learning Helps Offline In-Context RL
Tarasov, Denis, Nikulin, Alexander, Zisman, Ilya, Klepach, Albina, Polubarov, Andrei, Lyubaykin, Nikita, Derevyagin, Alexander, Kiselev, Igor, Kurenkov, Vladislav
In this work, we explore the integration of Reinforcement Learning (RL) approaches within a scalable offline In-Context RL (ICRL) framework. Through experiments across more than 150 datasets derived from GridWorld and MuJoCo environments, we demonstrate that optimizing RL objectives improves performance by approximately 40% on average compared to the well-established Algorithm Distillation (AD) baseline across various dataset coverages, structures, expertise levels, and environmental complexities. Our results also reveal that offline RL-based methods outperform online approaches, which are not specifically designed for offline scenarios. These findings underscore the importance of aligning learning objectives with RL's reward-maximization goal and demonstrate that offline RL is a promising direction for application in ICRL settings.
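A minimal sketch, assuming a PyTorch-style setup, of how an offline TD (Q-learning) objective could be combined with an AD-style behavior-cloning term on in-context trajectories. The architecture, loss weighting, and the omission of a target network are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch: offline TD loss plus an AD-style cloning term for a
# context-conditioned Q-network. Shapes and hyperparameters are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContextQNet(nn.Module):
    def __init__(self, state_dim: int, num_actions: int, hidden: int = 128):
        super().__init__()
        # Encode the in-context history of (state, one-hot action, reward) triples.
        self.encoder = nn.GRU(state_dim + num_actions + 1, hidden, batch_first=True)
        self.q_head = nn.Linear(hidden, num_actions)  # Q(s_t, .) given the context

    def forward(self, context: torch.Tensor) -> torch.Tensor:
        # context: (batch, time, state_dim + num_actions + 1)
        h, _ = self.encoder(context)
        return self.q_head(h)  # (batch, time, num_actions)

def icrl_loss(model, context, actions, rewards, dones, gamma=0.99, bc_weight=0.1):
    """TD error on offline transitions plus a small behavior-cloning (AD-style) term."""
    q = model(context)                                              # (B, T, A)
    q_taken = q.gather(-1, actions.unsqueeze(-1)).squeeze(-1)       # Q(s_t, a_t)
    with torch.no_grad():
        # No separate target network here, purely for brevity of the sketch.
        next_q = q[:, 1:].max(dim=-1).values                        # max_a Q(s_{t+1}, a)
        target = rewards[:, :-1] + gamma * (1.0 - dones[:, :-1]) * next_q
    td_loss = F.mse_loss(q_taken[:, :-1], target)
    bc_loss = F.cross_entropy(q.flatten(0, 1), actions.flatten())   # AD-style term
    return td_loss + bc_weight * bc_loss
```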
Latent Action Learning Requires Supervision in the Presence of Distractors
Nikulin, Alexander, Zisman, Ilya, Tarasov, Denis, Lyubaykin, Nikita, Polubarov, Andrei, Kiselev, Igor, Kurenkov, Vladislav
Recently, latent action learning, pioneered by Latent Action Policies (LAPO), has shown remarkable pre-training efficiency on observation-only data, offering potential for leveraging vast amounts of video available on the web for embodied AI. However, prior work has focused on distractor-free data, where changes between observations are primarily explained by ground-truth actions. Unfortunately, real-world videos contain action-correlated distractors that may hinder latent action learning. Using the Distracting Control Suite (DCS), we empirically investigate the effect of distractors on latent action learning and demonstrate that LAPO struggles in such scenarios. We propose LAOM, a simple LAPO modification that improves the quality of latent actions by 8x, as measured by linear probing. Importantly, we show that providing supervision with ground-truth actions, for as little as 2.5% of the full dataset, during latent action learning improves downstream performance by 4.2x on average. Our findings suggest that integrating supervision during Latent Action Model (LAM) training is critical in the presence of distractors, challenging the conventional pipeline of first learning a LAM and only then decoding from latent to ground-truth actions.
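A minimal sketch, under assumed architectures and loss weights, of a LAPO-style latent action model with an auxiliary supervised head applied only to the small labeled fraction of transitions. This is an illustration of the general idea, not the authors' LAOM implementation.

```python
# Hypothetical sketch: inverse/forward dynamics with latent actions, plus
# ground-truth action supervision on the labeled subset of the data.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentActionModel(nn.Module):
    def __init__(self, obs_dim: int, latent_dim: int, action_dim: int, hidden: int = 256):
        super().__init__()
        # Inverse dynamics: infer a latent action from consecutive observations.
        self.idm = nn.Sequential(nn.Linear(2 * obs_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, latent_dim))
        # Forward dynamics: predict the next observation from (obs, latent action).
        self.fdm = nn.Sequential(nn.Linear(obs_dim + latent_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, obs_dim))
        # Supervised head: decode latent actions to ground-truth actions where labels exist.
        self.action_head = nn.Linear(latent_dim, action_dim)

    def forward(self, obs, next_obs):
        z = self.idm(torch.cat([obs, next_obs], dim=-1))
        pred_next = self.fdm(torch.cat([obs, z], dim=-1))
        return z, pred_next

def lam_loss(model, obs, next_obs, gt_actions, labeled_mask, sup_weight=1.0):
    """Reconstruction loss on all transitions; action supervision only where labeled."""
    z, pred_next = model(obs, next_obs)
    recon = F.mse_loss(pred_next, next_obs)
    sup = torch.tensor(0.0, device=obs.device)
    if labeled_mask.any():
        sup = F.mse_loss(model.action_head(z[labeled_mask]), gt_actions[labeled_mask])
    return recon + sup_weight * sup
```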
Vikhr: The Family of Open-Source Instruction-Tuned Large Language Models for Russian
Nikolich, Aleksandr, Korolev, Konstantin, Shelmanov, Artem, Kiselev, Igor
There has been a surge in the development of various Large Language Models (LLMs). However, text generation for languages other than English often faces significant challenges, including poor generation quality and reduced computational performance due to the disproportionate representation of tokens in the model's vocabulary. In this work, we address these issues and introduce Vikhr, a new state-of-the-art open-source instruction-tuned LLM designed specifically for the Russian language. Unlike previous efforts for Russian that utilize computationally inexpensive LoRA adapters on top of English-oriented models, Vikhr features an adapted tokenizer vocabulary and undergoes continued pre-training and instruction tuning of all weights. This approach not only enhances the model's performance but also significantly improves its computational and contextual efficiency. The remarkable performance of Vikhr across various Russian-language benchmarks can also be attributed to our efforts in expanding instruction datasets and corpora for continued pre-training. Vikhr not only sets a new state of the art among open-source LLMs for Russian but also outperforms some proprietary closed-source models on certain benchmarks. The model weights, instruction sets, and code are publicly available.
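A brief sketch of why tokenizer adaptation matters for non-English efficiency: fewer tokens per Russian sentence means fewer decoding steps and more text per context window. The model identifiers below are placeholders, not the actual Vikhr repository names.

```python
# Hypothetical sketch: compare token counts for the same Russian sentence under an
# English-oriented tokenizer and a language-adapted one. Repository ids are placeholders.
from transformers import AutoTokenizer

russian_text = "Москва является крупнейшим городом России."  # sample Russian sentence

baseline = AutoTokenizer.from_pretrained("english-oriented-base-model")  # placeholder id
adapted = AutoTokenizer.from_pretrained("russian-adapted-model")         # placeholder id

n_base = len(baseline.encode(russian_text))
n_adapted = len(adapted.encode(russian_text))

# Fewer tokens per sentence => fewer generation steps and more effective context length.
print(f"baseline tokenizer: {n_base} tokens, adapted tokenizer: {n_adapted} tokens")
print(f"compression gain: {n_base / n_adapted:.2f}x")
```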
A Novel Single-DBN Generative Model for Optimizing POMDP Controllers by Probabilistic Inference
Kiselev, Igor (University of Waterloo) | Poupart, Pascal (University of Waterloo)
As a promising alternative to using standard (often intractable) planning techniques based on Bellman equations, we propose a method for optimizing POMDP controllers by probabilistic inference in a novel equivalent single-DBN generative model. Our inference approach to POMDP planning allows (1) for the application of various techniques for probabilistic inference in single graphical models, and (2) for exploiting the factored structure of a controller architecture to take advantage of the natural structural constraints of planning problems and represent them compactly. Our contributions can be summarized as follows: (1) we designed a novel single-DBN generative model that ensures the task of probabilistic inference is equivalent to the original problem of optimizing POMDP controllers, and (2) we developed several inference approaches to approximate the value of the policy when exact inference methods are not tractable for large problems with complex graphical models. The proposed approaches to policy optimization by probabilistic inference are evaluated on several POMDP benchmark problems, and the performance of the implemented approximation algorithms is compared.
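For context, a sketch of the standard planning-as-inference identity that this line of work builds on (the mixture-of-finite-horizon formulation); the paper's single-DBN construction encodes the same quantity in a single graphical model rather than a mixture.

```latex
% With a geometric prior P(T) = (1-\gamma)\gamma^T over the horizon, the discounted
% value of a policy is proportional to the expected reward emitted at the final step
% of the mixture model, so reward maximization becomes likelihood maximization that
% inference methods (e.g., EM) can perform.
\begin{equation}
  V^{\pi}
  = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t} r_{t}\right]
  = \frac{1}{1-\gamma}\sum_{T=0}^{\infty} (1-\gamma)\,\gamma^{T}\,
    \mathbb{E}_{\pi}\!\left[r_{T}\right].
\end{equation}
```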