Terekhov, Mikhail
Regret-Optimized Portfolio Enhancement through Deep Reinforcement Learning and Future Looking Rewards
Karzanov, Daniil, Garzón, Rubén, Terekhov, Mikhail, Gulcehre, Caglar, Raffinot, Thomas, Detyniecki, Marcin
This paper introduces a novel agent-based approach for enhancing existing portfolio strategies using Proximal Policy Optimization (PPO). Rather than focusing solely on traditional portfolio construction, our approach improves an already high-performing strategy through dynamic rebalancing driven by PPO and Oracle agents. We aim to enhance the traditional 60/40 benchmark (60% stocks, 40% bonds) by employing the Regret-based Sharpe reward function. To mitigate transaction-fee frictions and prevent signal loss, we develop a transaction cost scheduler. We introduce a future-looking reward function and train on synthetic data generated by a circular block bootstrap method to facilitate the learning of generalizable allocation strategies. We focus on two key evaluation measures: return and maximum drawdown. Given the high stochasticity of financial markets, we train 20 independent agents each period and evaluate their average performance against the benchmark. Our method not only enhances the performance of the existing portfolio strategy through strategic rebalancing but also performs strongly against other baselines.
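As a rough illustration (not code from the paper), the circular block bootstrap used to generate synthetic training data can be sketched in a few lines of NumPy; the block length and the placeholder return series below are arbitrary assumptions:

```python
import numpy as np

def circular_block_bootstrap(returns: np.ndarray, block_len: int,
                             n_samples: int, rng: np.random.Generator) -> np.ndarray:
    """Resample a return series in contiguous blocks, wrapping around the end."""
    T = len(returns)
    n_blocks = int(np.ceil(n_samples / block_len))
    starts = rng.integers(0, T, size=n_blocks)
    # Each block is block_len consecutive indices, taken modulo T (circular wrap).
    idx = (starts[:, None] + np.arange(block_len)[None, :]) % T
    return returns[idx].reshape(-1)[:n_samples]

rng = np.random.default_rng(0)
hist = rng.normal(0.0003, 0.01, size=2520)  # placeholder daily returns, not real data
synthetic = circular_block_bootstrap(hist, block_len=21, n_samples=2520, rng=rng)
```

Block resampling of this kind preserves short-range temporal dependence within each block while still producing many distinct synthetic paths for training.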
Contextual Bandit Optimization with Pre-Trained Neural Networks
Terekhov, Mikhail
Bandit optimization is a difficult problem, especially if the reward model is high-dimensional. When rewards are modeled by neural networks, sublinear regret has only been shown under strong assumptions, usually when the network is extremely wide. In this thesis, we investigate how pre-training can help us in the regime of smaller models. We consider a stochastic contextual bandit with the rewards modeled by a multi-layer neural network. The last layer is a linear predictor, and the layers before it are a black-box neural architecture, which we call a representation network. We model pre-training as an initial guess of the weights of the representation network provided to the learner. To leverage the pre-trained weights, we introduce a novel algorithm we call Explore Twice then Commit (E2TC). During its two stages of exploration, the algorithm first estimates the last layer's weights using Ridge regression, and then runs Stochastic Gradient Descent jointly on all the weights. For a locally convex loss function, we provide conditions on the pre-trained weights under which the algorithm can learn efficiently. Under these conditions, we show sublinear regret of E2TC when the dimension of the last layer and the number of actions $K$ are much smaller than the horizon $T$. In the weak training regime, when only the last layer is learned, the problem reduces to a misspecified linear bandit. We introduce a measure of misspecification $\epsilon_0$ for this bandit and use it to provide bounds $O(\epsilon_0\sqrt{d}KT+(KT)^{4/5})$ or $\tilde{O}(\epsilon_0\sqrt{d}KT+d^{1/3}(KT)^{2/3})$ on the regret, depending on regularization strength. The first of these bounds has a dimension-independent sublinear term, made possible by the stochasticity of contexts. We also run experiments to evaluate the regret of E2TC and the sample complexity of its exploration in practice.
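A minimal sketch of E2TC's two exploration stages, assuming a pre-trained representation network `rep_net(contexts, actions)` and a squared loss (both illustrative choices, not the thesis's exact setup):

```python
import torch

def e2tc_exploration(rep_net, contexts, actions, rewards,
                     lam=1.0, sgd_steps=500, lr=1e-3):
    # Stage 1: Ridge regression for the last (linear) layer on frozen,
    # pre-trained representations.
    with torch.no_grad():
        phi = rep_net(contexts, actions)            # (n, d) features
    d = phi.shape[1]
    A = phi.T @ phi + lam * torch.eye(d)
    theta = torch.linalg.solve(A, phi.T @ rewards)  # last-layer estimate

    # Stage 2: SGD jointly on the representation network and the last layer.
    head = torch.nn.Linear(d, 1, bias=False)
    head.weight.data = theta.unsqueeze(0)
    opt = torch.optim.SGD(list(rep_net.parameters()) + list(head.parameters()), lr=lr)
    for _ in range(sgd_steps):
        pred = head(rep_net(contexts, actions)).squeeze(-1)
        loss = torch.nn.functional.mse_loss(pred, rewards)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return rep_net, head  # commit: act greedily with the fitted model
```

The split mirrors the analysis: stage one alone corresponds to the weak training regime (a misspecified linear bandit), while stage two refines the representation itself.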
Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders
Surkov, Viacheslav, Wendler, Chris, Terekhov, Mikhail, Deschenaux, Justin, West, Robert, Gulcehre, Caglar
Sparse autoencoders (SAEs) have become a core ingredient in the reverse engineering of large language models (LLMs). For LLMs, they have been shown to decompose intermediate representations, which are often not directly interpretable, into sparse sums of interpretable features, facilitating better control and subsequent analysis. However, similar analyses and approaches have been lacking for text-to-image models. We investigate the possibility of using SAEs to learn interpretable features for few-step text-to-image diffusion models, such as SDXL Turbo. To this end, we train SAEs on the updates performed by transformer blocks within SDXL Turbo's denoising U-net. We find that their learned features are interpretable, causally influence the generation process, and reveal specialization among the blocks. In particular, we find one block that deals mainly with image composition, one that is mainly responsible for adding local details, and one for color, illumination, and style. Our work is thus an important first step towards better understanding the internals of generative text-to-image models like SDXL Turbo, and it showcases the potential of features learned by SAEs for the visual domain. Code is available at https://github.com/surkovv/sdxl-unbox
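For readers unfamiliar with SAEs, a minimal sketch of the kind of sparse autoencoder one could train on transformer-block updates follows; the architecture details and the L1 penalty coefficient are illustrative assumptions, and the actual training code is in the linked repository:

```python
import torch
import torch.nn.functional as F

class SparseAutoencoder(torch.nn.Module):
    """Overcomplete SAE: encodes a block update into a sparse feature vector."""
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.enc = torch.nn.Linear(d_model, d_hidden)
        self.dec = torch.nn.Linear(d_hidden, d_model, bias=False)

    def forward(self, x):
        z = F.relu(self.enc(x))      # sparse, non-negative feature activations
        return self.dec(z), z

def sae_loss(x, x_hat, z, l1_coeff=1e-3):
    # Reconstruction error plus an L1 sparsity penalty on the code.
    return F.mse_loss(x_hat, x) + l1_coeff * z.abs().mean()
```

Here `x` would be the update a transformer block adds to its input (output minus input), collected over many generations; sparsity is what pushes individual hidden units towards interpretable, specialized features.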
Learning pure quantum states (almost) without regret
Lumbreras, Josep, Terekhov, Mikhail, Tomamichel, Marco
We initiate the study of quantum state tomography with minimal regret. A learner has sequential oracle access to an unknown pure quantum state, and in each round selects a pure probe state. Regret is incurred if the unknown state is measured orthogonal to this probe, and the learner's goal is to minimise the expected cumulative regret over $T$ rounds. The challenge is to find a balance between the most informative measurements and measurements incurring minimal regret. We show that the cumulative regret scales as $\Theta(\operatorname{polylog} T)$ using a new tomography algorithm based on a median-of-means least squares estimator. This algorithm employs measurements biased towards the unknown state and produces online estimates that are optimal (up to logarithmic terms) in the number of observed samples.
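As a self-contained illustration of the median-of-means idea at the core of the estimator (a simplified scalar version, not the paper's full tomography procedure):

```python
import numpy as np

def median_of_means(samples: np.ndarray, n_groups: int) -> float:
    """Split samples into groups, average each group, return the median of the means."""
    groups = np.array_split(samples, n_groups)
    return float(np.median([g.mean() for g in groups]))

rng = np.random.default_rng(0)
heavy_tailed = rng.standard_t(df=2.5, size=10_000)  # synthetic heavy-tailed data
print(median_of_means(heavy_tailed, n_groups=20))   # robust estimate near 0
```

Unlike the plain sample mean, the median of group means is robust to heavy-tailed noise and outliers, which is what makes it attractive for building confidence bounds in sequential estimation.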