AITopics | Rakhuba, Maxim

Rakhuba, Maxim


Knowledge Graph Completion with Mixed Geometry Tensor Factorization

Yusupov, Viacheslav, Rakhuba, Maxim, Frolov, Evgeny

arXiv.org Machine Learning, Apr-3-2025

Affiliations: HSE University; AIRI.

Proceedings of the 28th International Conference on Artificial Intelligence and Statistics (AISTATS) 2025, Mai Khao, Thailand.

Abstract

In this paper, we propose a new geometric approach for knowledge graph completion via low-rank tensor approximation. We augment a pretrained and well-established Euclidean model based on a Tucker tensor decomposition with a novel hyperbolic interaction term. This correction captures distributional properties of the data in a more nuanced way that is better aligned with real-world knowledge graphs. By combining the two geometries, our approach improves the expressivity of the resulting model, achieving new state-of-the-art link prediction accuracy with significantly fewer parameters than previous Euclidean and hyperbolic models.

1 INTRODUCTION

Most of the information in the world can be expressed in terms of entities and the relationships between them. This information is effectively represented in the form of a knowledge graph (d'Amato, 2021; Peng et al., 2023), which serves as a repository for storing various forms of relational data with their interconnections. Particular examples include storing user profiles on social networking platforms (Xu et al., 2018), organizing Internet resources and the links between them, and constructing knowledge bases that capture user preferences to enhance the functionality of recommender systems (Wang et al., 2019a; Guo et al., 2020). With the recent emergence of large language models (LLMs), knowledge graphs have become an essential tool for improving the consistency and trustworthiness of linguis[...]

Among notable examples of their application are fact checking (Pan et al., 2024), hallucination mitigation (Agrawal et al., 2023), retrieval-augmented generation (Lewis et al., 2020), and corpus generation for LLM pretraining (Agarwal et al., 2021). This utilization underscores the versatility and utility of knowledge graphs in managing complex datasets and facilitating the manipulation of interconnected information in various domains and downstream tasks.

On the other hand, knowledge graphs may present an incomplete view of the world. Relations can evolve over time and can be subject to errors, processing limitations, and gaps in available information.
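To make the Tucker-based Euclidean model mentioned in the abstract more concrete, here is a minimal sketch of a trilinear Tucker score for a (subject, relation, object) triple, used to rank candidate objects for link prediction. It assumes toy two-dimensional embeddings and a hand-picked core tensor; the names `tucker_score`, `entities`, and `relations` are hypothetical, and the hyperbolic interaction term the authors propose is not included.

```python
def tucker_score(core, subj, rel, obj):
    """Trilinear form: sum over a, b, c of core[a][b][c] * subj[a] * rel[b] * obj[c]."""
    return sum(
        core[a][b][c] * subj[a] * rel[b] * obj[c]
        for a in range(len(subj))
        for b in range(len(rel))
        for c in range(len(obj))
    )


# Hypothetical toy data (not from the paper): a 2 x 2 x 2 core tensor
# and two-dimensional entity and relation embeddings.
core = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
entities = {"paris": [1.0, 0.0], "france": [0.0, 1.0]}
relations = {"capital_of": [0.0, 1.0]}

# Score one triple.
score = tucker_score(
    core, entities["paris"], relations["capital_of"], entities["france"]
)

# Link prediction: pick the candidate object with the highest score.
best = max(
    entities,
    key=lambda name: tucker_score(
        core, entities["paris"], relations["capital_of"], entities[name]
    ),
)
print(score, best)
```

In a trained model the core tensor and embeddings would be learned from observed triples; the sketch only shows how a score is evaluated once they are given.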

large language model, machine learning, natural language, (18 more...)

arXiv: 2504.02589

Country: Asia > Thailand (0.24)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Group and Shuffle: Efficient Structured Orthogonal Parametrization

Gorbunov, Mikhail, Yudin, Nikolay, Soboleva, Vera, Alanov, Aibek, Naumov, Alexey, Rakhuba, Maxim

arXiv.org Artificial Intelligence, Jun-14-2024

The increasing size of neural networks has led to a growing demand for methods of efficient fine-tuning. Recently, an orthogonal fine-tuning paradigm was introduced that uses orthogonal matrices for adapting the weights of a pretrained model. In this paper, we introduce a new class of structured matrices, which unifies and generalizes structured classes from previous works. We examine properties of this class and build a structured orthogonal parametrization upon it. We then use this parametrization to modify the orthogonal fine-tuning framework, improving parameter and computational efficiency. We empirically validate our method on different domains, including adaptation of text-to-image diffusion models and downstream task fine-tuning in language modeling. Additionally, we adapt our construction for orthogonal convolutions and conduct experiments with 1-Lipschitz neural networks.
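As a rough illustration of why structured orthogonal parametrizations can be parameter-efficient, the sketch below builds a dense 4 x 4 orthogonal matrix as a product of two block-diagonal rotation matrices with a permutation in between. This is an assumption-laden toy, not the construction from the paper: the 2 x 2 rotation blocks, the particular permutation, and all function names are hypothetical choices made for the example.

```python
import math


def rotation_blocks(angles):
    """Block-diagonal matrix made of 2x2 rotations, one block per angle."""
    n = 2 * len(angles)
    m = [[0.0] * n for _ in range(n)]
    for k, t in enumerate(angles):
        i = 2 * k
        m[i][i], m[i][i + 1] = math.cos(t), -math.sin(t)
        m[i + 1][i], m[i + 1][i + 1] = math.sin(t), math.cos(t)
    return m


def permutation(perm):
    """Permutation matrix P with P[i][perm[i]] = 1."""
    n = len(perm)
    return [[1.0 if perm[i] == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(row) for row in zip(*a)]


# Four angles in total parametrize the whole 4x4 matrix.
left = rotation_blocks([0.3, -1.1])
right = rotation_blocks([0.7, 2.0])
shuffle = permutation([0, 2, 1, 3])  # mixes coordinates across the blocks

q = matmul(matmul(left, shuffle), right)

# Orthogonality check: Q^T Q should equal the identity up to rounding.
gram = matmul(transpose(q), q)
max_dev = max(
    abs(gram[i][j] - (1.0 if i == j else 0.0)) for i in range(4) for j in range(4)
)
# The permutation makes the product dense even though each factor is sparse.
dense = all(abs(x) > 1e-9 for row in q for x in row)
print(max_dev, dense)
```

Each factor is orthogonal, so the product is orthogonal by construction, and the permutation between the two block-diagonal factors is what lets every output coordinate depend on every input coordinate.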

artificial intelligence, machine learning, natural language, (20 more...)

arXiv: 2406.10019

Country: North America > United States (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Towards Practical Control of Singular Values of Convolutional Layers

Senderovich, Alexandra, Bulatova, Ekaterina, Obukhov, Anton, Rakhuba, Maxim

arXiv.org Artificial Intelligence, Nov-24-2022

In general, convolutional neural networks (CNNs) are easy to train, but their essential properties, such as generalization error and adversarial robustness, are hard to control. Recent research demonstrated that singular values of convolutional layers significantly affect such elusive properties and offered several methods for controlling them. Nevertheless, these methods present an intractable computational challenge or resort to coarse approximations. In this paper, we offer a principled approach to alleviating constraints of the prior art at the expense of an insignificant reduction in layer expressivity. Our method is based on the tensor-train decomposition; it retains control over the actual singular values of convolutional mappings while providing structurally sparse and hardware-friendly representation. We demonstrate the improved properties of modern CNNs with our method and analyze its impact on the model performance, calibration, and adversarial robustness.

artificial intelligence, machine learning, singular value, (17 more...)

arXiv: 2211.13771

Genre: Research Report > New Finding (0.87)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Spectral Tensor Train Parameterization of Deep Learning Layers

Obukhov, Anton, Rakhuba, Maxim, Liniger, Alexander, Huang, Zhiwu, Georgoulis, Stamatios, Dai, Dengxin, Van Gool, Luc

arXiv.org Machine Learning, Mar-6-2021

We study low-rank parameterizations of weight matrices with embedded spectral properties in the Deep Learning context. The low-rank property leads to parameter efficiency and permits taking computational shortcuts when computing mappings. Spectral properties are often subject to constraints in optimization problems, leading to better models and stability of optimization. We start by looking at the compact SVD parameterization of weight matrices and identifying redundancy sources in the parameterization. We further apply the Tensor Train (TT) decomposition to the compact SVD components, and propose a non-redundant differentiable parameterization of fixed TT-rank tensor manifolds, termed the Spectral Tensor Train Parameterization (STTP). We demonstrate the effects of neural network compression in the image classification setting and both compression and improved training stability in the generative adversarial training setting.

deep learning, neural network, parameterization, (18 more...)

arXiv: 2103.04217

Country:

Europe (0.93)
North America > United States > California (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

T-Basis: a Compact Representation for Neural Networks

Obukhov, Anton, Rakhuba, Maxim, Georgoulis, Stamatios, Kanakis, Menelaos, Dai, Dengxin, Van Gool, Luc

arXiv.org Machine Learning, Jul-13-2020

We introduce T-Basis, a novel concept for a compact representation of a set of tensors, each of an arbitrary shape, which is often seen in Neural Networks. Each of the tensors in the set is modeled using Tensor Rings, though the concept applies to other Tensor Networks. Owing its name to the T-shape of nodes in diagram notation of Tensor Rings, T-Basis is simply a list of equally shaped three-dimensional tensors, used to represent Tensor Ring nodes. Such a representation allows us to parameterize the tensor set with a small number of parameters (coefficients of the T-Basis tensors), scaling logarithmically with each tensor's size in the set and linearly with the dimensionality of T-Basis. We evaluate the proposed approach on the task of neural network compression and demonstrate that it reaches high compression rates at acceptable performance drops. Finally, we analyze memory and operation requirements of the compressed networks and conclude that T-Basis networks are equally well suited for training and inference in resource-constrained environments and for use on edge devices.
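For readers unfamiliar with Tensor Rings, the following minimal sketch shows how a single tensor element is evaluated from three-dimensional cores: it is the trace of a product of matrix slices, one slice per index. The toy cores, the rank, and the names `tr_element` and `matmul` are assumptions made for the example; the shared T-Basis from which the authors build the cores is not modeled here.

```python
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def tr_element(cores, index):
    """Tensor Ring element: trace(G1[i1] @ G2[i2] @ ... @ Gd[id]).

    cores[k][i] is the r x r matrix slice of the k-th core for index value i.
    """
    prod = cores[0][index[0]]
    for core, i in zip(cores[1:], index[1:]):
        prod = matmul(prod, core[i])
    return sum(prod[j][j] for j in range(len(prod)))


# Hypothetical toy cores with ring rank 2 and mode size 2:
# slice 0 is the identity, slice 1 swaps the two coordinates.
identity = [[1.0, 0.0], [0.0, 1.0]]
swap = [[0.0, 1.0], [1.0, 0.0]]
core = [identity, swap]
cores = [core, core, core]  # represents a 2 x 2 x 2 tensor

even = tr_element(cores, (1, 1, 0))  # two swaps cancel, trace of identity
odd = tr_element(cores, (1, 0, 0))   # one swap remains, zero trace

# Parameter count for a larger, hypothetical shape: d modes of size n, rank r.
d, n, r = 5, 4, 3
tr_params = d * r * n * r  # one r x n x r core per mode
full_size = n ** d         # entries in the full tensor
print(even, odd, tr_params, full_size)
```

The count at the end shows the source of the compression: the core storage grows linearly with the number of modes, while the full tensor grows exponentially.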

deep learning, neural network, t-basis, (17 more...)

arXiv: 2007.06631

Country:

Europe (1.00)
North America > United States > Minnesota (0.28)

Genre: Research Report > Promising Solution (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
