AITopics | Tabor, Jacek

Tabor, Jacek

LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters

Bałazy, Klaudia, Banaei, Mohammadreza, Aberer, Karl, Tabor, Jacek

arXiv.org Artificial Intelligence, May-27-2024

The recent trend in scaling language models has led to a growing demand for parameter-efficient tuning (PEFT) methods such as LoRA (Low-Rank Adaptation). LoRA consistently matches or surpasses the full fine-tuning baseline with fewer parameters. However, handling numerous task-specific or user-specific LoRA modules on top of a base model still presents significant storage challenges. To address this, we introduce LoRA-XS (Low-Rank Adaptation with eXtremely Small number of parameters), a novel approach leveraging Singular Value Decomposition (SVD) for parameter-efficient fine-tuning. LoRA-XS introduces a small r x r weight matrix between frozen LoRA matrices, which are constructed by SVD of the original weight matrix. Training only r x r weight matrices ensures independence from model dimensions, enabling more parameter-efficient fine-tuning, especially for larger models. LoRA-XS achieves a remarkable reduction of trainable parameters by over 100x in 7B models compared to LoRA. Our benchmarking across various scales, including GLUE, GSM8k, and MATH benchmarks, shows that our approach outperforms LoRA and recent state-of-the-art approaches like VeRA in terms of parameter efficiency while maintaining competitive performance.
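The parameter savings described above follow from simple counting. A minimal sketch, assuming a single square weight matrix and comparing both methods at the same rank `r`; the function names and the hidden size of 4096 are illustrative assumptions, not values taken from the paper:

```python
def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """Standard LoRA trains two low-rank factors: (d_out x r) and (r x d_in)."""
    return r * (d_in + d_out)


def lora_xs_trainable_params(r: int) -> int:
    """LoRA-XS trains only the r x r matrix placed between the frozen factors."""
    return r * r


# Hypothetical square projection layer; 4096 is an illustrative hidden size.
d_in = d_out = 4096
for r in (4, 16, 64):
    lora = lora_trainable_params(d_in, d_out, r)
    xs = lora_xs_trainable_params(r)
    print(f"r={r:>2}: LoRA={lora:>7}  LoRA-XS={xs:>5}  ratio={lora / xs:.0f}x")
```

Because the LoRA-XS count has no `d_in` or `d_out` term, the per-matrix ratio `(d_in + d_out) / r` grows with layer width, which is the independence from model dimensions that the abstract refers to.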

large language model, machine learning, natural language, (16 more...)

arXiv:2405.17604

Country: Europe (0.14)

Genre: Research Report > Promising Solution (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

LucidPPN: Unambiguous Prototypical Parts Network for User-centric Interpretable Computer Vision

Pach, Mateusz, Rymarczyk, Dawid, Lewandowska, Koryna, Tabor, Jacek, Zieliński, Bartosz

arXiv.org Artificial Intelligence, May-23-2024

Prototypical parts networks combine the power of deep learning with the explainability of case-based reasoning to make accurate, interpretable decisions. They follow "this looks like that" reasoning, representing each prototypical part with patches from training images. However, a single image patch comprises multiple visual features, such as color, shape, and texture, making it difficult for users to identify which feature is important to the model. To reduce this ambiguity, we introduce the Lucid Prototypical Parts Network (LucidPPN), a novel prototypical parts network that separates color prototypes from other visual features. Our method employs two reasoning branches: one for non-color visual features, processing grayscale images, and another focusing solely on color information. This separation allows us to clarify whether the model's decisions are based on color, shape, or texture. Additionally, LucidPPN identifies prototypical parts corresponding to semantic parts of classified objects, making comparisons between data classes more intuitive, e.g., when two bird species might differ primarily in belly color. Our experiments demonstrate that the two branches are complementary and together achieve results comparable to baseline methods. More importantly, LucidPPN generates less ambiguous prototypical parts, enhancing user understanding.

artificial intelligence, image understanding, machine learning, (21 more...)

arXiv:2405.14331

Country:

Europe > Poland (0.15)
Europe > Portugal (0.14)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area (0.46)
Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

ProPML: Probability Partial Multi-label Learning

Struski, Łukasz, Pardyl, Adam, Tabor, Jacek, Zieliński, Bartosz

arXiv.org Artificial Intelligence, Mar-12-2024

Partial Multi-label Learning (PML) is a type of weakly supervised learning where each training instance corresponds to a set of candidate labels, among which only some are true. ProPML outperforms existing approaches, especially for high noise in a candidate set. Deep neural networks are highly effective in many practical applications. However, their success is heavily dependent on the availability of a large dataset with accurate labeling. Obtaining such datasets is challenging due to the cost and ...

[Figure 1: In partial multiple-label learning, each training instance corresponds to a set of candidate labels.]

artificial intelligence, inductive learning, machine learning, (17 more...)

arXiv:2403.07603

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Augmentation-aware Self-supervised Learning with Conditioned Projector

Przewięźlikowski, Marcin, Pyla, Mateusz, Zieliński, Bartosz, Twardowski, Bartłomiej, Tabor, Jacek, Śmieja, Marek

arXiv.org Artificial Intelligence, Dec-2-2023

Self-supervised learning (SSL) is a powerful technique for learning robust representations from unlabeled data. By learning to remain invariant to applied data augmentations, methods such as SimCLR and MoCo are able to reach quality on par with supervised approaches. However, this invariance may be harmful to solving some downstream tasks which depend on traits affected by augmentations used during pretraining, such as color. In this paper, we propose to foster sensitivity to such characteristics in the representation space by modifying the projector network, a common component of self-supervised architectures. Specifically, we supplement the projector with information about augmentations applied to images. In order for the projector to take advantage of this auxiliary conditioning when solving the SSL task, the feature extractor learns to preserve the augmentation information in its representations. Our approach, coined Conditional Augmentation-aware Self-supervised Learning (CASSLE), is directly applicable to typical joint-embedding SSL methods regardless of their objective functions. Moreover, it does not require major changes in the network architecture or prior knowledge of downstream tasks. In addition to an analysis of sensitivity towards different data augmentations, we conduct a series of experiments, which show that CASSLE improves over various SSL methods, reaching state-of-the-art performance in multiple downstream tasks.
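One way to picture the conditioned projector is as an MLP that receives the image representation together with a vector describing the applied augmentation. The sketch below is a toy illustration in plain Python; the concatenation operator, the layer sizes, and the three augmentation parameters are assumptions chosen for illustration, not details given in the abstract:

```python
import random


def make_linear(n_in, n_out, rng):
    """Randomly initialised weight matrix (n_out x n_in) and zero bias."""
    weight = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weight, [0.0] * n_out


def linear(x, layer):
    """Apply one affine layer to the vector x."""
    weight, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def conditioned_projector(features, aug_params, hidden_layer, output_layer):
    """Two-layer MLP projector fed with features AND augmentation parameters."""
    # Assumed conditioning operator: plain concatenation of the two vectors.
    x = list(features) + list(aug_params)
    hidden = [max(0.0, h) for h in linear(x, hidden_layer)]  # ReLU
    return linear(hidden, output_layer)


rng = random.Random(0)
feature_dim, aug_dim, hidden_dim, out_dim = 4, 3, 8, 2
hidden_layer = make_linear(feature_dim + aug_dim, hidden_dim, rng)
output_layer = make_linear(hidden_dim, out_dim, rng)

features = [0.2, -1.0, 0.5, 0.3]  # stand-in for the feature extractor output
aug_params = [0.8, 1.0, 0.4]      # hypothetical: crop scale, flip flag, jitter strength
embedding = conditioned_projector(features, aug_params, hidden_layer, output_layer)
print(len(embedding))
```

In this sketch only the projector sees `aug_params`; the feature extractor is untouched, in line with the abstract's remark that the approach does not require major architectural changes.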

artificial intelligence, machine learning, result, (18 more...)

arXiv:2306.06082

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Fantastic Weights and How to Find Them: Where to Prune in Dynamic Sparse Training

Nowak, Aleksandra I., Grooten, Bram, Mocanu, Decebal Constantin, Tabor, Jacek

arXiv.org Machine Learning, Nov-29-2023

Dynamic Sparse Training (DST) is a rapidly evolving area of research that seeks to optimize the sparse initialization of a neural network by adapting its topology during training. It has been shown that under specific conditions, DST is able to outperform dense models. The key components of this framework are the pruning and growing criteria, which are repeatedly applied during the training process to adjust the network's sparse connectivity. While the growing criterion's impact on DST performance is relatively well studied, the influence of the pruning criterion remains overlooked. To address this issue, we design and perform an extensive empirical analysis of various pruning criteria to better understand their impact on the dynamics of DST solutions. Surprisingly, we find that most of the studied methods yield similar results. The differences become more significant in the low-density regime, where the best performance is predominantly given by the simplest technique: magnitude-based pruning.
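A single prune-and-grow step with magnitude-based pruning can be sketched as follows. This is a minimal illustration on a flat list of weights, not the authors' code; the random growing criterion, the zero initialisation of regrown connections, and all function names are assumptions:

```python
import random


def magnitude_prune(weights, mask, fraction):
    """Deactivate the `fraction` of active connections with the smallest |weight|."""
    active = [i for i, is_on in enumerate(mask) if is_on]
    n_prune = int(len(active) * fraction)
    pruned = sorted(active, key=lambda i: abs(weights[i]))[:n_prune]
    new_mask = list(mask)
    for i in pruned:
        new_mask[i] = False
    return new_mask, pruned


def random_grow(weights, mask, n_grow, rng, exclude=()):
    """Activate `n_grow` inactive connections chosen uniformly at random."""
    blocked = set(exclude)
    candidates = [i for i, is_on in enumerate(mask) if not is_on and i not in blocked]
    grown = rng.sample(candidates, n_grow)
    new_mask, new_weights = list(mask), list(weights)
    for i in grown:
        new_mask[i] = True
        new_weights[i] = 0.0  # assumed: newly grown connections start at zero
    return new_weights, new_mask, grown


weights = [0.9, -0.05, 0.4, 0.0, -0.7, 0.0, 0.02, 0.0]
mask = [True, True, True, False, True, False, True, False]

mask_after_prune, pruned = magnitude_prune(weights, mask, fraction=0.4)
weights, mask_after_grow, grown = random_grow(
    weights, mask_after_prune, len(pruned), random.Random(0), exclude=pruned
)
print(pruned)                # [6, 1]
print(sum(mask_after_grow))  # 5
```

Growing as many connections as were pruned keeps the number of active weights constant, so the network's density is unchanged while its topology adapts.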

artificial intelligence, machine learning, pruning criteria, (18 more...)

arXiv:2306.12230

Country: Europe (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.67)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.93)

LocoGAN -- Locally Convolutional GAN

Struski, Łukasz, Knop, Szymon, Tabor, Jacek, Daniec, Wiktor, Spurek, Przemysław

arXiv.org Artificial Intelligence, Nov-2-2023

In the paper we construct a fully convolutional GAN model: LocoGAN, whose latent space is given by noise-like images of possibly different resolutions. The learning is local, i.e. we process not the whole noise-like image, but the subimages of a fixed size. We add extra channels with spatial information to the input noise images. Such architecture and design of latent space allows us to use an input of various dimensions. We use that to train our model only on parts of the latent image, see Figure 1. We call this approach local learning. Section 3 contains the detailed ...
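The two ideas mentioned in this entry, extra channels carrying spatial information and training on fixed-size subimages, can be sketched in plain Python. Encoding position as normalised row and column coordinates in [-1, 1] is an assumption made for illustration, as are the function names and sizes:

```python
import random


def noise_with_coords(height, width, noise_channels, rng):
    """Return a [C + 2][H][W] nested list: noise channels plus row/column coordinates."""
    noise = [
        [[rng.gauss(0.0, 1.0) for _ in range(width)] for _ in range(height)]
        for _ in range(noise_channels)
    ]
    # Assumed spatial encoding: coordinates scaled linearly to [-1, 1].
    rows = [[2.0 * y / (height - 1) - 1.0 for _ in range(width)] for y in range(height)]
    cols = [[2.0 * x / (width - 1) - 1.0 for x in range(width)] for _ in range(height)]
    return noise + [rows, cols]


def crop(image, top, left, size):
    """Cut the same size x size window out of every channel."""
    return [[row[left:left + size] for row in channel[top:top + size]] for channel in image]


latent = noise_with_coords(height=8, width=8, noise_channels=2, rng=random.Random(0))
patch = crop(latent, top=2, left=3, size=4)
print(len(patch), len(patch[0]), len(patch[0][0]))
```

Because the coordinate channels are cropped together with the noise, each patch still records where in the full latent image it came from.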

artificial intelligence, machine learning, resolution, (17 more...)

arXiv:2002.07897

Country: Europe > Poland (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Vision (0.95)

Face Identity-Aware Disentanglement in StyleGAN

Suwała, Adrian, Wójcik, Bartosz, Proszewska, Magdalena, Tabor, Jacek, Spurek, Przemysław, Śmieja, Marek

arXiv.org Artificial Intelligence, Sep-21-2023

Conditional GANs are frequently used for manipulating the attributes of face images, such as expression, hairstyle, pose, or age. Even though the state-of-the-art models successfully modify the requested attributes, they simultaneously modify other important characteristics of the image, such as a person's identity. In this paper, we focus on solving this problem by introducing PluGeN4Faces, a plugin to StyleGAN, which explicitly disentangles face attributes from a person's identity. Our key idea is to perform training on images retrieved from movie frames, where a given person appears in various poses and with different attributes. By applying a type of contrastive loss, we encourage the model to group images of the same person in similar regions of latent space. Our experiments demonstrate that the modifications of face attributes performed by PluGeN4Faces are significantly less invasive on the remaining characteristics of the image than in the existing state-of-the-art models.
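The abstract does not specify which contrastive loss is used. As a generic illustration only, the classic pairwise margin-based form pulls latent codes of the same person together and pushes codes of different people at least `margin` apart; the function name and the default margin are assumptions:

```python
import math


def contrastive_loss(z_a, z_b, same_identity, margin=1.0):
    """Pairwise contrastive loss on two latent codes (generic form, not the paper's)."""
    distance = math.dist(z_a, z_b)  # Euclidean distance between the codes
    if same_identity:
        return distance ** 2                   # pull matching pairs together
    return max(0.0, margin - distance) ** 2    # push non-matching pairs apart


print(contrastive_loss([0.0, 0.0], [3.0, 4.0], same_identity=True))   # 25.0
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], same_identity=False))  # 0.0
print(contrastive_loss([0.0, 0.0], [0.5, 0.0], same_identity=False))  # 0.25
```

Pairs of different people that are already farther apart than the margin contribute nothing, so the loss concentrates on identities that are confused in latent space.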

artificial intelligence, machine learning, plugen4face, (18 more...)

arXiv:2309.12033

Genre: Research Report (0.90)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Vision (0.90)

Hypernetwork approach to Bayesian MAML

Borycki, Piotr, Kubacki, Piotr, Przewięźlikowski, Marcin, Kuśmierczyk, Tomasz, Tabor, Jacek, Spurek, Przemysław

arXiv.org Artificial Intelligence, Aug-30-2023

The main goal of Few-Shot learning algorithms is to enable learning from small amounts of data. One of the most popular and elegant Few-Shot learning approaches is Model-Agnostic Meta-Learning (MAML). The main idea behind this method is to learn the shared universal weights of a meta-model, which are then adapted for specific tasks. However, the method suffers from over-fitting and poorly quantifies uncertainty due to limited data size. Bayesian approaches could, in principle, alleviate these shortcomings by learning weight distributions in place of point-wise weights. Unfortunately, previous modifications of MAML are limited due to the simplicity of Gaussian posteriors, MAML-like gradient-based weight updates, or by the same structure enforced for universal and adapted weights. In this paper, we propose a novel framework for Bayesian MAML called BayesianHMAML, which employs Hypernetworks for weight updates. It learns the universal weights point-wise, but a probabilistic structure is added when adapted for specific tasks. In such a framework, we can use simple Gaussian distributions or more complicated posteriors induced by Continuous Normalizing Flows.

artificial intelligence, bayesianhmaml, machine learning, (14 more...)

arXiv:2210.02796

Country: Europe > Poland (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Interpretability Benchmark for Evaluating Spatial Misalignment of Prototypical Parts Explanations

Sacha, Mikołaj, Jura, Bartosz, Rymarczyk, Dawid, Struski, Łukasz, Tabor, Jacek, Zieliński, Bartosz

arXiv.org Artificial Intelligence, Aug-16-2023

Prototypical parts-based networks are becoming increasingly popular due to their faithful self-explanations. However, their similarity maps are calculated in the penultimate network layer. Therefore, the receptive field of the prototype activation region often depends on parts of the image outside this region, which can lead to misleading interpretations. We name this undesired behavior a spatial explanation misalignment and introduce an interpretability benchmark with a set of dedicated metrics for quantifying this phenomenon. In addition, we propose a method for misalignment compensation and apply it to existing state-of-the-art models. We show the expressiveness of our benchmark and the effectiveness of the proposed compensation methodology through extensive empirical studies.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv:2308.08162

Country: Europe (0.67)

Genre: Research Report (0.84)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Vision (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

ChiENN: Embracing Molecular Chirality with Graph Neural Networks

Gaiński, Piotr, Koziarski, Michał, Tabor, Jacek, Śmieja, Marek

arXiv.org Artificial Intelligence, Jul-10-2023

Graph Neural Networks (GNNs) play a fundamental role in many deep learning problems, in particular in cheminformatics. However, typical GNNs cannot capture the concept of chirality, which means they do not distinguish between the 3D graph of a chemical compound and its mirror image (enantiomer). The ability to distinguish between enantiomers is especially important in drug discovery because enantiomers can have very distinct biochemical properties. In this paper, we propose a theoretically justified message-passing scheme, which makes GNNs sensitive to the order of node neighbors. We apply that general concept in the context of molecular chirality to construct the Chiral Edge Neural Network (ChiENN) layer, which can be appended to any GNN model to enable chirality-awareness. Our experiments show that adding ChiENN layers to a GNN outperforms current state-of-the-art methods in chiral-sensitive molecular property prediction tasks.
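To see why sensitivity to neighbor order matters, compare a permutation-invariant sum with an aggregator that combines consecutive neighbors around a cycle. This is a toy sketch on scalar features, not the ChiENN layer itself; the pairwise function `a * b**2` and the function names are assumptions picked only because they are asymmetric and easy to check by hand:

```python
def sum_aggregate(neighbors):
    """Standard permutation-invariant aggregation."""
    return sum(neighbors)


def ordered_aggregate(neighbors):
    """Aggregate consecutive neighbor pairs taken around a cycle.

    The pairwise function a * b**2 is asymmetric and non-linear, so the result
    is unchanged by cyclic shifts but changes when the order is reversed.
    """
    n = len(neighbors)
    return sum(neighbors[i] * neighbors[(i + 1) % n] ** 2 for i in range(n))


clockwise = [1, 2, 3]
shifted = [2, 3, 1]   # same cyclic order, different starting neighbor
mirrored = [3, 2, 1]  # reversed order, as in a mirror image

print(sum_aggregate(clockwise), sum_aggregate(mirrored))         # 6 6
print(ordered_aggregate(clockwise), ordered_aggregate(shifted))  # 25 25
print(ordered_aggregate(mirrored))                               # 23
```

The plain sum cannot tell the two orderings apart, while the ordered aggregator gives the same value for any starting neighbor yet a different value for the mirrored arrangement.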

artificial intelligence, chirality, machine learning, (17 more...)

arXiv:2307.02198

Country:

North America > United States (0.46)
North America > Canada (0.28)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
