AITopics | Mareček, David

Collaborating Authors

Mareček, David

Dual Debiasing: Remove Stereotypes and Keep Factual Gender for Fair Language Modeling and Translation

Limisiewicz, Tomasz, Mareček, David, Musil, Tomáš

arXiv.org Artificial Intelligence, Jan-30-2025

Mitigation of biases, such as language models' reliance on gender stereotypes, is a crucial endeavor required for the creation of reliable and useful language technology. The crucial aspect of debiasing is to ensure that the models preserve their versatile capabilities, including their ability to solve language tasks and equitably represent various genders. To address this issue, we introduce a streamlined Dual Debiasing Algorithm through Model Adaptation (2DAMA). Novel Dual Debiasing enables robust reduction of stereotypical bias while preserving desired factual gender information encoded by language models. We show that 2DAMA effectively reduces gender bias in English and is one of the first approaches facilitating the mitigation of stereotypical tendencies in translation. The proposed method's key advantage is the preservation of factual gender cues, which are useful in a wide range of natural language processing tasks.

artificial intelligence, chatbot, natural language, (17 more...)

arXiv.org Artificial Intelligence

2501.1015

Country:

Europe (1.00)
Asia (0.67)
North America > United States > Washington > King County > Seattle (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.83)

Transforming Hidden States into Binary Semantic Features

Musil, Tomáš, Mareček, David

arXiv.org Artificial Intelligence, Sep-29-2024

However, with the advance of Large Language Models (LLMs), this inspiration has become rather indirect. In this paper, we show that distributional theories of meaning can still be relevant in interpreting the hidden states of LLMs and that Independent Component Analysis (ICA) can help us overcome some of the challenges associated with understanding these complex models.

2. centering the data (setting the mean to zero) and whitening them (setting variance of each component to 1),
3. iteratively finding directions in the data that are the most non-Gaussian.

The last step is based on the assumption of the central limit theorem: the mixed signal is a sum ...
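As a rough illustration of the preprocessing steps named in this excerpt, the following Python sketch centers a small two-dimensional data set and whitens it. It is a minimal sketch, not the authors' code: the function names (`center`, `covariance_2d`, `whiten_2d`), the closed-form 2x2 rotation, and the sample points are assumptions made for this example, and the final step (iteratively finding the most non-Gaussian directions) is not implemented here.

```python
import math


def center(rows):
    """Subtract the per-column mean so that every column has mean zero."""
    n = len(rows)
    dims = len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(dims)]
    return [[row[j] - means[j] for j in range(dims)] for row in rows]


def covariance_2d(rows):
    """Return (var_x, cov_xy, var_y) of already-centered 2-D data."""
    n = len(rows)
    var_x = sum(x * x for x, _ in rows) / n
    cov_xy = sum(x * y for x, y in rows) / n
    var_y = sum(y * y for _, y in rows) / n
    return var_x, cov_xy, var_y


def whiten_2d(rows):
    """Decorrelate centered 2-D data and scale each component to variance 1.

    Assumes the data are not degenerate (both principal variances > 0).
    """
    var_x, cov_xy, var_y = covariance_2d(rows)
    # Angle of the principal axes of the symmetric 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * cov_xy, var_x - var_y)
    c, s = math.cos(theta), math.sin(theta)
    rotated = [[c * x + s * y, -s * x + c * y] for x, y in rows]
    var_u, _, var_v = covariance_2d(rotated)
    return [[u / math.sqrt(var_u), v / math.sqrt(var_v)] for u, v in rotated]


# Hypothetical sample points, used only to exercise the functions.
SAMPLE = [[1.0, 2.0], [2.0, 4.5], [3.0, 5.0], [4.0, 9.0], [5.0, 9.5]]
WHITENED = whiten_2d(center(SAMPLE))
```

After whitening, the two components are uncorrelated and each has unit variance, which is the state the non-Gaussianity search would start from.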

artificial intelligence, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2409.19813

Country: Europe > Czechia (0.14)

Genre: Research Report (0.50)

Industry:

Leisure & Entertainment (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Media > Music (0.95)
Health & Medicine > Therapeutic Area > Obstetrics/Gynecology (0.47)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Debiasing Algorithm through Model Adaptation

Limisiewicz, Tomasz, Mareček, David, Musil, Tomáš

arXiv.org Machine Learning, Oct-29-2023

Large language models are becoming the go-to solution for various language tasks. However, with growing capacity, models are prone to rely on spurious correlations stemming from biases and stereotypes present in the training data. This work proposes a novel method for detecting and mitigating gender bias in language models. We perform causal analysis to identify problematic model components and discover that mid-upper feed-forward layers are most prone to convey biases. Based on the analysis results, we adapt the model by multiplying these layers by a linear projection. Our titular method, DAMA, significantly decreases bias as measured by diverse metrics while maintaining the model's performance on downstream tasks. We release code for our method and models, which retain LLaMA's state-of-the-art performance while being significantly less biased.
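The adaptation step described in the abstract, multiplying a layer by a linear projection, can be illustrated with a generic orthogonal projection. This is a minimal sketch under stated assumptions, not the released DAMA code: the weight matrix `W`, the vector `bias_direction`, and the helper names are hypothetical, and the abstract does not say how the projection is derived. The sketch only shows that replacing `W` by `P W`, with `P = I - u u^T / (u^T u)`, removes the component of the layer's output along `u` and leaves orthogonal components unchanged.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(x * y for x, y in zip(u, v))


def projection_removing(direction):
    """Build P = I - u u^T / (u^T u), which projects out `direction`."""
    norm_sq = dot(direction, direction)
    n = len(direction)
    return [
        [(1.0 if i == j else 0.0) - direction[i] * direction[j] / norm_sq
         for j in range(n)]
        for i in range(n)
    ]


# Hypothetical layer weights (3 outputs, 2 inputs) and a hypothetical
# direction in output space that the layer should stop writing to.
W = [[0.5, -1.0], [2.0, 0.25], [1.5, 1.0]]
bias_direction = [1.0, 0.0, 1.0]

P = projection_removing(bias_direction)
W_adapted = matmul(P, W)  # the adapted layer is the projection times W

x = [1.0, 2.0]  # hypothetical input to the layer
original_output = matvec(W, x)
adapted_output = matvec(W_adapted, x)
```

Because the projection is folded into the weights once, the adapted layer costs the same at inference time as the original one.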

computational linguistics, machine learning, natural language, (17 more...)

arXiv.org Machine Learning

2310.18913

Country:

Europe (1.00)
North America > United States > Washington > King County > Seattle (0.28)
North America > United States > California (0.28)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (0.46)

Exploring the Impact of Training Data Distribution and Subword Tokenization on Gender Bias in Machine Translation

Iluz, Bar, Limisiewicz, Tomasz, Stanovsky, Gabriel, Mareček, David

arXiv.org Artificial Intelligence, Sep-30-2023

We study the effect of tokenization on gender bias in machine translation, an aspect that has been largely overlooked in previous works. Specifically, we focus on the interactions between the frequency of gendered profession names in training data, their representation in the subword tokenizer's vocabulary, and gender bias. We observe that female and non-stereotypical gender inflections of profession names (e.g., Spanish "doctora" for "female doctor") tend to be split into multiple subword tokens. Our results indicate that the imbalance of gender forms in the model's training corpus is a major factor contributing to gender bias and has a greater impact than subword splitting. We show that analyzing subword splits provides good estimates of gender-form imbalance in the training data and can be used even when the corpus is not publicly available. We also demonstrate that fine-tuning just the token embedding layer can decrease the gap in gender prediction accuracy between female and male forms without impairing the translation quality.
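The observation about subword splits can be made concrete by counting how many pieces a tokenizer needs for each gender form. The following is a minimal sketch, assuming a greedy longest-match splitter and a hypothetical toy vocabulary `TOY_VOCAB`; it is not the tokenizer or the data used in the paper, and real BPE or unigram tokenizers segment differently.

```python
def greedy_split(word, vocab):
    """Split `word` left to right, always taking the longest known piece."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        # Shrink the candidate piece until it is in the vocabulary.
        while end > start and word[start:end] not in vocab:
            end -= 1
        if end == start:
            # No known piece starts here; fall back to a single character.
            end = start + 1
        pieces.append(word[start:end])
        start = end
    return pieces


# Hypothetical vocabulary in which the stereotypical forms are whole tokens.
TOY_VOCAB = {"doctor", "enfermera", "enfermer", "a", "o"}
WORDS = ["doctor", "doctora", "enfermera", "enfermero"]
SPLITS = {word: greedy_split(word, TOY_VOCAB) for word in WORDS}
PIECE_COUNTS = {word: len(pieces) for word, pieces in SPLITS.items()}
```

In this toy vocabulary "doctor" and "enfermera" are single tokens, while "doctora" and "enfermero" each split into two pieces, mirroring the pattern the abstract describes.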

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2309.12491

Country:

Europe (1.00)
North America > United States (0.68)
Asia > Middle East > Israel (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Government > Regional Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Closing the loop: Autonomous experiments enabled by machine-learning-based online data analysis in synchrotron beamline environments

Pithan, Linus, Starostin, Vladimir, Mareček, David, Petersdorf, Lukas, Völter, Constantin, Munteanu, Valentin, Jankowski, Maciej, Konovalov, Oleg, Gerlach, Alexander, Hinderhofer, Alexander, Murphy, Bridget, Kowarik, Stefan, Schreiber, Frank

arXiv.org Artificial Intelligence, Jun-20-2023

Recently, there has been significant interest in applying machine learning (ML) techniques to X-ray scattering experiments, where they prove to be a valuable tool for research that involves large or rapidly generated datasets. ML allows for the automated interpretation of experimental results, particularly those obtained from synchrotron or neutron facilities. The speed at which ML models can process data presents an important opportunity to establish a closed-loop feedback system, enabling real-time decision-making based on online data analysis. In this study, we describe the incorporation of ML into a closed-loop workflow for X-ray reflectometry (XRR), using the growth of organic thin films as an example. Our focus lies on the beamline integration of ML-based online data analysis and closed-loop feedback. We present solutions that provide an elementary data analysis in real time during the experiment without introducing additional software dependencies in the beamline control software environment. Our data demonstrate the accuracy and robustness of ML methods for analyzing XRR curves and Bragg reflections, as well as their autonomous control over a vacuum deposition setup.

artificial intelligence, machine learning, real time system, (18 more...)

arXiv.org Artificial Intelligence

2306.11899

Country: Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)

Genre: Research Report (1.00)

Industry: Energy (0.90)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Architecture > Real Time Systems (1.00)

Tokenization Impacts Multilingual Language Modeling: Assessing Vocabulary Allocation and Overlap Across Languages

Limisiewicz, Tomasz, Balhar, Jiří, Mareček, David

arXiv.org Artificial Intelligence, May-26-2023

Multilingual language models have recently gained attention as a promising solution for representing multiple languages in a single model. In this paper, we propose new criteria to evaluate the quality of lexical representation and vocabulary overlap observed in sub-word tokenizers. Our findings show that the overlap of vocabulary across languages can actually be detrimental to certain downstream tasks (POS, dependency tree labeling). In contrast, NER and sentence-level tasks (cross-lingual retrieval, NLI) benefit from sharing vocabulary. We also observe that the coverage of the language-specific tokens in the multilingual vocabulary significantly impacts the word-level tasks. Our study offers a deeper understanding of the role of tokenizers in multilingual language models and guidelines for future model developers to choose the most suitable tokenizer for their specific application before undertaking costly model pre-training.
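One way to quantify how much two languages share a subword vocabulary is to compare their token frequency distributions. The sketch below uses the Jensen-Shannon divergence for that purpose; the abstract does not state which measure the paper adopts, so treat this choice, the function names, and the toy token lists as assumptions for illustration only.

```python
import math
from collections import Counter


def distribution(tokens):
    """Turn a list of tokens into a relative-frequency dictionary."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint."""
    result = 0.0
    for tok in set(p) | set(q):
        p_i = p.get(tok, 0.0)
        q_i = q.get(tok, 0.0)
        m_i = 0.5 * (p_i + q_i)
        if p_i > 0.0:
            result += 0.5 * p_i * math.log2(p_i / m_i)
        if q_i > 0.0:
            result += 0.5 * q_i * math.log2(q_i / m_i)
    return result


# Hypothetical tokenized corpora for two languages sharing one token.
LANG_A_TOKENS = ["the", "_ing", "token", "the", "_s"]
LANG_B_TOKENS = ["to", "_ní", "token", "to", "_y"]

DIVERGENCE = js_divergence(distribution(LANG_A_TOKENS),
                           distribution(LANG_B_TOKENS))
```

A lower divergence means the two languages draw on the same tokens with similar frequencies, that is, higher overlap.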

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2305.17179

Country:

Europe (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

Independent Components of Word Embeddings Represent Semantic Features

Musil, Tomáš, Mareček, David

arXiv.org Artificial Intelligence, Dec-19-2022

Independent Component Analysis (ICA) is an algorithm originally developed for finding separate sources in a mixed signal, such as a recording of multiple people in the same room speaking at the same time. It has also been used to find linguistic features in distributional representations. In this paper, we used ICA to analyze word embeddings. We have found that ICA can be used to find semantic features of the words and these features can easily be combined to search for words that satisfy the combination. We show that only some of the independent components represent such features, but those that do are stable with regard to random initialization of the algorithm.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2212.0958

Country:

Europe (0.94)
Asia (0.68)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.49)

Examining Cross-lingual Contextual Embeddings with Orthogonal Structural Probes

Limisiewicz, Tomasz, Mareček, David

arXiv.org Artificial Intelligence, Sep-10-2021

State-of-the-art contextual embeddings are obtained from large language models available only for a few languages. For others, we need to learn representations using a multilingual model. There is an ongoing debate on whether multilingual embeddings can be aligned in a space shared across many languages. The novel Orthogonal Structural Probe (Limisiewicz and Mareček, 2021) allows us to answer this question for specific linguistic features and learn a projection based only on monolingual annotated datasets. We evaluate syntactic (UD) and lexical (WordNet) structural information encoded in mBERT's contextual representations for nine diverse languages. We observe that for languages closely related to English, no transformation is needed. The evaluated information is encoded in a shared cross-lingual embedding space. For other languages, it is beneficial to apply an orthogonal transformation learned separately for each language. We successfully apply our findings to zero-shot and few-shot cross-lingual parsing.
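An orthogonal transformation rotates or reflects the embedding space without changing distances between vectors, so any structure expressed through distances is preserved by the mapping. The following minimal sketch checks both properties for a 2-D rotation; the matrix, the vectors, and the function names are hypothetical, and the probe in the paper learns its transformation from annotated data rather than fixing it by hand.

```python
import math


def rotation(theta):
    """Return a 2x2 rotation matrix, a simple example of an orthogonal map."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def transform(matrix, vector):
    """Apply a matrix (list of rows) to a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def distance(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def is_orthogonal(matrix, tol=1e-9):
    """Check Q^T Q = I by comparing all pairs of columns."""
    cols = list(zip(*matrix))
    n = len(cols)
    return all(
        abs(sum(a * b for a, b in zip(cols[i], cols[j]))
            - (1.0 if i == j else 0.0)) <= tol
        for i in range(n)
        for j in range(n)
    )


Q = rotation(math.pi / 6)  # hypothetical per-language transformation
emb_a, emb_b = [1.0, 2.0], [-0.5, 0.75]  # hypothetical embeddings

before = distance(emb_a, emb_b)
after = distance(transform(Q, emb_a), transform(Q, emb_b))
```

A non-orthogonal map such as a per-axis scaling would fail the `is_orthogonal` check and would, in general, change the distance between the two vectors.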

artificial intelligence, computational linguistics, natural language, (14 more...)

arXiv.org Artificial Intelligence

2109.04921

Country:

Europe (1.00)
Asia (0.68)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.55)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.35)
