AITopics | Moskovskiy, Daniil

Moskovskiy, Daniil

How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM?

Pletenev, Sergey, Marina, Maria, Moskovskiy, Daniil, Konovalov, Vasily, Braslavski, Pavel, Panchenko, Alexander, Salnikov, Mikhail

arXiv.org Artificial Intelligence, Feb-25-2025

The performance of Large Language Models (LLMs) on many tasks is greatly limited by the knowledge learned during pre-training and stored in the model's parameters. Low-rank adaptation (LoRA) is a popular and efficient training technique for updating LLMs or adapting them to a specific domain. In this study, we investigate how new facts can be incorporated into the LLM using LoRA without compromising the previously learned knowledge. We fine-tuned Llama-3.1-8B-instruct using LoRA with varying amounts of new knowledge. Our experiments have shown that the best results are obtained when the training data contains a mixture of known and new facts. However, this approach is still potentially harmful because the model's performance on external question-answering benchmarks declines after such fine-tuning. When the training data is biased towards certain entities, the model tends to regress to a few overrepresented answers. In addition, we found that the model becomes more confident and refuses to provide an answer in only a few cases. These findings highlight the potential pitfalls of LoRA-based LLM updates and underscore the importance of training data composition and tuning parameters to balance new knowledge integration and general model capabilities.

large language model, machine learning, natural language, (20 more...)

2502.14502

Country:

Europe (0.93)
North America > United States > Florida > Miami-Dade County > Miami (0.14)
North America > Mexico > Mexico City (0.14)
Asia > Middle East > Iran (0.14)

Genre: Research Report > New Finding (0.66)

Industry: Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators

Moskovskiy, Daniil, Sushko, Nikita, Pletenev, Sergey, Tutubalina, Elena, Panchenko, Alexander

arXiv.org Artificial Intelligence, Feb-10-2025

Existing approaches to multilingual text detoxification are hampered by the scarcity of parallel multilingual datasets. In this work, we introduce a pipeline for the generation of multilingual parallel detoxification data. We also introduce SynthDetoxM, a manually collected and synthetically generated multilingual parallel text detoxification dataset comprising 16,000 high-quality detoxification sentence pairs across German, French, Spanish and Russian. The data was sourced from different toxicity evaluation datasets and then rewritten with nine modern open-source LLMs in a few-shot setting. Our experiments demonstrate that models trained on the produced synthetic datasets have superior performance to those trained on the human-annotated MultiParaDetox dataset, even in a data-limited setting. Models trained on SynthDetoxM outperform all evaluated LLMs in a few-shot setting. We release our dataset and code to help further research in multilingual text detoxification.

artificial intelligence, large language model, natural language, (18 more...)

2502.06394

Country:

Asia (1.00)
North America > United States > Louisiana (0.14)
North America > United States > Florida > Miami-Dade County > Miami (0.14)
Europe > Austria > Vienna (0.14)

Genre: Research Report (0.81)

Industry: Government (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Multilingual and Explainable Text Detoxification with Parallel Corpora

Dementieva, Daryna, Babakov, Nikolay, Ronen, Amit, Ayele, Abinew Ali, Rizwan, Naquee, Schneider, Florian, Wang, Xintong, Yimam, Seid Muhie, Moskovskiy, Daniil, Stakovskii, Elisei, Kaufman, Eran, Elnagar, Ashraf, Mukherjee, Animesh, Panchenko, Alexander

arXiv.org Artificial Intelligence, Dec-16-2024

Even with various regulations in place across countries and social media platforms (Government of India, 2021; European Parliament and Council of the European Union, 2022), digital abusive speech remains a significant issue. One potential approach to address this challenge is automatic text detoxification, a text style transfer (TST) approach that transforms toxic language into a more neutral or non-toxic form. To date, the availability of parallel corpora for the text detoxification task (Logacheva et al., 2022; Atwell et al., 2022; Dementieva et al., 2024a) has proven to be crucial for state-of-the-art approaches. With this work, we extend the parallel text detoxification corpus to new languages (German, Chinese, Arabic, Hindi, and Amharic) and test TST baselines in this extensive multilingual setup. Next, we conduct a first-of-its-kind automated, explainable analysis of the descriptive features of both toxic and non-toxic sentences, diving deeply into the nuances, similarities, and differences of toxicity and detoxification across 9 languages. Finally, based on the obtained insights, we experiment with a novel text detoxification method inspired by the Chain-of-Thoughts reasoning approach, enhancing the prompting process through clustering on relevant descriptive attributes.

computational linguistic, large language model, machine learning, (25 more...)

2412.11691

Country:

Europe (1.00)
Asia > India (0.68)
North America > United States > Minnesota (0.28)

Genre:

Research Report (0.84)
Overview > Innovation (0.34)

Industry: Government > Regional Government > Europe Government (0.54)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

MERA: A Comprehensive LLM Evaluation in Russian

Fenogenova, Alena, Chervyakov, Artem, Martynov, Nikita, Kozlova, Anastasia, Tikhonova, Maria, Akhmetgareeva, Albina, Emelyanov, Anton, Shevelev, Denis, Lebedev, Pavel, Sinev, Leonid, Isaeva, Ulyana, Kolomeytseva, Katerina, Moskovskiy, Daniil, Goncharova, Elizaveta, Savushkin, Nikita, Mikhailova, Polina, Dimitrov, Denis, Panchenko, Alexander, Markov, Sergei

arXiv.org Artificial Intelligence, Jan-12-2024

Over the past few years, one of the most notable advancements in AI research has been in foundation models (FMs), headlined by the rise of language models (LMs). As the models' size increases, LMs demonstrate enhancements in measurable aspects and the development of new qualitative features. However, despite researchers' attention and the rapid growth in LM application, the capabilities, limitations, and associated risks still need to be better understood. To address these issues, we introduce an open Multimodal Evaluation of Russian-language Architectures (MERA), a new instruction benchmark for evaluating foundation models oriented towards the Russian language. The benchmark encompasses 21 evaluation tasks for generative models in 11 skill domains and is designed as a black-box test to ensure the exclusion of data leakage. The paper introduces a methodology to evaluate FMs and LMs in zero- and few-shot fixed instruction settings that can be extended to other modalities. We propose an evaluation methodology, an open-source code base for the MERA assessment, and a leaderboard with a submission system. We evaluate open LMs as baselines and find that they are still far behind the human level. We publicly release MERA to guide forthcoming research, anticipate groundbreaking model features, standardize the evaluation procedure, and address potential societal drawbacks.

large language model, machine learning, natural language, (19 more...)

2401.04531

Country:

Europe (1.00)
Asia (0.92)
North America > United States > New York (0.14)

Genre: Research Report (1.00)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)
Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.94)

Exploring Methods for Cross-lingual Text Style Transfer: The Case of Text Detoxification

Dementieva, Daryna, Moskovskiy, Daniil, Dale, David, Panchenko, Alexander

arXiv.org Artificial Intelligence, Nov-23-2023

Text detoxification is the task of transferring the style of text from toxic to neutral. While there are approaches yielding promising results in a monolingual setup, e.g., (Dale et al., 2021; Hallinan et al., 2022), cross-lingual transfer for this task remains a challenging open problem (Moskovskiy et al., 2022). In this work, we present a large-scale study of strategies for cross-lingual text detoxification: given a parallel detoxification corpus for one language, the goal is to transfer detoxification ability to another language for which we do not have such a corpus. Moreover, we are the first to explore a new task where text translation and detoxification are performed simultaneously, providing several strong baselines for this task. Finally, we introduce new automatic detoxification evaluation metrics with higher correlations with human judgments than previous benchmarks. We also assess the most promising approaches with manual markup to determine the best strategy for transferring the knowledge of text detoxification between languages.

detoxification, machine learning, natural language, (19 more...)

2311.13937

Country:

Europe (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications (0.93)
