AITopics | Braslavski, Pavel

Collaborating Authors

Braslavski, Pavel

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

KoWit-24: A Richly Annotated Dataset of Wordplay in News Headlines

Baranov, Alexander, Palatkina, Anna, Makovka, Yulia, Braslavski, Pavel

arXiv.org Artificial IntelligenceMar-3-2025

We present KoWit-24, a dataset with fine-grained annotation of wordplay in 2,700 Russian news headlines. KoWit-24 annotations include the presence of wordplay, its type, wordplay anchors, and words/phrases the wordplay refers to. Unlike the majority of existing humor collections of canned jokes, KoWit-24 provides wordplay contexts -- each headline is accompanied by the news lead and summary. The most common type of wordplay in the dataset is the transformation of collocations, idioms, and named entities -- the mechanism that has been underrepresented in previous humor datasets. Our experiments with five LLMs show that there is ample room for improvement in wordplay detection and interpretation tasks. The dataset and evaluation scripts are available at https://github.com/Humor-Research/KoWit-24

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2503.0151

Country:

Europe (1.00)
Asia (0.68)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (1.00)

Industry: Media > News (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM?

Pletenev, Sergey, Marina, Maria, Moskovskiy, Daniil, Konovalov, Vasily, Braslavski, Pavel, Panchenko, Alexander, Salnikov, Mikhail

arXiv.org Artificial IntelligenceFeb-25-2025

The performance of Large Language Models (LLMs) on many tasks is greatly limited by the knowledge learned during pre-training and stored in the model's parameters. Low-rank adaptation (LoRA) is a popular and efficient training technique for updating or domain-specific adaptation of LLMs. In this study, we investigate how new facts can be incorporated into the LLM using LoRA without compromising the previously learned knowledge. We fine-tuned Llama-3.1-8B-instruct using LoRA with varying amounts of new knowledge. Our experiments have shown that the best results are obtained when the training data contains a mixture of known and new facts. However, this approach is still potentially harmful because the model's performance on external question-answering benchmarks declines after such fine-tuning. When the training data is biased towards certain entities, the model tends to regress to few overrepresented answers. In addition, we found that the model becomes more confident and refuses to provide an answer in only few cases. These findings highlight the potential pitfalls of LoRA-based LLM updates and underscore the importance of training data composition and tuning parameters to balance new knowledge integration and general model capabilities.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2502.14502

Country:

Europe (0.93)
North America > United States > Florida > Miami-Dade County > Miami (0.14)
North America > Mexico > Mexico City (0.14)
Asia > Middle East > Iran (0.14)

Genre: Research Report > New Finding (0.66)

Industry: Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Konstruktor: A Strong Baseline for Simple Knowledge Graph Question Answering

Lysyuk, Maria, Salnikov, Mikhail, Braslavski, Pavel, Panchenko, Alexander

arXiv.org Artificial IntelligenceSep-24-2024

While being one of the most popular question types, simple questions such as "Who is the author of Cinderella?", are still not completely solved. Surprisingly, even the most powerful modern Large Language Models are prone to errors when dealing with such questions, especially when dealing with rare entities. At the same time, as an answer may be one hop away from the question entity, one can try to develop a method that uses structured knowledge graphs (KGs) to answer such questions. In this paper, we introduce Konstruktor - an efficient and robust approach that breaks down the problem into three steps: (i) entity extraction and entity linking, (ii) relation prediction, and (iii) querying the knowledge graph. Our approach integrates language models and knowledge graphs, exploiting the power of the former and the interpretability of the latter. We experiment with two named entity recognition and entity linking methods and several relation detection techniques. We show that for relation detection, the most challenging step of the workflow, a combination of relation classification/generation and ranking outperforms other methods. We report Konstruktor's strong results on four datasets.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-031-70242-6_11

2409.15902

Country:

North America > United States > Louisiana (0.14)
Europe > Austria > Vienna (0.14)

Genre:

Research Report > New Finding (0.94)
Workflow (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

KazQAD: Kazakh Open-Domain Question Answering Dataset

Yeshpanov, Rustem, Efimov, Pavel, Boytsov, Leonid, Shalkarbayuli, Ardak, Braslavski, Pavel

arXiv.org Artificial IntelligenceApr-5-2024

We introduce KazQAD -- a Kazakh open-domain question answering (ODQA) dataset -- that can be used in both reading comprehension and full ODQA settings, as well as for information retrieval experiments. KazQAD contains just under 6,000 unique questions with extracted short answers and nearly 12,000 passage-level relevance judgements. We use a combination of machine translation, Wikipedia search, and in-house manual annotation to ensure annotation efficiency and data quality. The questions come from two sources: translated items from the Natural Questions (NQ) dataset (only for training) and the original Kazakh Unified National Testing (UNT) exam (for development and testing). The accompanying text corpus contains more than 800,000 passages from the Kazakh Wikipedia. As a supplementary dataset, we release around 61,000 question-passage-answer triples from the NQ dataset that have been machine-translated into Kazakh. We develop baseline retrievers and readers that achieve reasonable scores in retrieval (NDCG@10 = 0.389 MRR = 0.382), reading comprehension (EM = 38.5 F1 = 54.2), and full ODQA (EM = 17.8 F1 = 28.7) settings. Nevertheless, these results are substantially lower than state-of-the-art results for English QA collections, and we think that there should still be ample room for improvement. We also show that the current OpenAI's ChatGPTv3.5 is not able to answer KazQAD test questions in the closed-book setting with acceptable quality. The dataset is freely available under the Creative Commons licence (CC BY-SA) at https://github.com/IS2AI/KazQAD.

large language model, machine learning, question answering, (20 more...)

arXiv.org Artificial Intelligence

2404.04487

Country:

Europe (1.00)
Asia > Kazakhstan (0.69)
Asia > Middle East > UAE (0.14)

Genre:

Research Report (0.64)
Overview (0.46)

Industry:

Information Technology (0.88)
Education (0.88)
Leisure & Entertainment > Sports (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.86)

Add feedback

The Impact of Cross-Lingual Adjustment of Contextual Word Representations on Zero-Shot Transfer

Efimov, Pavel, Boytsov, Leonid, Arslanova, Elena, Braslavski, Pavel

arXiv.org Artificial IntelligenceOct-31-2023

Large multilingual language models such as mBERT or XLM-R enable zero-shot cross-lingual transfer in various IR and NLP tasks. Cao et al. (2020) proposed a data- and compute-efficient method for cross-lingual adjustment of mBERT that uses a small parallel corpus to make embeddings of related words across languages similar to each other. They showed it to be effective in NLI for five European languages. In contrast we experiment with a typologically diverse set of languages (Spanish, Russian, Vietnamese, and Hindi) and extend their original implementations to new tasks (XSR, NER, and QA) and an additional training regime (continual learning). Our study reproduced gains in NLI for four languages, showed improved NER, XSR, and cross-lingual QA results in three languages (though some cross-lingual QA gains were not statistically significant), while mono-lingual QA performance never improved and sometimes degraded. Analysis of distances between contextualized embeddings of related and unrelated words (across languages) showed that fine-tuning leads to "forgetting" some of the cross-lingual alignment information. Based on this observation, we further improved NLI performance using continual learning.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-031-28241-6_4

2204.06457

Country:

Asia (0.46)
Europe > Russia (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study > Negative Result (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.95)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Answer Candidate Type Selection: Text-to-Text Language Model for Closed Book Question Answering Meets Knowledge Graphs

Salnikov, Mikhail, Lysyuk, Maria, Braslavski, Pavel, Razzhigaev, Anton, Malykh, Valentin, Panchenko, Alexander

arXiv.org Artificial IntelligenceOct-10-2023

Pre-trained Text-to-Text Language Models (LMs), such as T5 or BART yield promising results in the Knowledge Graph Question Answering (KGQA) task. However, the capacity of the models is limited and the quality decreases for questions with less popular entities. In this paper, we present a novel approach which works on top of the pre-trained Text-to-Text QA system to address this issue. Our simple yet effective method performs filtering and re-ranking of generated candidates based on their types derived from Wikidata "instance_of" property.

answer candidate type selection, artificial intelligence, natural language, (2 more...)

arXiv.org Artificial Intelligence

2310.07008

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.60)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.60)

Add feedback

Large Language Models Meet Knowledge Graphs to Answer Factoid Questions

Salnikov, Mikhail, Le, Hai, Rajput, Prateek, Nikishina, Irina, Braslavski, Pavel, Malykh, Valentin, Panchenko, Alexander

arXiv.org Artificial IntelligenceOct-3-2023

Recently, it has been shown that the incorporation of structured knowledge into Large Language Models significantly improves the results for a variety of NLP tasks. In this paper, we propose a method for exploring pre-trained Text-to-Text Language Models enriched with additional information from Knowledge Graphs for answering factoid questions. More specifically, we propose an algorithm for subgraphs extraction from a Knowledge Graph based on question entities and answer candidates. Then, we procure easily interpreted information with Transformer-based models through the linearization of the extracted subgraphs. Final re-ranking of the answer candidates with the extracted information boosts Hits@1 scores of the pre-trained text-to-text language models by 4-6%.

large language model, machine learning, subgraph, (18 more...)

arXiv.org Artificial Intelligence

2310.02166

Country:

North America > United States > Texas (0.14)
North America > United States > Maryland (0.14)
Europe > Austria > Vienna (0.14)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback

NEREL-BIO: A Dataset of Biomedical Abstracts Annotated with Nested Named Entities

Loukachevitch, Natalia, Manandhar, Suresh, Baral, Elina, Rozhkov, Igor, Braslavski, Pavel, Ivanov, Vladimir, Batura, Tatiana, Tutubalina, Elena

arXiv.org Artificial IntelligenceOct-21-2022

This paper describes NEREL-BIO -- an annotation scheme and corpus of PubMed abstracts in Russian and smaller number of abstracts in English. NEREL-BIO extends the general domain dataset NEREL by introducing domain-specific entity types. NEREL-BIO annotation scheme covers both general and biomedical domains making it suitable for domain transfer experiments. NEREL-BIO provides annotation for nested named entities as an extension of the scheme employed for NEREL. Nested named entities may cross entity boundaries to connect to shorter entities nested within longer entities, making them harder to detect. NEREL-BIO contains annotations for 700+ Russian and 100+ English abstracts. All English PubMed annotations have corresponding Russian counterparts. Thus, NEREL-BIO comprises the following specific features: annotation of nested named entities, it can be used as a benchmark for cross-domain (NEREL -> NEREL-BIO) and cross-language (English -> Russian) transfer. We experiment with both transformer-based sequence models and machine reading comprehension (MRC) models and report their results. The dataset is freely available at https://github.com/nerel-ds/NEREL-BIO.

artificial intelligence, entity type, natural language, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1093/bioinformatics/btad161

2210.11913

Country:

Europe (0.48)
Asia > Russia (0.29)

Genre: Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.93)
Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

Add feedback