Collaborating Authors

 Abdullaeva, Irina


MaxInfo: A Training-Free Key-Frame Selection Method Using Maximum Volume for Enhanced Video Understanding

arXiv.org Artificial Intelligence

Modern Video Large Language Models (VLLMs) often rely on uniform frame sampling for video understanding, but this approach frequently fails to capture critical information due to frame redundancy and variations in video content. We propose MaxInfo, a training-free method based on the maximum volume principle, which selects and retains the most representative frames from the input video. By maximizing the geometric volume formed by selected embeddings, MaxInfo ensures that the chosen frames cover the most informative regions of the embedding space, effectively reducing redundancy while preserving diversity. This method enhances the quality of input representations and improves long video comprehension performance across benchmarks. For instance, MaxInfo achieves a 3.28% improvement on LongVideoBench and a 6.4% improvement on EgoSchema for LLaVA-Video-7B. It also achieves a 3.47% improvement for LLaVA-Video-72B. The approach is simple to implement and works with existing VLLMs without the need for additional training, making it a practical and effective alternative to traditional uniform sampling methods.
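
The following is a minimal sketch of the greedy max-volume idea behind this kind of selection, not the paper's exact procedure: given per-frame embeddings (assumed here to come from some vision encoder), repeatedly pick the frame whose embedding has the largest component orthogonal to the frames already chosen, which greedily grows the volume they span. The function name and the residual-norm heuristic are illustrative assumptions.

```python
# Sketch of greedy maximum-volume key-frame selection (illustrative, not MaxInfo's
# exact algorithm): each step keeps the frame adding the most "new" volume, i.e.
# the one with the largest residual after projecting out already-selected frames.
import numpy as np

def select_key_frames(frame_embeddings: np.ndarray, k: int) -> list[int]:
    """frame_embeddings: (num_frames, dim) array of per-frame features."""
    E = frame_embeddings.astype(np.float64)
    E = E / (np.linalg.norm(E, axis=1, keepdims=True) + 1e-12)  # unit-normalize rows
    residual = E.copy()           # components orthogonal to the selected frames
    selected: list[int] = []
    for _ in range(min(k, len(E))):
        norms = np.linalg.norm(residual, axis=1)
        norms[np.array(selected, dtype=int)] = -1.0   # never re-pick a frame
        idx = int(np.argmax(norms))                   # frame adding the most volume
        if norms[idx] <= 1e-8:                        # remaining frames are redundant
            break
        selected.append(idx)
        direction = residual[idx] / norms[idx]
        # Gram-Schmidt deflation: remove the chosen direction from all residuals
        residual -= residual @ direction[:, None] * direction[None, :]
    return sorted(selected)

# Hypothetical usage: embeddings = encoder(frames); keep = select_key_frames(embeddings, 32)
```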


ESQA: Event Sequences Question Answering

arXiv.org Artificial Intelligence

Event sequences (ESs) arise in many practical domains, including finance, retail, social networks, and healthcare. In the context of machine learning, event sequences can be seen as a special type of tabular data with annotated timestamps. Despite the importance of ES modeling and analysis, little effort has been made to adapt large language models (LLMs) to this domain. In this paper, we highlight the common difficulties of ES processing and propose a novel solution capable of solving multiple downstream tasks with little or no finetuning. In particular, we address the problem of working with long sequences and improve the processing of time and numeric features. The resulting method, called ESQA, effectively leverages the power of LLMs and, according to extensive experiments, achieves state-of-the-art results in the ES domain.
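
For intuition only, a toy example of the kind of data this setting targets: an event sequence as timestamped tabular rows serialized into a text prompt for an LLM. The field names, serialization format, and question are hypothetical, not the paper's actual encoding.

```python
# Illustrative example of an event sequence (timestamped tabular data) turned into
# an LLM prompt; this is not ESQA's format, just a sketch of the data type.
from dataclasses import dataclass

@dataclass
class Event:
    timestamp: str   # ISO-8601 time of the event
    event_type: str  # categorical feature, e.g. "purchase"
    amount: float    # numeric feature, e.g. transaction value

def to_prompt(events: list[Event], question: str) -> str:
    rows = "\n".join(
        f"{e.timestamp} | {e.event_type} | {e.amount:.2f}" for e in events
    )
    return f"Event sequence (time | type | amount):\n{rows}\n\nQuestion: {question}"

events = [
    Event("2024-03-01T09:15", "purchase", 12.50),
    Event("2024-03-02T18:40", "refund", 12.50),
]
print(to_prompt(events, "How many refunds occurred in March?"))
```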


OmniFusion Technical Report

arXiv.org Artificial Intelligence

In recent years, multimodal architectures have emerged as a powerful paradigm for enhancing artificial intelligence (AI) systems, enabling them to process and understand multiple types of data simultaneously [1, 2, 3]. The integration of different data modalities, such as text and images, has significantly improved the capabilities of large language models (LLMs) across tasks ranging from visual question answering (VQA) [4] to complex decision-making [5, 6]. However, effectively coupling various data types remains a significant obstacle to the development of truly integrative AI models. Furthermore, such multimodal multitask architectures are regarded as first steps towards artificial general intelligence (AGI), broadening the range of challenges in world cognition. This work introduces the OmniFusion model, a novel multimodal architecture that leverages the strengths of pretrained LLMs and introduces specialized adapters for processing visual information.
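
As a rough illustration of the general adapter idea (the two-layer MLP and the dimensions below are assumptions, not OmniFusion's actual design), a visual adapter projects features from a frozen vision encoder into the LLM's token-embedding space so they can be prepended to the text tokens.

```python
# Generic sketch of a visual adapter for an LLM; dimensions, depth, and the
# concatenation scheme are illustrative assumptions, not OmniFusion's design.
import torch
import torch.nn as nn

class VisualAdapter(nn.Module):
    def __init__(self, vision_dim: int = 1024, llm_dim: int = 4096):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(vision_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, image_features: torch.Tensor) -> torch.Tensor:
        # image_features: (batch, num_patches, vision_dim) from a frozen encoder
        return self.proj(image_features)  # (batch, num_patches, llm_dim)

adapter = VisualAdapter()
image_features = torch.randn(1, 256, 1024)                   # placeholder encoder output
visual_tokens = adapter(image_features)                      # pseudo-tokens for the LLM
text_embeds = torch.randn(1, 32, 4096)                       # placeholder text embeddings
llm_inputs = torch.cat([visual_tokens, text_embeds], dim=1)  # prepend visual tokens
```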