AITopics

2408.03834

Genre: Research Report (0.71)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)
Information Technology > Data Science > Data Mining > Text Mining (0.75)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.69)

Sawczyn, Albert, Viarenich, Katsiaryna, Wojtasik, Konrad, Domogała, Aleksandra, Oleksy, Marcin, Piasecki, Maciej, Kajdanowicz, Tomasz

Developing PUGG for Polish: A Modern Approach to KBQA, MRC, and IR Dataset Construction

arXiv.org Artificial Intelligence, Aug-5-2024

Advancements in AI and natural language processing have revolutionized machine-human language interactions, with question answering (QA) systems playing a pivotal role. The knowledge base question answering (KBQA) task, utilizing structured knowledge graphs (KG), allows for handling extensive knowledge-intensive questions. However, a significant gap exists in KBQA datasets, especially for low-resource languages. Many existing construction pipelines for these datasets are outdated and inefficient in human labor, and modern assisting tools like Large Language Models (LLM) are not utilized to reduce the workload. To address this, we have designed and implemented a modern, semi-automated approach for creating datasets, encompassing tasks such as KBQA, Machine Reading Comprehension (MRC), and Information Retrieval (IR), tailored explicitly for low-resource environments. We executed this pipeline and introduced the PUGG dataset, the first Polish KBQA dataset, and novel datasets for MRC and IR. Additionally, we provide a comprehensive implementation, insightful findings, detailed statistics, and evaluation of baseline models.

dataset, pipeline, proceedings

2408.02337

Country:

Europe > Germany (0.04)
Europe > Austria > Vienna (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (1.00)

Industry:

Government > Regional Government (0.46)
Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Artificial Intelligence, Aug-5-2024

Leveraging Inter-Chunk Interactions for Enhanced Retrieval in Large Language Model-Based Question Answering

Guo, Tiezheng, Wang, Chen, Liu, Yanyi, Tang, Jiawei, Li, Pan, Xu, Sai, Yang, Qingwen, Gao, Xianlin, Li, Zhi, Wen, Yingyou

Large language models (LLM) have acquired superior reading comprehension and reasoning capabilities by pretraining on extensive natural language data [1, 2]. They have demonstrated remarkable performance on a variety of tasks and benchmarks, particularly in the realm of question answering (QA) [3, 4]. Researchers are expanding the parameter scale of these models to enable them to retain more knowledge [5]. However, due to the absence of efficient methods to evaluate or edit their internalized knowledge [6], knowledge-intensive tasks remain a major [...]

However, when dealing with complex multi-document question answering (MDQA) tasks, accurately understanding the question's constraints and covering all supporting evidence remains an open challenge [10, 11]. This difficulty arises because previous research has treated the relationship between each text chunk and the target question in isolation. The retrieval models have concentrated solely on whether the main topic of each chunk aligns with the question [12]. Imperfect preprocessing can lead to the incorrect truncation of continuous chunks.

information, keyword, node

2408.02907

Country:

North America > Sint Maarten > Philipsburg (0.04)
North America > Canada > Ontario > Toronto (0.04)
Asia > China > Liaoning Province > Shenyang (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.94)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.88)

Setty, Ritvik, Setty, Vinay

QuestGen: Effectiveness of Question Generation Methods for Fact-Checking Applications

arXiv.org Artificial Intelligence, Aug-1-2024

Verifying fact-checking claims poses a significant challenge, even for humans. Recent approaches have demonstrated that decomposing claims into relevant questions to gather evidence enhances the efficiency of the fact-checking process. In this paper, we provide empirical evidence showing that this question decomposition can be effectively automated. We demonstrate that smaller generative models, fine-tuned for the question generation task using data augmentation from various datasets, outperform large language models by up to 8%. Surprisingly, in some cases, the evidence retrieved using machine-generated questions proves to be significantly more effective for fact-checking than that obtained from human-written questions. We also perform manual evaluation of the decomposed questions to assess the quality of the questions generated.

dataset, proceedings, question generation

doi: 10.1145/3627673.3679985

2407.21441

Country:

North America > United States > District of Columbia > Washington (0.05)
Europe > Norway > Western Norway > Rogaland > Stavanger (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Santa Clara County > Sunnyvale (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.96)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

arXiv.org Artificial Intelligence, Jul-30-2024

Boosting Audio Visual Question Answering via Key Semantic-Aware Cues

Li, Guangyao, Du, Henghui, Hu, Di

The Audio Visual Question Answering (AVQA) task aims to answer questions related to various visual objects, sounds, and their interactions in videos. Such naturally multimodal videos contain rich and complex dynamic audio-visual components, with only a portion of them closely related to the given questions. Hence, effectively perceiving audio-visual cues relevant to the given questions is crucial for correctly answering them. In this paper, we propose a Temporal-Spatial Perception Model (TSPM), which aims to empower the model to perceive key visual and auditory cues related to the questions. Specifically, considering the challenge of aligning non-declarative questions and visual representations into the same semantic space using visual-language pretrained models, we construct declarative sentence prompts derived from the question template, to assist the temporal perception module in better identifying critical segments relevant to the questions. Subsequently, a spatial perception module is designed to merge visual tokens from selected segments to highlight key latent targets, followed by cross-modal interaction with audio to perceive potential sound-aware areas. Finally, the significant temporal-spatial cues from these modules are integrated to answer the question. Extensive experiments on multiple AVQA benchmarks demonstrate that our framework excels not only in understanding audio-visual scenes but also in answering complex questions effectively. Code is available at https://github.com/GeWu-Lab/TSPM.

proceedings, tspm, video

2407.20693

Country:

Oceania > Australia > Victoria > Melbourne (0.05)
Asia > China > Beijing > Beijing (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.64)

arXiv.org Artificial Intelligence, Jul-30-2024

Decomposed Prompting to Answer Questions on a Course Discussion Board

Jaipersaud, Brandon, Zhang, Paul, Ba, Jimmy, Petersen, Andrew, Zhang, Lisa, Zhang, Michael R.

We propose and evaluate a question-answering system that uses decomposed prompting to classify and answer student questions on a course discussion board. Our system uses a large language model (LLM) to classify questions into one of four types: conceptual, homework, logistics, and not answerable. This enables us to employ a different strategy for answering questions that fall under different types. Using a variant of GPT-3, we achieve 81% classification accuracy. We discuss our system's performance on answering conceptual questions from a machine learning course and various failure modes.

accuracy, conceptual question, student question

doi: 10.1007/978-3-031-36336-8_33

2407.2117

Country:

North America > Canada > Ontario > Toronto (0.16)
North America > Canada > Manitoba > Westman Region > Brandon (0.04)
Europe > United Kingdom > England > Durham > Durham (0.04)

Genre:

Instructional Material (0.70)
Research Report (0.50)

Industry: Education > Educational Setting (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.36)

End-to-End Video Question Answering with Frame Scoring Mechanisms and Adaptive Sampling

Liang, Jianxin, Meng, Xiaojun, Wang, Yueqian, Liu, Chang, Liu, Qun, Zhao, Dongyan

Video Question Answering (VideoQA) has emerged as a challenging frontier in the field of multimedia processing, requiring intricate interactions between visual and textual modalities. Simply uniformly sampling frames or indiscriminately aggregating frame-level visual features often falls short in capturing the nuanced and relevant contexts of videos needed to perform VideoQA well. To mitigate these issues, we propose VidF4, a novel VideoQA framework equipped with a tailored frame selection strategy for effective and efficient VideoQA. We propose three frame-scoring mechanisms that consider both question relevance and inter-frame similarity to evaluate the importance of each frame for a given question on the video. Furthermore, we design a differentiable adaptive frame sampling mechanism to facilitate end-to-end training for the frame selector and answer generator. The experimental results across three widely adopted benchmarks demonstrate that our model consistently outperforms existing VideoQA methods, establishing a new SOTA across NExT-QA (+0.3%), STAR (+0.9%), and TVQA (+1.0%). Furthermore, through both quantitative and qualitative analyses, we validate the effectiveness of each design choice.

mechanism, video frame, vidf4

2407.15047

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Guda, Bhanu Prakash Reddy, Kulkarni, Tanmay, Sampath, Adithya, Sathyendra, Swarnashree Mysore

Causal Understanding For Video Question Answering

Video Question Answering is a challenging task, which requires the model to reason over multiple frames and understand the interaction between different objects to answer questions based on the context provided within the video, especially in datasets like NExT-QA (Xiao et al., 2021a) which emphasize causal and temporal questions. Previous approaches leverage either sub-sampled information or causal intervention techniques along with complete video features to tackle the NExT-QA task. In this work we elicit the limitations of these approaches and propose solutions along four novel directions of improvements on the NExT-QA dataset. Our approaches attempt to compensate for the shortcomings of previous work by systematically attacking each of these problems by smartly sampling frames, explicitly encoding actions and creating interventions that challenge the understanding of the model. Overall, for both single-frame (+6.3%) and complete-video (+1.1%) based approaches, we obtain state-of-the-art results on the NExT-QA dataset.

intervention, representation, video

2407.20257

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.63)

Park, Kyu Ri, Lee, Hong Joo, Kim, Jung Uk

Learning Trimodal Relation for Audio-Visual Question Answering with Missing Modality

Recent Audio-Visual Question Answering (AVQA) methods rely on complete visual and audio input to answer questions accurately. However, in real-world scenarios, issues such as device malfunctions and data transmission errors frequently result in missing audio or visual modality. In such cases, existing AVQA methods suffer significant performance degradation. In this paper, we propose a framework that ensures robust AVQA performance even when a modality is missing. First, we propose a Relation-aware Missing Modal (RMM) generator with Relation-aware Missing Modal Recalling (RMMR) loss to enhance the ability of the generator to recall missing modal information by understanding the relationships and context among the available modalities. Second, we design an Audio-Visual Relation-aware (AVR) diffusion model with Audio-Visual Enhancing (AVE) loss to further enhance audio-visual features by leveraging the relationships and shared cues between the audio-visual modalities. As a result, our method can provide accurate answers by effectively utilizing available information even when input modalities are missing. We believe our method holds potential applications not only in AVQA research but also in various multi-modal scenarios.

diffusion model, information, modality

2407.16171

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > South Korea (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.71)

Ta, Pralaypati, Gupta, Bhumika, Jain, Arihant, C, Sneha Sree, Ram, Keerthi, Sivaprakasam, Mohanasankar

Knowledge Models for Cancer Clinical Practice Guidelines: Construction, Management and Usage in Question Answering

An automated knowledge modeling algorithm for Cancer Clinical Practice Guidelines (CPGs) extracts the knowledge contained in the CPG documents and transforms it into a programmatically interactable, easy-to-update structured model with minimal human intervention. The existing automated algorithms have minimal scope and cannot handle the varying complexity of the knowledge content in the CPGs for different cancer types. This work proposes an improved automated knowledge modeling algorithm to create knowledge models from the National Comprehensive Cancer Network (NCCN) CPGs in Oncology for different cancer types. The proposed algorithm has been evaluated with NCCN CPGs for four different cancer types. We also proposed an algorithm to compare the knowledge models for different versions of a guideline to discover the specific changes introduced in the treatment protocol of a new version. We created a question-answering (Q&A) framework with the guideline knowledge models as the augmented knowledge base to study our ability to query the knowledge models. We compiled a set of 32 question-answer pairs derived from two reliable data sources for the treatment of Non-Small Cell Lung Cancer (NSCLC) to evaluate the Q&A framework. The framework was evaluated against the question-answer pairs from one data source, and it can generate the answers with 54.5% accuracy from the treatment algorithm and 81.8% accuracy from the discussion part of the NCCN NSCLC guideline knowledge model.

algorithm, guideline, knowledge model

2407.21053

Country:

Asia > India (0.14)
North America > United States > Florida > Orange County > Orlando (0.04)

Genre:

Instructional Material (0.61)
Research Report > Experimental Study (0.34)

Industry: Health & Medicine > Therapeutic Area > Oncology > Lung Cancer (0.92)

Technology:

Information Technology > Knowledge Management > Knowledge Engineering (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)