AITopics

2407.16931

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
Europe > United Kingdom > England (0.04)
Europe > France (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study > Negative Result (0.40)

Industry:

Education (0.67)
Materials > Chemicals (0.46)
Information Technology (0.46)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

arXiv.org Artificial Intelligence, Jul-22-2024

KaPQA: Knowledge-Augmented Product Question-Answering

Eppalapally, Swetha, Dangi, Daksh, Bhat, Chaithra, Gupta, Ankita, Zhang, Ruiyi, Agarwal, Shubham, Bagga, Karishma, Yoon, Seunghyun, Lipka, Nedim, Rossi, Ryan A., Dernoncourt, Franck

Question-answering for domain-specific applications has recently attracted much interest due to the latest advancements in large language models (LLMs). However, accurately assessing the performance of these applications remains a challenge, mainly due to the lack of suitable benchmarks that effectively simulate real-world scenarios. To address this challenge, we introduce two product question-answering (QA) datasets focused on Adobe Acrobat and Photoshop products to help evaluate the performance of existing models on domain-specific product QA tasks. Additionally, we propose a novel knowledge-driven RAG-QA framework to enhance the performance of the models in the product QA task. Our experiments demonstrated that inducing domain knowledge through query reformulation allowed for increased retrieval and generative performance when compared to standard RAG-QA methods. This improvement, however, is slight, and thus illustrates the challenge posed by the datasets introduced.

dataset, knowledge augmented method, query

2407.16073

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
North America > Dominican Republic (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > Italy > Tuscany > Florence (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Information Technology (0.46)
Banking & Finance (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Kleinle, Steffen, Prange, Jakob, Friedrich, Annemarie

OMoS-QA: A Dataset for Cross-Lingual Extractive Question Answering in a German Migration Context

arXiv.org Artificial Intelligence, Jul-22-2024

When immigrating to a new country, it is easy to feel overwhelmed by the need to obtain information on financial support, housing, schooling, language courses, and other issues. If relocation is rushed or even forced, the necessity for high-quality answers to such questions is all the more urgent. Official immigration counselors are usually overbooked, and online systems could guide newcomers to the requested information or a suitable counseling service. To this end, we present OMoS-QA, a dataset of German and English questions paired with relevant trustworthy documents and manually annotated answers, specifically tailored to this scenario. Questions are automatically generated with an open-source large language model (LLM) and answer sentences are selected by crowd workers with high agreement. With our data, we conduct a comparison of 5 pretrained LLMs on the task of extractive question answering (QA) in German and English. Across all models and both languages, we find high precision and low-to-mid recall in selecting answer sentences, which is a favorable trade-off to avoid misleading users. This performance even holds up when the question language does not match the document language. When it comes to identifying unanswerable questions given a context, there are larger differences between the two languages.

computational linguistic, dataset, proceedings

2407.15736

Country:

Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.05)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > Dominican Republic (0.04)

Genre: Research Report (1.00)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Government > Regional Government (1.00)
Government > Immigration & Customs (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Jul-19-2024

Advancing Chart Question Answering with Robust Chart Component Recognition

Zheng, Hanwen, Wang, Sijia, Thomas, Chris, Huang, Lifu

Chart comprehension presents significant challenges for machine learning models due to the diverse and intricate shapes of charts. Existing multimodal methods often overlook these visual features or fail to integrate them effectively for chart question answering (ChartQA). To address this, we introduce Chartformer, a unified framework that enhances chart component recognition by accurately identifying and classifying components such as bars, lines, pies, titles, legends, and axes. Additionally, we propose a novel Question-guided Deformable Co-Attention (QDCAt) mechanism, which fuses chart features encoded by Chartformer with the given question, leveraging the question's guidance to ground the correct answer. Extensive experiments demonstrate that the proposed approaches significantly outperform baseline models in chart component recognition and ChartQA tasks, achieving improvements of 3.2% in mAP and 15.4% in accuracy, respectively. These results underscore the robustness of our solution for detailed visual data interpretation across various applications.

arxiv, chart component recognition, qdc hart

2407.21038

Country:

North America > United States > Virginia (0.05)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Visualization (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

Singh, Abhishek Kumar, Murthy, Rudra, Kumar, Vishwajeet, Sen, Jaydeep, Ramakrishnan, Ganesh

INDIC QA BENCHMARK: A Multilingual Benchmark to Evaluate Question Answering capability of LLMs for Indic Languages

arXiv.org Artificial Intelligence, Jul-18-2024

Large Language Models (LLMs) have demonstrated remarkable zero-shot and few-shot capabilities in unseen tasks, including context-grounded question answering (QA) in English. However, the evaluation of LLMs' capabilities in non-English languages for context-based QA is limited by the scarcity of benchmarks in non-English languages. To address this gap, we introduce Indic-QA, the largest publicly available context-grounded question-answering dataset for 11 major Indian languages from two language families. The dataset comprises both extractive and abstractive question-answering tasks and includes existing datasets as well as English QA datasets translated into Indian languages. Additionally, we generate a synthetic dataset using the Gemini model to create question-answer pairs given a passage, which is then manually verified for quality assurance. We evaluate various multilingual Large Language Models and their instruction-fine-tuned variants on the benchmark and observe that their performance is subpar, particularly for low-resource languages. We hope that the release of this dataset will stimulate further research on the question-answering abilities of LLMs for low-resource languages.

benchmark, dataset, indian language

2407.13522

Country:

Asia > India (0.05)
Europe > Italy > Tuscany > Florence (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

arXiv.org Artificial Intelligence, Jul-17-2024

Continual Learning for Temporal-Sensitive Question Answering

Yang, Wanqi, Xu, Yunqiu, Li, Yanda, Wang, Kunze, Huang, Binbin, Chen, Ling

In this study, we explore an emerging research area of Continual Learning for Temporal Sensitive Question Answering (CLTSQA). Previous research has primarily focused on Temporal Sensitive Question Answering (TSQA), often overlooking the unpredictable nature of future events. In real-world applications, it's crucial for models to continually acquire knowledge over time, rather than relying on a static, complete dataset. Our paper investigates strategies that enable models to adapt to the ever-evolving information landscape, thereby addressing the challenges inherent in CLTSQA. To support our research, we first create a novel dataset, divided into five subsets, designed specifically for various stages of continual learning. We then propose a training framework for CLTSQA that integrates temporal memory replay and temporal contrastive learning. Our experimental results highlight two significant insights: First, the CLTSQA task introduces unique challenges for existing models. Second, our proposed framework effectively navigates these challenges, resulting in improved performance.

dataset, knowledge, subset

2407.1247

Country:

Asia > India > Karnataka > Bengaluru (0.04)
Oceania > Australia (0.04)
North America > United States > California (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Government > Regional Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.83)

SHMamba: Structured Hyperbolic State Space Model for Audio-Visual Question Answering

Yang, Zhe, Li, Wenrui, Cheng, Guanghui

The Audio-Visual Question Answering (AVQA) task holds significant potential for applications. Compared to traditional unimodal approaches, the multi-modal input of AVQA makes feature extraction and fusion more challenging. Euclidean space struggles to represent multi-dimensional relationships in data effectively; in particular, it is poorly suited as an embedding space for data with a tree or hierarchical structure. Additionally, the self-attention mechanism in Transformers is effective in capturing the dynamic relationships between elements in a sequence. However, its limitations in window modeling and its quadratic computational complexity reduce its effectiveness in modeling long sequences. To address these limitations, we propose SHMamba: Structured Hyperbolic State Space Model to integrate the advantages of hyperbolic geometry and state space models. Specifically, SHMamba leverages the intrinsic properties of hyperbolic space to represent hierarchical structures and complex relationships in audio-visual data. Meanwhile, the state space model captures dynamic changes over time by globally modeling the entire sequence. Furthermore, we introduce an adaptive curvature hyperbolic alignment module and a cross fusion block to enhance the understanding of hierarchical structures and the dynamic exchange of cross-modal information, respectively. Extensive experiments demonstrate that SHMamba outperforms previous methods with fewer parameters and lower computational costs. Our learnable parameters are reduced by 78.12%, while the average performance improves by 2.53%. Experiments show that our method is superior to all current major methods and is more suitable for practical application scenarios.

hyperbolic space, proceedings, shmamba

2406.09833

Country:

Asia > China > Heilongjiang Province > Harbin (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > China > Sichuan Province > Chengdu (0.04)
Asia > China > Jilin Province (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Media (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Wen, Haoyang, Zhuang, Honglei, Zamani, Hamed, Hauptmann, Alexander, Bendersky, Michael

Multimodal Reranking for Knowledge-Intensive Visual Question Answering

Knowledge-intensive visual question answering requires models to effectively use external knowledge to help answer visual questions. A typical pipeline includes a knowledge retriever and an answer generator. However, a retriever that utilizes local information, such as an image patch, may not provide reliable question-candidate relevance scores. Moreover, the two-tower architecture limits the retriever's relevance score modeling when selecting top candidates for answer generator reasoning. In this paper, we introduce an additional module, a multi-modal reranker, to improve the ranking quality of knowledge candidates for answer generation. Our reranking module takes multi-modal information from both candidates and questions and performs cross-item interaction for better relevance score modeling. Experiments on OK-VQA and A-OKVQA show that a multi-modal reranker from distant supervision provides consistent improvements. We also find a training-testing discrepancy with reranking in answer generation, where performance improves if training knowledge candidates are similar to or noisier than those used in testing.

knowledge candidate, proceedings, representation

2407.12277

Country:

North America > United States > Washington > King County > Seattle (0.14)
Asia > Taiwan > Taiwan Province > Taipei (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)

QOG: Question and Options Generation based on Language Model

Zhou, Jincheng

Question-Options Generation (QOG) is a task that involves generating a set of question-options pairs given context. This task has various applications, including fine-tuning large models, information retrieval, and automated multiple-choice question generation for education. In this paper, we develop QOG models using three different methods based on fine-tuning sequence-to-sequence language models (LMs). Experiments demonstrate that the end-to-end QOG model is computationally efficient and stable during both training and inference, outperforming other methods. Furthermore, our analysis indicates that our QOG models are competitive on the QOG task compared to the large language model Llama 3-8B.

arxiv preprint arxiv, language model, qog model

2406.12381

Country:

Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Asia > China (0.04)

Genre: Research Report (0.82)

Industry: Education (0.89)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)

Sachdeva, Rachneet, Song, Yixiao, Iyyer, Mohit, Gurevych, Iryna

Fine-grained Hallucination Detection and Mitigation in Long-form Question Answering

Long-form question answering (LFQA) aims to provide thorough and in-depth answers to complex questions, enhancing comprehension. However, such detailed responses are prone to hallucinations and factual inconsistencies, challenging their faithful evaluation. This work introduces HaluQuestQA, the first hallucination dataset with localized error annotations for human-written and model-generated LFQA answers. HaluQuestQA comprises 698 QA pairs with 4.7k span-level error annotations for five different error types by expert annotators, along with preference judgments. Using our collected data, we thoroughly analyze the shortcomings of long-form answers and find that they lack comprehensiveness and provide unhelpful references. We train an automatic feedback model on this dataset that predicts error spans with incomplete information and provides associated explanations. Finally, we propose a prompt-based approach, Error-informed refinement, that uses signals from the learned feedback model to refine generated answers, which we show reduces hallucination and improves answer quality. Furthermore, humans find answers generated by our approach comprehensive and highly prefer them (84%) over the baseline answers.

dataset, evaluation, feedback model

2407.1193

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Asia > Singapore (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)

Genre: Research Report (0.82)

Industry:

Education (0.67)
Health & Medicine > Therapeutic Area (0.46)
Health & Medicine > Consumer Health (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.90)