AITopics | Question Answering

Question Answering

"Questions are asked and answered every day. Question answering (QA) technology aims to deliver the same facility online. It goes further than the more familiar search based on keywords (as in Google, Yahoo, and other search engines), in attempting to recognize what a question expresses and to respond with an actual answer. This simplifies things for users in two ways. First, questions do not often translate into a simple list of keywords. ...Second, QA takes responsibility for providing answers, rather than a searchable list of links to potentially relevant documents (web pages), highlighted by snippets of text that show how the query matched the documents."
– from Bonnie Webber & Nick Webb. Question Answering. In The Handbook of Computational Linguistics and Natural Language Processing. Alexander Clark, Chris Fox, Shalom Lappin (Eds.). Wiley, 2010.

Building a Rich Dataset to Empower the Persian Question Answering Systems

Yazdinejad, Mohsen, Kaedi, Marjan

arXiv.org Artificial Intelligence, Dec-28-2024

Question answering systems provide short, precise, and specific answers to questions. So far, many robust question answering systems have been developed for English, while some languages with fewer resources, like Persian, have few standard datasets. In this study, a comprehensive open-domain dataset is presented for Persian. This dataset is called NextQuAD and has 7,515 contexts, including 23,918 questions and answers. Then, a BERT-based question answering model has been applied to this dataset using two pre-trained language models, ParsBERT and XLM-RoBERTa. The results of these two models have been ensembled using mean logits. Evaluation on the development set shows 0.95 Exact Match (EM) and 0.97 F1 score. Also, to compare NextQuAD with other Persian datasets, our model trained on NextQuAD is evaluated on two other datasets, PersianQA and ParSQuAD. Comparisons show that the proposed model increased EM by 0.39 and 0.14 on PersianQA and ParSQuAD-manual respectively, while a slight EM decline of 0.007 occurred on ParSQuAD-automatic.

machine learning, natural language, question answering, (17 more...)

arXiv.org Artificial Intelligence

2412.20212

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
(4 more...)

Genre: Research Report > New Finding (0.49)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Pre-training, Fine-tuning and Re-ranking: A Three-Stage Framework for Legal Question Answering

Ni, Shiwen, Cheng, Hao, Yang, Min

arXiv.org Artificial Intelligence, Dec-27-2024

Legal question answering (QA), which aims to retrieve the most applicable answers from a large-scale database of question-answer pairs, has attracted increasing attention from people seeking legal advice. Previous methods mainly use a dual-encoder architecture to learn dense representations of both questions and answers. However, these methods can suffer from a lack of domain knowledge and of sufficient labeled training data. In this paper, we propose a three-stage (pre-training, fine-tuning and re-ranking) framework for legal QA (called PFR-LQA), which promotes fine-grained text representation learning and boosts the performance of dense retrieval with the dual-encoder architecture. Concretely, we first conduct domain-specific pre-training on legal questions and answers through a self-supervised training objective, allowing the pre-trained model to be adapted to the legal domain. Then, we perform task-specific fine-tuning of the dual-encoder on legal question-answer pairs by using the supervised learning objective, leading to a high-quality dual-encoder for the specific downstream QA task. Finally, we employ a contextual re-ranking objective to further refine the output representations of questions produced by the document encoder, which uses contextual similarity to increase the discrepancy between the anchor and hard negative samples for better question re-ranking. We conduct extensive experiments on a manually annotated legal QA dataset. Experimental results show that our PFR-LQA method achieves better performance than the strong competitors for legal question answering.
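As a rough illustration of dual-encoder dense retrieval, the sketch below scores questions against answers by dot product and computes an in-batch contrastive loss. The abstract does not state the exact supervised objective, so the in-batch negative loss here is an assumed, commonly used stand-in rather than the PFR-LQA objective; the vectors stand in for encoder outputs and all names are hypothetical.

```python
import math
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot-product similarity between two embedding vectors."""
    return sum(a * b for a, b in zip(u, v))


def in_batch_contrastive_loss(
    q_vecs: List[Sequence[float]],
    a_vecs: List[Sequence[float]],
    temperature: float = 1.0,
) -> float:
    """Mean cross-entropy where answer i is the positive for question i
    and every other answer in the batch acts as a negative."""
    total = 0.0
    for i, q in enumerate(q_vecs):
        scores = [dot(q, a) / temperature for a in a_vecs]
        top = max(scores)  # subtract the max for a numerically stable log-sum-exp
        log_z = top + math.log(sum(math.exp(s - top) for s in scores))
        total += log_z - scores[i]
    return total / len(q_vecs)


def rank_answers(q: Sequence[float], a_vecs: List[Sequence[float]]) -> List[int]:
    """Return answer indices sorted from most to least similar to the question."""
    return sorted(range(len(a_vecs)), key=lambda j: dot(q, a_vecs[j]), reverse=True)


# Toy embeddings standing in for encoder outputs.
questions = [[1.0, 0.0], [0.0, 1.0]]
answers = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = in_batch_contrastive_loss(questions, answers)
shuffled_loss = in_batch_contrastive_loss(questions, answers[::-1])
print(aligned_loss, shuffled_loss, rank_answers(questions[0], answers))
```

The loss is lower when each question sits closest to its own answer, which is the property that makes the trained encoders usable for nearest-neighbor retrieval.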

machine learning, natural language, question answering, (19 more...)

arXiv.org Artificial Intelligence

2412.19482

Country: Asia > China > Guangdong Province (0.15)

Genre: Research Report > New Finding (0.34)

Industry: Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.95)

Fine-grained and Explainable Factuality Evaluation for Multimodal Summarization

Zhang, Yue, Zuo, Jingxuan, Jing, Liqiang

arXiv.org Artificial Intelligence, Dec-27-2024

Multimodal summarization aims to generate a concise summary based on the input text and image. However, existing methods may produce non-factual output. To evaluate the factuality of multimodal summarization models, we propose two fine-grained and explainable evaluation frameworks (FALLACIOUS) for different application scenarios, i.e., a reference-based factuality evaluation framework and a reference-free factuality evaluation framework. Notably, the reference-free factuality evaluation framework doesn't need ground truth and hence has a wider range of application scenarios. To evaluate the effectiveness of the proposed frameworks, we compute the correlation between our frameworks and the other metrics. The experimental results show the effectiveness of our proposed method. We will release our code and dataset via GitHub.
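The abstract does not say which correlation coefficient is used. As one possibility, the sketch below computes the Pearson correlation between two lists of per-summary scores; the function name and the sample scores are assumptions for illustration only.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores: a new factuality metric versus an existing one.
new_metric = [0.2, 0.5, 0.9, 0.4]
other_metric = [0.1, 0.6, 0.8, 0.5]
print(pearson(new_metric, other_metric))
```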

computational linguistic, machine learning, question answering, (15 more...)

arXiv.org Artificial Intelligence

2402.11414

Country:

Europe (1.00)
Asia (0.69)
North America > United States > Hawaii (0.14)

Genre: Research Report (0.70)

Industry: Government > Military (0.32)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.50)

AskChart: Universal Chart Understanding through Textual Enhancement

Yang, Xudong, Wu, Yifan, Zhu, Yizhang, Tang, Nan, Luo, Yuyu

arXiv.org Artificial Intelligence, Dec-26-2024

Chart understanding tasks such as ChartQA and Chart-to-Text involve automatically extracting and interpreting key information from charts, enabling users to query or convert visual data into structured formats. State-of-the-art approaches primarily focus on visual cues from chart images, failing to explicitly incorporate rich textual information (e.g., data labels and axis labels) embedded within the charts. This textual information is vital for intuitive human comprehension and interpretation of charts. Moreover, existing models are often large and computationally intensive, limiting their practical applicability. In this paper, we introduce AskChart, a universal model that explicitly integrates both textual and visual cues from charts using a Mixture of Experts (MoE) architecture. AskChart facilitates the learning of enhanced visual-textual representations of charts for effectively handling multiple chart understanding tasks, while maintaining a smaller model size. To capture the synergy between visual and textual modalities, we curate a large-scale dataset named ChartBank with about 7.5M data samples, which helps align textual and visual information and facilitates the extraction of visual entities and text. To effectively train AskChart, we design a three-stage training strategy to align visual and textual modalities for learning robust visual-textual representations and optimizing the learning of the MoE layer. Extensive experiments across five datasets demonstrate the significant performance gains of AskChart in four chart understanding tasks. Remarkably, AskChart with 4.6B parameters outperforms state-of-the-art models with 13B parameters by 68.3% in Open-ended ChartQA and 49.2% in Chart-to-Text tasks, while achieving comparable performance in ChartQA and Chart-to-Table tasks.
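For readers unfamiliar with Mixture of Experts layers, the following is a generic top-k routing sketch, not AskChart's actual layer. It assumes the router's gate logits are already computed (a real layer would derive them from the input with learned weights), and the experts are toy functions chosen only to make the arithmetic easy to follow.

```python
import math
from typing import Callable, List, Sequence

Expert = Callable[[Sequence[float]], List[float]]


def softmax(xs: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    top = max(xs)
    exps = [math.exp(x - top) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(
    x: Sequence[float],
    gate_logits: Sequence[float],
    experts: Sequence[Expert],
    k: int = 2,
) -> List[float]:
    """Route x to the k experts with the highest gate logits and
    return the gate-weighted sum of their outputs."""
    chosen = sorted(range(len(experts)), key=lambda i: gate_logits[i], reverse=True)[:k]
    weights = softmax([gate_logits[i] for i in chosen])
    out = [0.0] * len(x)
    for weight, idx in zip(weights, chosen):
        expert_out = experts[idx](x)
        out = [o + weight * y for o, y in zip(out, expert_out)]
    return out


# Three toy experts: double, triple, and zero out the input.
toy_experts: List[Expert] = [
    lambda v: [2.0 * t for t in v],
    lambda v: [3.0 * t for t in v],
    lambda v: [0.0 for _ in v],
]
routed = moe_forward([1.0, 2.0], gate_logits=[0.0, 0.0, -10.0], experts=toy_experts, k=2)
print(routed)
```

Because only k experts run per input, the layer can hold many parameters while keeping per-token compute small, which is the usual motivation for this design.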

askchart, dataset, information, (13 more...)

arXiv.org Artificial Intelligence

2412.19146

Country:

North America > United States (0.28)
Asia > Myanmar (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
(16 more...)

Genre: Research Report > Promising Solution (0.68)

Industry:

Government (0.46)
Banking & Finance (0.46)

Technology:

Information Technology > Visualization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
(2 more...)

Unlocking the Potential of Multiple BERT Models for Bangla Question Answering in NCTB Textbooks

Khondoker, Abdullah, Taufik, Enam Ahmed, Tashik, Md Iftekhar Islam, Mahmud, S M Ishtiak, Parsa, Antara Firoz

arXiv.org Artificial Intelligence, Dec-24-2024

Evaluating text comprehension in educational settings is critical for understanding student performance and improving curricular effectiveness. This study investigates the capability of state-of-the-art language models (RoBERTa Base, Bangla-BERT, and BERT Base) in automatically assessing Bangla passage-based question-answering from the National Curriculum and Textbook Board (NCTB) textbooks for classes 6-10. A dataset of approximately 3,000 Bangla passage-based question-answering instances was compiled, and the models were evaluated using F1 Score and Exact Match (EM) metrics across various hyperparameter configurations. Our findings revealed that Bangla-BERT consistently outperformed the other models, achieving the highest F1 (0.75) and EM (0.53) scores, particularly with smaller batch sizes, the inclusion of stop words, and a moderate learning rate. In contrast, RoBERTa Base demonstrated the weakest performance, with the lowest F1 (0.19) and EM (0.27) scores under certain configurations. The results underscore the importance of fine-tuning hyperparameters for optimizing model performance and highlight the potential of machine learning models in evaluating text comprehension in educational contexts. However, limitations such as dataset size, spelling inconsistencies, and computational constraints emphasize the need for further research to enhance the robustness and applicability of these models. This study lays the groundwork for the future development of automated evaluation systems in educational institutions, providing critical insights into model performance in the context of Bangla text comprehension.
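The two metrics named in the abstract are typically computed per question and then averaged. The sketch below shows a simplified token-level version; it assumes whitespace tokenization and lowercase normalization only, which may differ from the normalization used in the study, and the helper names are illustrative.

```python
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase and collapse runs of whitespace (a simplifying assumption)."""
    return " ".join(text.lower().split())


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token precision and recall between prediction and gold."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("Red  car", "red car"))
print(token_f1("the red car", "red car"))
```

F1 gives partial credit for overlapping tokens, which is why it is normally at least as high as EM when both are averaged over the same predictions.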

machine learning, natural language, question answering, (21 more...)

arXiv.org Artificial Intelligence

2412.1844

Country: North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Education > Educational Setting (0.54)
Education > Assessment & Standards > Student Performance (0.35)
Education > Curriculum (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Knowledge Graphs are all you need: Leveraging KGs in Physics Question Answering

Addala, Krishnasai, Baghel, Kabir Dev Paul, Jain, Dhruv, Kirtani, Chhavi, Anand, Avinash, Shah, Rajiv Ratn

arXiv.org Artificial Intelligence, Dec-23-2024

This study explores the effectiveness of using knowledge graphs generated by large language models to decompose high school-level physics questions into sub-questions. We introduce a pipeline aimed at enhancing model response quality for Question Answering tasks. We employ LLMs to construct knowledge graphs that capture the internal logic of the questions; these graphs then guide the generation of sub-questions. We hypothesize that this method yields sub-questions that are more logically consistent with the original questions compared to traditional decomposition techniques. Our results show that sub-questions derived from knowledge graphs exhibit significantly improved fidelity to the original question's logic. This approach not only enhances the learning experience by providing clearer and more contextually appropriate sub-questions but also highlights the potential of LLMs to transform educational methodologies. The findings indicate a promising direction for applying AI to improve the quality and effectiveness of educational content.

large language model, machine learning, question answering, (13 more...)

arXiv.org Artificial Intelligence

2412.05453

Country:

Asia > India > NCT > Delhi (0.05)
Asia > India > Uttar Pradesh (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Education > Educational Setting > K-12 Education > Secondary School (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)

From Models to Microtheories: Distilling a Model's Topical Knowledge for Grounded Question Answering

Weir, Nathaniel, Mishra, Bhavana Dalvi, Weller, Orion, Tafjord, Oyvind, Hornstein, Sam, Sabol, Alexander, Jansen, Peter, Van Durme, Benjamin, Clark, Peter

arXiv.org Artificial Intelligence, Dec-23-2024

Recent reasoning methods (e.g., chain-of-thought, entailment reasoning) help users understand how language models (LMs) answer a single question, but they do little to reveal the LM's overall understanding, or "theory," about the question's topic, making it still hard to trust the model. Our goal is to materialize such theories - here called microtheories (a linguistic analog of logical microtheories) - as a set of sentences encapsulating an LM's core knowledge about a topic. These statements systematically work together to entail answers to a set of questions to both engender trust and improve performance. Our approach is to first populate a knowledge store with (model-generated) sentences that entail answers to training questions and then distill those down to a core microtheory that is concise, general, and non-redundant. We show that, when added to a general corpus (e.g., Wikipedia), microtheories can supply critical, topical information not necessarily present in the corpus, improving both a model's ability to ground its answers to verifiable knowledge (i.e., show how answers are systematically entailed by documents in the corpus, fully grounding up to +8% more answers), and the accuracy of those grounded answers (up to +8% absolute). We also show that, in a human evaluation in the medical domain, our distilled microtheories contain a significantly higher concentration of topically critical facts than the non-distilled knowledge store. Finally, we show we can quantify the coverage of a microtheory for a topic (characterized by a dataset) using a notion of p-relevance. Together, these suggest that microtheories are an efficient distillation of an LM's topic-relevant knowledge, that they can usefully augment existing corpora, and can provide both performance gains and an interpretable, verifiable window into the model's knowledge of a topic.

large language model, machine learning, question answering, (19 more...)

arXiv.org Artificial Intelligence

2412.17701

Country:

Asia (1.00)
Europe (0.92)
North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Nephrology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Gastroenterology (1.00)
(10 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.47)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.41)

DragonVerseQA: Open-Domain Long-Form Context-Aware Question-Answering

Lahiri, Aritra Kumar, Hu, Qinmin Vivian

arXiv.org Artificial Intelligence, Dec-21-2024

This paper proposes a novel approach to develop an open-domain and long-form Over-The-Top (OTT) Question-Answering (QA) dataset, DragonVerseQA, specifically oriented to the fantasy universe of the "House of the Dragon" and "Game of Thrones" TV series. Most existing QA datasets focus on short, fact-based answers sourced almost solely from Wikipedia articles, devoid of depth and contextual richness for sophisticated narrative understanding. We curate a dataset that combines full episode summaries sourced from HBO and fandom wiki websites, user reviews from sources like IMDb and Rotten Tomatoes, high-quality, open-domain, legally admissible sources, and structured data from repositories like WikiData into one dataset. The dataset provides a multi-dimensional context, reflecting complex character dynamics and plot developments from these varied sources. Meaningful, non-spam, unbiased reviews enter this enriched dataset only after heavy data preprocessing and filtering. Long-form answers generated from this enriched context provide comprehensive insights. This makes the dataset valuable for improving conversational AI, narrative analysis, sentiment analysis, summarization techniques, and relation extraction. A comparative analysis with state-of-the-art QA datasets such as SQuAD 2.0, TriviaQA, and Natural Questions brings to light the unique advantages of our dataset in terms of contextual complexity and answer length. Detailed reviews add layers to audience sentiment and narrative interpretation, raising the bar for domain-specific QA with a new quality benchmark. Our work also allows a deeper understanding of entertainment-industry content and opens the door to more knowledgeable and creative AI-driven interactions within digital media environments.

large language model, machine learning, question answering, (20 more...)

arXiv.org Artificial Intelligence

2412.16694

Country:

North America > Canada > Ontario > Toronto (0.05)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
(2 more...)

Genre:

Overview (0.88)
Research Report (0.70)

Industry:

Media > Television (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.95)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
(2 more...)

Domain-specific Question Answering with Hybrid Search

Sultania, Dewang, Lu, Zhaoyu, Naik, Twisha, Dernoncourt, Franck, Yoon, David Seunghyun, Sharma, Sanat, Bui, Trung, Gupta, Ashok, Vatsa, Tushar, Suresha, Suhas, Verma, Ishita, Belavadi, Vibha, Chen, Cheng, Friedrich, Michael

arXiv.org Artificial Intelligence, Dec-21-2024

With the increasing adoption of Large Language Models (LLMs) in enterprise settings, ensuring accurate and reliable question-answering systems remains a critical challenge. Building upon our previous work on domain-specific question answering about Adobe products (Sharma et al. 2024), which established a retrieval-aware framework with self-supervised training, we now present a production-ready, generalizable architecture alongside a comprehensive evaluation methodology. Our core contribution is a flexible, scalable framework built on Elasticsearch that can be adapted for any LLM-based question-answering system. This framework seamlessly integrates hybrid retrieval mechanisms, combining dense and sparse search with boost matching, ...

A production-ready, generalizable framework for LLM-based QA systems built on Elasticsearch. A flexible hybrid retrieval mechanism combining dense and sparse search methods. A comprehensive evaluation framework for assessing QA system performance. Empirical analysis demonstrating the effectiveness of our approach across various metrics. Through this work, we provide not only theoretical insights but also a practical, deployable solution for building reliable domain-specific question-answering systems that can be adapted to various enterprise needs.
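One common way to combine dense and sparse result lists is reciprocal rank fusion (RRF). The excerpt does not say how the paper fuses its results, so RRF is used here purely as a generic stand-in; the document IDs are made up, and the constant `k=60` is the value conventionally used with RRF rather than anything stated in the source.

```python
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken by document ID for determinism.
    """
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword (sparse) and an embedding (dense) search.
sparse_hits = ["d1", "d2", "d3"]
dense_hits = ["d3", "d1", "d4"]
fused = reciprocal_rank_fusion([sparse_hits, dense_hits])
print(fused)
```

Rank-based fusion avoids having to put sparse and dense scores on a common scale, at the cost of ignoring how large the score gaps between neighbors are.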

large language model, machine learning, question answering, (22 more...)

arXiv.org Artificial Intelligence

2412.03736

Country:

North America > United States > California > Santa Clara County > San Jose (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)

Query pipeline optimization for cancer patient question answering systems

He, Maolin, Gao, Rena, Conway, Mike, Chapman, Brian E.

arXiv.org Artificial Intelligence, Dec-19-2024

Retrieval-augmented generation (RAG) mitigates hallucination in Large Language Models (LLMs) by using query pipelines to retrieve relevant external information and grounding responses in retrieved knowledge. However, query pipeline optimization for cancer patient question-answering (CPQA) systems requires separately optimizing multiple components with domain-specific considerations. We propose a novel three-aspect optimization approach for the RAG query pipeline in CPQA systems, utilizing public biomedical databases like PubMed and PubMed Central. Our optimization includes: (1) document retrieval, utilizing a comparative analysis of NCBI resources and introducing Hybrid Semantic Real-time Document Retrieval (HSRDR); (2) passage retrieval, identifying optimal pairings of dense retrievers and rerankers; and (3) semantic representation, introducing Semantic Enhanced Overlap Segmentation (SEOS) for improved contextual understanding. On a custom-developed dataset tailored for cancer-related inquiries, our optimized RAG approach improved the answer accuracy of Claude-3-haiku by 5.24% over chain-of-thought prompting and about 3% over a naive RAG setup. This study highlights the importance of domain-specific query optimization in realizing the full potential of RAG and provides a robust framework for building more accurate and reliable CPQA systems, advancing the development of RAG-based biomedical systems.
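The abstract does not describe how Semantic Enhanced Overlap Segmentation works internally. The sketch below shows only the plain overlapping-window chunking that such methods generally build on; it is not SEOS itself, and the function name and parameters are assumptions.

```python
from typing import List, Sequence


def overlapping_chunks(tokens: Sequence[str], size: int, overlap: int) -> List[List[str]]:
    """Split tokens into windows of `size` tokens, where consecutive
    windows share `overlap` tokens. The final window may be shorter."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + size]))
        if start + size >= len(tokens):
            break  # the window already reaches the end of the text
    return chunks


passage = "a b c d e f g".split()
chunks = overlapping_chunks(passage, size=4, overlap=2)
print(chunks)
```

Overlap keeps a sentence that straddles a boundary fully visible in at least one chunk, at the price of indexing some tokens twice.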

large language model, machine learning, question answering, (19 more...)

arXiv.org Artificial Intelligence

2412.14751

Country:

Oceania > Australia (0.05)
Asia > Thailand > Bangkok > Bangkok (0.04)
Asia > Middle East > UAE (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)
