AITopics | question-answering

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

A Benchmark Dataset with Larger Context for Non-Factoid Question Answering over Islamic Text

Qamar, Faiza, Latif, Seemab, Latif, Rabia

arXiv.org Artificial Intelligence, Sep-15-2024

Accessing and comprehending religious texts, particularly the Quran (the sacred scripture of Islam) and Ahadith (the corpus of the sayings or traditions of the Prophet Muhammad), in today's digital era necessitates efficient and accurate Question-Answering (QA) systems. Yet, the scarcity of QA systems tailored specifically to the detailed nature of inquiries about the Quranic Tafsir (explanation, interpretation, context of Quran for clarity) and Ahadith poses significant challenges. To address this gap, we introduce a comprehensive dataset meticulously crafted for QA purposes within the domain of Quranic Tafsir and Ahadith. This dataset comprises a robust collection of over 73,000 question-answer pairs, standing as the largest reported dataset in this specialized domain. Importantly, both questions and answers within the dataset are meticulously enriched with contextual information, serving as invaluable resources for training and evaluating tailored QA systems. However, while this paper highlights the dataset's contributions and establishes a benchmark for evaluating QA performance in the Quran and Ahadith domains, our subsequent human evaluation uncovered critical insights regarding the limitations of existing automatic evaluation techniques. The discrepancy between automatic evaluation metrics, such as ROUGE scores, and human assessments became apparent. The human evaluation indicated significant disparities: the model's verdict consistency with expert scholars ranged from 11% to 20%, while its contextual understanding spanned a broader range of 50% to 90%. These findings underscore the necessity for evaluation techniques that capture the nuances and complexities inherent in understanding religious texts, surpassing the limitations of traditional automatic metrics.
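The gap between ROUGE and human judgement described above can be illustrated with a minimal sketch. The function below is a simplified ROUGE-1 F1 (lowercased whitespace tokens, no stemming), not the scorer used in the paper, and both strings are hypothetical examples invented for illustration.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1: a simplified ROUGE-1 (assumption: whitespace tokens)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped count of shared tokens
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical reference and model answers; the candidate negates the verdict.
reference = "the action is permitted under these conditions"
candidate = "the action is not permitted under these conditions"

score = rouge1_f1(candidate, reference)
print(round(score, 3))  # 0.933
```

The candidate reverses the verdict yet shares almost every token with the reference, so a lexical-overlap metric scores it highly. This is one plausible way such a metric can disagree with an expert's assessment.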

ahadith, dataset, evaluation, (14 more...)

arXiv.org Artificial Intelligence

2409.09844

Country:

Asia > Pakistan > Islamabad Capital Territory > Islamabad (0.04)
Asia > Middle East > Saudi Arabia > Riyadh Province > Riyadh (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Experimenting with Legal AI Solutions: The Case of Question-Answering for Access to Justice

Li, Jonathan, Bhambhoria, Rohan, Dahan, Samuel, Zhu, Xiaodan

arXiv.org Artificial Intelligence, Sep-11-2024

Generative AI models, such as the GPT and Llama series, have significant potential to assist laypeople in answering legal questions. However, little prior work focuses on the data sourcing, inference, and evaluation of these models in the context of laypersons. To this end, we propose a human-centric legal NLP pipeline, covering data sourcing, inference, and evaluation. We introduce and release a dataset, LegalQA, with real and specific legal questions spanning from employment law to criminal law, corresponding answers written by legal experts, and citations for each answer. We develop an automatic evaluation protocol for this dataset, then show that retrieval-augmented generation from only 850 citations in the train set can match or outperform internet-wide retrieval, despite containing 9 orders of magnitude less data. Finally, we propose future directions for open-source efforts, which lag behind closed-source models.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2409.07713

Country:

Europe > Austria > Vienna (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Ontario > Kingston (0.04)
(5 more...)

Genre: Research Report (0.40)

Industry: Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

RAG based Question-Answering for Contextual Response Prediction System

Veturi, Sriram, Vaichal, Saurabh, Jagadheesh, Reshma Lal, Tripto, Nafis Irtiza, Yan, Nian

arXiv.org Artificial Intelligence, Sep-6-2024

Large Language Models (LLMs) have shown versatility in various Natural Language Processing (NLP) tasks, including their potential as effective question-answering systems. However, to provide precise and relevant information in response to specific customer queries in industry settings, LLMs require access to a comprehensive knowledge base to avoid hallucinations. Retrieval Augmented Generation (RAG) emerges as a promising technique to address this challenge. Yet, developing an accurate question-answering framework for real-world applications using RAG entails several challenges: 1) data availability issues, 2) evaluating the quality of generated content, and 3) the costly nature of human evaluation. In this paper, we introduce an end-to-end framework that employs LLMs with RAG capabilities for industry use cases. Given a customer query, the proposed system retrieves relevant knowledge documents and leverages them, along with previous chat history, to generate response suggestions for customer service agents in the contact centers of a major retail company. Through comprehensive automated and human evaluations, we show that this solution outperforms the current BERT-based algorithms in accuracy and relevance. Our findings suggest that RAG-based LLMs can be an excellent support to human customer service representatives by lightening their workload.
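The retrieve-then-generate flow described above (customer query, retrieved knowledge documents, previous chat history) might be sketched as follows. This is a minimal illustration, not the paper's system: the bag-of-words cosine retriever stands in for whatever retriever the authors use, the prompt layout is an assumption, and the documents, history, and function names are hypothetical.

```python
import math
from collections import Counter


def bow(text: str) -> Counter:
    """Bag-of-words vector over lowercased whitespace tokens."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = bow(query)
    return sorted(docs, key=lambda d: cosine(q, bow(d)), reverse=True)[:k]


def build_prompt(query: str, history: list[str], docs: list[str], k: int = 1) -> str:
    """Assemble the text an LLM would receive (hypothetical layout)."""
    knowledge = "\n".join(retrieve(query, docs, k))
    past = "\n".join(history)
    return (
        f"Knowledge:\n{knowledge}\n\n"
        f"Chat history:\n{past}\n\n"
        f"Customer: {query}\nSuggested reply:"
    )


# Hypothetical knowledge base and conversation.
docs = [
    "returns are accepted within 30 days with a receipt",
    "standard shipping takes five business days",
]
history = ["Customer: hello", "Agent: hi, how can I help?"]
query = "how many days do I have for returns"

prompt = build_prompt(query, history, docs)
print(prompt)
```

The generation step itself is omitted; in a real deployment the assembled prompt would be sent to an LLM and the output shown to the agent as a suggestion.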

arxiv, evaluation, language model, (13 more...)

arXiv.org Artificial Intelligence

2409.03708

Country:

North America > United States > Idaho > Ada County > Boise (0.05)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
North America > United States > Georgia > Cobb County (0.04)
(6 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Retail (0.67)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Long-Span Question-Answering: Automatic Question Generation and QA-System Ranking via Side-by-Side Evaluation

Bohnet, Bernd, Swersky, Kevin, Liu, Rosanne, Awasthi, Pranjal, Nova, Azade, Snaider, Javier, Sedghi, Hanie, Parisi, Aaron T, Collins, Michael, Lazaridou, Angeliki, Firat, Orhan, Fiedel, Noah

arXiv.org Artificial Intelligence, May-31-2024

We explore the use of long-context capabilities in large language models to create synthetic reading comprehension data from entire books. Previous efforts to construct such datasets relied on crowd-sourcing [1], but the emergence of transformers with a context size of 1 million or more tokens [2] now enables entirely automatic approaches. Our objective is to test the capabilities of LLMs to analyze, understand, and reason over problems that require a detailed comprehension of long spans of text, such as questions involving character arcs, broader themes, or the consequences of early actions later in the story. We propose a holistic pipeline for automatic data generation including question generation, answering, and model scoring using an "Evaluator". We find that a relative approach, comparing answers between models in a pairwise fashion and ranking with a Bradley-Terry model, provides a more consistent and differentiating scoring mechanism than an absolute scorer that rates answers individually. We also show that LLMs from different model families produce moderate agreement in their ratings. We ground our approach using the manually curated NarrativeQA dataset, where our evaluator shows excellent agreement with human judgement and even finds errors in the dataset. Using our automatic evaluation approach, we show that using an entire book as context produces superior reading comprehension performance compared to baseline no-context (parametric knowledge only) and retrieval-based approaches.
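Ranking systems from pairwise comparisons with a Bradley-Terry model, as mentioned above, can be sketched as follows. This is a minimal sketch using the standard minorization-maximization update, not the authors' implementation; the win counts are hypothetical, and the code assumes every system wins at least one comparison so that the estimate is well defined.

```python
def bradley_terry(wins: list[list[int]], n_iter: int = 200) -> list[float]:
    """Estimate Bradley-Terry strengths; wins[i][j] = times system i beat system j."""
    n = len(wins)
    p = [1.0 / n] * n
    for _ in range(n_iter):
        updated = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            # Each pairing contributes (comparisons between i and j) / (p_i + p_j).
            denom = sum(
                (wins[i][j] + wins[j][i]) / (p[i] + p[j])
                for j in range(n)
                if j != i
            )
            updated.append(total_wins / denom if denom else p[i])
        scale = sum(updated)
        p = [v / scale for v in updated]  # normalize so strengths sum to 1
    return p


# Hypothetical side-by-side results for three QA systems (10 comparisons per pair).
wins = [
    [0, 8, 9],  # system 0 beat system 1 eight times, system 2 nine times
    [2, 0, 6],
    [1, 4, 0],
]
strengths = bradley_terry(wins)
ranking = sorted(range(len(strengths)), key=lambda i: strengths[i], reverse=True)
print(ranking)  # [0, 1, 2]
```

Under the model, the probability that system i beats system j is strengths[i] / (strengths[i] + strengths[j]), which is what makes the scores comparable across pairs that were never directly evaluated against each other.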

full-context vs gemini 1, full-context vs gpt 4, symbolize resilience, (13 more...)

arXiv.org Artificial Intelligence

2406.00179

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Utah (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
(14 more...)

Genre:

Research Report > Experimental Study (0.67)
Research Report > New Finding (0.45)

Industry:

Health & Medicine (0.92)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

MM-PhyRLHF: Reinforcement Learning Framework for Multimodal Physics Question-Answering

Anand, Avinash, Kapuriya, Janak, Kirtani, Chhavi, Singh, Apoorv, Saraf, Jay, Lal, Naman, Kumar, Jatin, Shivam, Adarsh Raj, Verma, Astha, Shah, Rajiv Ratn, Zimmermann, Roger

arXiv.org Artificial Intelligence, Apr-19-2024

Recent advancements in LLMs have shown their significant potential in tasks like text summarization and generation. Yet, they often encounter difficulty while solving complex physics problems that require arithmetic calculation and a good understanding of concepts. Moreover, many physics problems include images that contain important details required to understand the problem's context. We propose an LMM-based chatbot to answer multimodal physics MCQs. For domain adaptation, we utilize the MM-PhyQA dataset comprising Indian high school-level multimodal physics problems. To improve the LMM's performance, we experiment with two techniques, RLHF (Reinforcement Learning from Human Feedback) and Image Captioning. In image captioning, we add a detailed explanation of the diagram in each image, minimizing hallucinations and image processing errors. We further explore the integration of an RLHF methodology, inspired by the ranking approach in RLHF, to enhance the human-like problem-solving abilities of the models. The RLHF approach incorporates human feedback into the learning process of LLMs, improving the model's problem-solving skills, truthfulness, and reasoning capabilities, minimizing hallucinations in the answers, and improving answer quality compared with vanilla supervised fine-tuned models. We employ the LLaVA open-source model to answer multimodal physics MCQs and compare the performance with and without using RLHF.
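The abstract does not spell out its ranking objective. As a hedged illustration, the pairwise preference loss commonly used to train reward models in RLHF is sketched below; whether the authors use exactly this form is an assumption, and the reward values are hypothetical.

```python
import math


def pairwise_ranking_loss(reward_preferred: float, reward_rejected: float) -> float:
    """-log(sigmoid(r_preferred - r_rejected)), computed in a numerically stable way."""
    diff = reward_preferred - reward_rejected
    # -log(sigmoid(diff)) == softplus(-diff) == max(-diff, 0) + log1p(exp(-|diff|))
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


# Hypothetical reward-model scores for two answers to the same physics question.
agrees_with_human = pairwise_ranking_loss(2.0, 0.0)     # preferred answer scored higher
disagrees_with_human = pairwise_ranking_loss(0.0, 2.0)  # preferred answer scored lower
print(agrees_with_human < disagrees_with_human)  # True
```

The loss is small when the reward model already ranks the human-preferred answer higher and grows roughly linearly with the margin when it ranks the pair the wrong way round.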

dataset, llm, sample question and answer, (9 more...)

arXiv.org Artificial Intelligence

2404.12926

Country:

Asia > India > NCT > Delhi (0.06)
North America > United States > New York > New York County > New York City (0.04)
Asia > Singapore > Central Region > Singapore (0.04)

Genre:

Research Report (0.50)
Instructional Material (0.46)

Industry: Education > Educational Setting > K-12 Education > Secondary School (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Question-Answering Based Summarization of Electronic Health Records using Retrieval Augmented Generation

Saba, Walid, Wendelken, Suzanne, Shanahan, James

arXiv.org Artificial Intelligence, Jan-2-2024

Summarization of electronic health records (EHRs) can substantially minimize 'screen time' for both patients and medical personnel. In recent years, summarization of EHRs has employed machine learning pipelines using state-of-the-art neural models. However, these models have produced less than adequate results, which are attributed to the difficulty of obtaining sufficient annotated data for training. Moreover, the requirement to consider the entire content of an EHR in summarization has resulted in poor performance because attention mechanisms in modern large language models (LLMs) add quadratic complexity in terms of the size of the input. We propose here a method that mitigates these shortcomings by combining semantic search, retrieval augmented generation (RAG), and question-answering using the latest LLMs. In our approach, summarization is the extraction of answers to specific questions that are deemed important by subject-matter experts (SMEs). Our approach is quite efficient; requires minimal to no training; does not suffer from the 'hallucination' problem of LLMs; and ensures diversity, since the summary will not have repeated content but diverse answers to specific questions.
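The quadratic-complexity point above can be made concrete with a small arithmetic sketch. It counts the entries of a single self-attention score matrix and ignores heads, layers, and any sparse-attention optimizations; both token counts are hypothetical.

```python
def attention_score_entries(n_tokens: int) -> int:
    """Entries in one self-attention score matrix: each token attends to every token."""
    return n_tokens * n_tokens


full_record_tokens = 100_000  # hypothetical length of an entire EHR
retrieved_tokens = 2_000      # hypothetical length of passages retrieved for one question

ratio = attention_score_entries(full_record_tokens) // attention_score_entries(retrieved_tokens)
print(ratio)  # 2500
```

An input that is 50 times shorter needs 2,500 times fewer score entries, which is why answering targeted questions over retrieved passages can be far cheaper than attending over the whole record.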

electronic health record, specific question, summarization, (11 more...)

arXiv.org Artificial Intelligence

2401.01469

Country:

North America > United States > Maine > Cumberland County > Portland (0.06)
Asia > Middle East > Israel (0.05)

Genre: Research Report (0.66)

Industry: Health & Medicine > Health Care Technology > Medical Record (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.62)

Framework for Question-Answering in Sanskrit through Automated Construction of Knowledge Graphs

Terdalkar, Hrishikesh, Bhattacharya, Arnab

arXiv.org Artificial Intelligence, Oct-11-2023

Sanskrit (saṃskṛta) enjoys one of the largest and most varied literatures in the world. Extracting knowledge from it, however, is a challenging task for multiple reasons, including the complexity of the language and the paucity of standard natural language processing tools. In this paper, we target the problem of building knowledge graphs for particular types of relationships from saṃskṛta texts. We build a natural language question-answering system in saṃskṛta that uses the knowledge graph to answer factoid questions. We design a framework for the overall system and implement two separate instances of the system on human relationships from mahābhārata and rāmāyaṇa, and one instance on synonymous relationships from bhāvaprakāśa nighaṇṭu, a technical text from āyurveda. We show that about 50% of the factoid questions can be answered correctly by the system. More importantly, we analyse the shortcomings of the system in detail for each step, and discuss the possible ways forward.
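Answering a factoid question from a knowledge graph of relationships, as described above, reduces to a lookup over (subject, relation, object) triples once the question has been parsed. The sketch below shows only that lookup step; the class, the relation name `son_of`, and the handful of triples are illustrative assumptions, and the paper's automated extraction from saṃskṛta text and its question parsing are not reproduced here.

```python
from collections import defaultdict

# Illustrative human-relationship triples (names romanized without diacritics).
TRIPLES = [
    ("arjuna", "son_of", "kunti"),
    ("arjuna", "son_of", "pandu"),
    ("abhimanyu", "son_of", "arjuna"),
]


class KnowledgeGraph:
    """Tiny triple store indexed in both directions (hypothetical helper)."""

    def __init__(self, triples):
        self._forward = defaultdict(set)  # (subject, relation) -> objects
        self._inverse = defaultdict(set)  # (relation, object) -> subjects
        for subject, relation, obj in triples:
            self._forward[(subject, relation)].add(obj)
            self._inverse[(relation, obj)].add(subject)

    def objects(self, subject: str, relation: str) -> set:
        return set(self._forward.get((subject, relation), ()))

    def subjects(self, relation: str, obj: str) -> set:
        return set(self._inverse.get((relation, obj), ()))


kg = KnowledgeGraph(TRIPLES)

# "Whose son is Abhimanyu?" parsed as (abhimanyu, son_of, ?)
print(kg.objects("abhimanyu", "son_of"))        # {'arjuna'}
# "Who is a son of Arjuna?" parsed as (?, son_of, arjuna)
print(sorted(kg.subjects("son_of", "arjuna")))  # ['abhimanyu']
```

A question whose triple is missing from the graph returns an empty set, which is one simple way such a system can fail to answer a factoid question.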

automated construction, knowledge graph, question-answering, (1 more...)

arXiv.org Artificial Intelligence

2310.07848

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.80)

Question-Answering on Textbooks by Searching and Ranking

#artificialintelligence, Oct-10-2022, 15:25:14 GMT

Question Answering is a popular application of NLP. Transformer models trained on big datasets have dramatically improved the state-of-the-art results on Question Answering. The question answering task can be formulated in many ways. The most common formulation is extractive question answering over a small context. SQuAD is a popular dataset for this setting: given a passage and a question, the model selects the word(s) in the passage that represent the answer.
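In extractive question answering, a model typically scores every passage token as a possible answer start and answer end, and the answer is the best-scoring valid span. The sketch below shows only that span-selection step; the scores are hypothetical numbers standing in for a model's output, and `max_len` is an assumed cap on answer length.

```python
def best_span(start_scores: list[float], end_scores: list[float], max_len: int = 5) -> tuple[int, int]:
    """Return (start, end) maximizing start_scores[start] + end_scores[end], start <= end."""
    best = (0, 0)
    best_score = float("-inf")
    for start, s_score in enumerate(start_scores):
        last = min(start + max_len, len(end_scores))
        for end in range(start, last):
            score = s_score + end_scores[end]
            if score > best_score:
                best_score = score
                best = (start, end)
    return best


passage = "Marie Curie won the Nobel Prize in 1903".split()
# Hypothetical per-token scores for the question "What did Marie Curie win?"
start_scores = [0.2, 0.1, 0.3, 2.0, 1.1, 0.2, 0.1, 0.4]
end_scores = [0.1, 0.3, 0.2, 0.3, 0.9, 2.4, 0.2, 0.8]

start, end = best_span(start_scores, end_scores)
print(" ".join(passage[start:end + 1]))  # the Nobel Prize
```

Requiring start <= end and capping the span length keeps the search from pairing a strong start with an unrelated, distant end.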

artificial intelligence, natural language, question answering, (17 more...)

#artificialintelligence

Genre:

Summary/Review (0.40)
Instructional Material > Text Book (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)

Fine-tune the Entire RAG Architecture (including DPR retriever) for Question-Answering

Siriwardhana, Shamane, Weerasekera, Rivindu, Wen, Elliott, Nanayakkara, Suranga

arXiv.org Artificial Intelligence, Jun-21-2021

In September 2020, Facebook open-sourced a new NLP model called Retrieval Augmented Generation (RAG) in the Hugging Face Transformers library. RAG can use a set of support documents from an external knowledge base as a latent variable to generate the final output. The RAG model consists of an Input Encoder, a Neural Retriever, and an Output Generator. All three components are initialized with pre-trained transformers. However, the original Hugging Face implementation only allowed fine-tuning the Input Encoder and the Output Generator in an end-to-end manner, while the Neural Retriever needed to be trained separately. To the best of our knowledge, an end-to-end RAG implementation that trains all three components does not exist.
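Treating the retrieved documents as a latent variable means the output probability is a sum over documents: p(y|x) = Σ_z p(z|x) · p(y|x, z). The sketch below computes that marginal for one output, assuming p(z|x) is a softmax over retriever scores for the top-k documents; all numbers are hypothetical, and the real model applies this over token sequences with neural components.

```python
import math


def softmax(scores: list[float]) -> list[float]:
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def marginal_likelihood(retriever_scores: list[float], gen_probs: list[float]) -> float:
    """p(y|x) = sum over documents z of p(z|x) * p(y|x, z)."""
    doc_probs = softmax(retriever_scores)  # p(z|x) over the retrieved documents
    return sum(p_z * p_y for p_z, p_y in zip(doc_probs, gen_probs))


# Hypothetical retriever scores and generator probabilities for three documents.
retriever_scores = [2.0, 1.0, 0.0]
gen_probs = [0.9, 0.2, 0.1]  # p(y | x, z) when conditioning on each document

likelihood = marginal_likelihood(retriever_scores, gen_probs)
print(likelihood)
```

Because the retriever's scores appear inside the marginal, a loss on the final output also carries a training signal for the retriever, which is what makes training all components end to end possible in principle.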

machine learning, natural language, question answering, (16 more...)

arXiv.org Artificial Intelligence

2106.11517

Country:

Oceania > New Zealand > North Island > Auckland Region > Auckland (0.10)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.05)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.45)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.36)

Intent Classification in Question-Answering Using LSTM Architectures

Di Gennaro, Giovanni, Buonanno, Amedeo, Di Girolamo, Antonio, Ospedale, Armando, Palmieri, Francesco A. N.

arXiv.org Machine Learning, Jan-25-2020

Question-answering (QA) is certainly the best known and probably also one of the most complex problems within Natural Language Processing (NLP) and artificial intelligence (AI). Since the complete solution to the problem of finding a generic answer still seems far away, the wisest thing to do is to break down the problem by solving single simpler parts. Assuming a modular approach to the problem, we confine our research to intent classification for an answer, given a question. Through the use of an LSTM network, we show how this type of classification can be approached effectively and efficiently, and how it can be properly used within a basic prototype responder.

Keywords: Deep Learning, LSTM, Intent classification, Question-Answering

1 Introduction

Despite the remarkable results obtained in the different areas of Natural Language Processing, the solution to the Question-Answering problem, in its general sense, still seems far away [1].

information, intent classification, question-answering, (14 more...)

arXiv.org Machine Learning

2001.09330

Country:

North America > United States (0.28)
Europe > Italy > Campania (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
