AITopics | Question Answering

Collaborating Authors

Question Answering

"Questions are asked and answered every day. Question answering (QA) technology aims to deliver the same facility online. It goes further than the more familiar search based on keywords (as in Google, Yahoo, and other search engines), in attempting to recognize what a question expresses and to respond with an actual answer. This simplifies things for users in two ways. First, questions do not often translate into a simple list of keywords. ...Second, QA takes responsibility for providing answers, rather than a searchable list of links to potentially relevant documents (web pages), highlighted by snippets of text that show how the query matched the documents."
– from Bonnie Webber & Nick Webb. Question Answering. In The Handbook of Computational Linguistics and Natural Language Processing. Alexander Clark, Chris Fox, Shalom Lappin (Eds.). Wiley, 2010.

News Overviews Instructional Materials AI-Alerts Classics

Question-Answering System Extracts Information on Injection Drug Use from Clinical Notes

Mahbub, Maria, Goethert, Ian, Danciu, Ioana, Knight, Kathryn, Srinivasan, Sudarshan, Tamang, Suzanne, Rozenberg-Ben-Dror, Karine, Solares, Hugo, Martins, Susana, Trafton, Jodie, Begoli, Edmon, Peterson, Gregory

arXiv.org Artificial IntelligenceDec-28-2023

Background: Injection drug use (IDU) is a dangerous health behavior that increases mortality and morbidity. Identifying IDU early and initiating harm reduction interventions can benefit individuals at risk. However, extracting IDU behaviors from patients' electronic health records (EHR) is difficult because there is no International Classification of Disease (ICD) code and the only place IDU information can be indicated is unstructured free-text clinical notes. Although natural language processing can efficiently extract this information from unstructured data, there are no validated tools. Methods: To address this gap in clinical information, we design and demonstrate a question-answering (QA) framework to extract information on IDU from clinical notes. Our framework involves two main steps: (1) generating a gold-standard QA dataset and (2) developing and testing the QA model. We utilize 2323 clinical notes of 1145 patients sourced from the VA Corporate Data Warehouse to construct the gold-standard dataset for developing and evaluating the QA model. We also demonstrate the QA model's ability to extract IDU-related information on temporally out-of-distribution data. Results: Here we show that for a strict match between gold-standard and predicted answers, the QA model achieves 51.65% F1 score. For a relaxed match between the gold-standard and predicted answers, the QA model obtains 78.03% F1 score, along with 85.38% Precision and 79.02% Recall scores. Moreover, the QA model demonstrates consistent performance when subjected to temporally out-of-distribution data. Conclusions: Our study introduces a QA framework designed to extract IDU information from clinical notes, aiming to enhance the accurate and efficient detection of people who inject drugs, extract relevant information, and ultimately facilitate informed patient care.

clinical note, information, qa model, (15 more...)

arXiv.org Artificial Intelligence

2305.08777

Country:

North America > United States > Tennessee > Knox County > Knoxville (0.14)
North America > United States > Tennessee > Anderson County > Oak Ridge (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(10 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Addiction Disorder (1.00)
Health & Medicine > Health Care Technology > Medical Record (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

S2M: Converting Single-Turn to Multi-Turn Datasets for Conversational Question Answering

Li, Baokui, Zhang, Sen, Zhang, Wangshu, Chen, Yicheng, Yang, Changlin, Hu, Sen, Xu, Teng, liu, Siye, Li, Jiwei

arXiv.org Artificial IntelligenceDec-27-2023

Supplying data augmentation to conversational question answering (CQA) can effectively improve model performance. However, there is less improvement from single-turn datasets in CQA due to the distribution gap between single-turn and multi-turn datasets. On the other hand, while numerous single-turn datasets are available, we have not utilized them effectively. To solve this problem, we propose a novel method to convert single-turn datasets to multi-turn datasets. The proposed method consists of three parts, namely, a QA pair Generator, a QA pair Reassembler, and a question Rewriter. Given a sample consisting of context and single-turn QA pairs, the Generator obtains candidate QA pairs and a knowledge graph based on the context. The Reassembler utilizes the knowledge graph to get sequential QA pairs, and the Rewriter rewrites questions from a conversational perspective to obtain a multi-turn dataset S2M. Our experiments show that our method can synthesize effective training resources for CQA. Notably, S2M ranks 1st place on the QuAC leaderboard at the time of submission (Aug 24th, 2022).

dataset, proceedings, qa pair, (14 more...)

arXiv.org Artificial Intelligence

2312.16511

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Washington > King County > Seattle (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Hong Kong (0.04)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)

Add feedback

SocialStigmaQA: A Benchmark to Uncover Stigma Amplification in Generative Language Models

Nagireddy, Manish, Chiazor, Lamogha, Singh, Moninder, Baldini, Ioana

arXiv.org Artificial IntelligenceDec-27-2023

Current datasets for unwanted social bias auditing are limited to studying protected demographic features such as race and gender. In this work, we introduce a comprehensive benchmark that is meant to capture the amplification of social bias, via stigmas, in generative language models. Taking inspiration from social science research, we start with a documented list of 93 US-centric stigmas and curate a question-answering (QA) dataset which involves simple social situations. Our benchmark, SocialStigmaQA, contains roughly 10K prompts, with a variety of prompt styles, carefully constructed to systematically test for both social bias and model robustness. We present results for SocialStigmaQA with two open source generative language models and we find that the proportion of socially biased output ranges from 45% to 59% across a variety of decoding strategies and prompting styles. We demonstrate that the deliberate design of the templates in our benchmark (e.g., adding biasing text to the prompt or using different verbs that change the answer that indicates bias) impacts the model tendencies to generate socially biased output. Additionally, through manual evaluation, we discover problematic patterns in the generated chain-of-thought output that range from subtle bias to lack of reasoning. Warning: This paper contains examples of text which are toxic, biased, and potentially harmful.

neighbor, prompt style, stigma, (15 more...)

arXiv.org Artificial Intelligence

2312.07492

Country:

North America > Canada > Ontario > Toronto (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Addiction Disorder (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.34)

Add feedback

q2d: Turning Questions into Dialogs to Teach Models How to Search

Bitton, Yonatan, Cohen-Ganor, Shlomi, Hakimi, Ido, Lewenberg, Yoad, Aharoni, Roee, Weinreb, Enav

arXiv.org Artificial IntelligenceDec-26-2023

One of the exciting capabilities of recent language models for dialog is their ability to independently search for relevant information to ground a given dialog response. However, obtaining training data to teach models how to issue search queries is time and resource consuming. In this work, we propose q2d: an automatic data generation pipeline that generates information-seeking dialogs from questions. We prompt a large language model (PaLM) to create conversational versions of question answering datasets, and use it to improve query generation models that communicate with external search APIs to ground dialog responses. Unlike previous approaches which relied on human written dialogs with search queries, our method allows to automatically generate query-based grounded dialogs with better control and scale. Our experiments demonstrate that: (1) For query generation on the QReCC dataset, models trained on our synthetically-generated data achieve 90%--97% of the performance of models trained on the human-generated data; (2) We can successfully generate data for training dialog models in new domains without any existing dialog data as demonstrated on the multi-hop MuSiQue and Bamboogle QA datasets. (3) We perform a thorough analysis of the generated dialogs showing that humans find them of high quality and struggle to distinguish them from human-written dialogs.

large language model, machine learning, question answering, (19 more...)

arXiv.org Artificial Intelligence

2304.14318

Country:

North America > United States > North Dakota (0.68)
Asia > Middle East (0.28)
Europe (0.28)
(3 more...)

Genre: Research Report (0.82)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Energy > Oil & Gas (1.00)
Media > Film (0.93)
Leisure & Entertainment > Sports (0.68)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.49)
(2 more...)

Add feedback

From Text to Multimodal: A Comprehensive Survey of Adversarial Example Generation in Question Answering Systems

Yigit, Gulsum, Amasyali, Mehmet Fatih

arXiv.org Artificial IntelligenceDec-26-2023

Integrating adversarial machine learning with Question Answering (QA) systems has emerged as a critical area for understanding the vulnerabilities and robustness of these systems. This article aims to comprehensively review adversarial example-generation techniques in the QA field, including textual and multimodal contexts. We examine the techniques employed through systematic categorization, providing a comprehensive, structured review. Beginning with an overview of traditional QA models, we traverse the adversarial example generation by exploring rule-based perturbations and advanced generative models. We then extend our research to include multimodal QA systems, analyze them across various methods, and examine generative models, seq2seq architectures, and hybrid methodologies. Our research grows to different defense strategies, adversarial datasets, and evaluation metrics and illustrates the comprehensive literature on adversarial QA. Finally, the paper considers the future landscape of adversarial question generation, highlighting potential research directions that can advance textual and multimodal QA systems in the context of adversarial challenges.

adversarial example, qa system, question generation, (13 more...)

arXiv.org Artificial Intelligence

2312.16156

Country:

Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
North America > United States > Colorado (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Overview (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Detection-based Intermediate Supervision for Visual Question Answering

Liu, Yuhang, Peng, Daowan, Wei, Wei, Fu, Yuanyuan, Xie, Wenfeng, Chen, Dangyang

arXiv.org Artificial IntelligenceDec-26-2023

Recently, neural module networks (NMNs) have yielded ongoing success in answering compositional visual questions, especially those involving multi-hop visual and logical reasoning. NMNs decompose the complex question into several sub-tasks using instance-modules from the reasoning paths of that question and then exploit intermediate supervisions to guide answer prediction, thereby improving inference interpretability. However, their performance may be hindered due to sketchy modeling of intermediate supervisions. For instance, (1) a prior assumption that each instance-module refers to only one grounded object yet overlooks other potentially associated grounded objects, impeding full cross-modal alignment learning; (2) IoU-based intermediate supervisions may introduce noise signals as the bounding box overlap issue might guide the model's focus towards irrelevant objects. To address these issues, a novel method, \textbf{\underline{D}}etection-based \textbf{\underline{I}}ntermediate \textbf{\underline{S}}upervision (DIS), is proposed, which adopts a generative detection framework to facilitate multiple grounding supervisions via sequence generation. As such, DIS offers more comprehensive and accurate intermediate supervisions, thereby boosting answer prediction performance. Furthermore, by considering intermediate results, DIS enhances the consistency in answering compositional questions and their sub-questions.Extensive experiments demonstrate the superiority of our proposed DIS, showcasing both improved accuracy and state-of-the-art reasoning consistency compared to prior approaches.

intermediate result, intermediate supervision, supervision, (15 more...)

arXiv.org Artificial Intelligence

2312.16012

Country: Asia > China (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.84)

Add feedback

GSQA: An End-to-End Model for Generative Spoken Question Answering

Shih, Min-Han, Chung, Ho-Lam, Pai, Yu-Chi, Hsu, Ming-Hao, Lin, Guan-Ting, Li, Shang-Wen, Lee, Hung-yi

arXiv.org Artificial IntelligenceDec-25-2023

In recent advancements in spoken question answering (QA), end-to-end models have made significant strides. However, previous research has primarily focused on extractive span selection. While this extractive-based approach is effective when answers are present directly within the input, it falls short in addressing abstractive questions, where answers are not directly extracted but inferred from the given information. To bridge this gap, we introduce the first end-to-end Generative Spoken Question Answering (GSQA) model that empowers the system to engage in abstractive reasoning. The challenge in training our GSQA model lies in the absence of a spoken abstractive QA dataset. We propose using text models for initialization and leveraging the extractive QA dataset to transfer knowledge from the text generative model to the spoken generative model. Experimental results indicate that our model surpasses the previous extractive model by 3% on extractive QA datasets. Furthermore, the GSQA model has only been fine-tuned on the spoken extractive QA dataset. Despite not having seen any spoken abstractive QA data, it can still closely match the performance of the cascade model. In conclusion, our GSQA model shows the potential to generalize to a broad spectrum of questions, thus further expanding the spoken question answering capabilities of abstractive QA. Our code is available at https://voidful.github.io/GSQA

computational linguistic, dataset, qa dataset, (15 more...)

arXiv.org Artificial Intelligence

2312.09781

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > United States > Washington > King County > Seattle (0.04)
(12 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)

Add feedback

EHRXQA: A Multi-Modal Question Answering Dataset for Electronic Health Records with Chest X-ray Images

Bae, Seongsu, Kyung, Daeun, Ryu, Jaehee, Cho, Eunbyeol, Lee, Gyubok, Kweon, Sunjun, Oh, Jungwoo, Ji, Lei, Chang, Eric I-Chao, Kim, Tackeun, Choi, Edward

arXiv.org Artificial IntelligenceDec-25-2023

Electronic Health Records (EHRs), which contain patients' medical histories in various multi-modal formats, often overlook the potential for joint reasoning across imaging and table modalities underexplored in current EHR Question Answering (QA) systems. In this paper, we introduce EHRXQA, a novel multi-modal question answering dataset combining structured EHRs and chest X-ray images. To develop our dataset, we first construct two uni-modal resources: 1) The MIMIC-CXR-VQA dataset, our newly created medical visual question answering (VQA) benchmark, specifically designed to augment the imaging modality in EHR QA, and 2) EHRSQL (MIMIC-IV), a refashioned version of a previously established table-based EHR QA dataset. By integrating these two uni-modal resources, we successfully construct a multi-modal EHR QA dataset that necessitates both uni-modal and cross-modal reasoning. To address the unique challenges of multi-modal questions within EHRs, we propose a NeuralSQL-based strategy equipped with an external VQA API. This pioneering endeavor enhances engagement with multi-modal EHR sources and we believe that our dataset can catalyze advances in real-world medical scenarios such as clinical decision-making and research. EHRXQA is available at https://github.com/baeseongsu/ehrxqa.

chest x-ray study, dataset, template, (14 more...)

arXiv.org Artificial Intelligence

2310.18652

Country:

Asia > South Korea > Seoul > Seoul (0.04)
Asia > Singapore (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Health Care Technology > Medical Record (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)

Add feedback

PokeMQA: Programmable knowledge editing for Multi-hop Question Answering

Gu, Hengrui, Zhou, Kaixiong, Han, Xiaotian, Liu, Ninghao, Wang, Ruobing, Wang, Xin

arXiv.org Artificial IntelligenceDec-23-2023

Multi-hop question answering (MQA) is one of the challenging tasks to evaluate machine's comprehension and reasoning abilities, where large language models (LLMs) have widely achieved the human-comparable performance. Due to the dynamics of knowledge facts in real world, knowledge editing has been explored to update model with the up-to-date facts while avoiding expensive re-training or fine-tuning. Starting from the edited fact, the updated model needs to provide cascading changes in the chain of MQA. The previous art simply adopts a mix-up prompt to instruct LLMs conducting multiple reasoning tasks sequentially, including question decomposition, answer generation, and conflict checking via comparing with edited facts. However, the coupling of these functionally-diverse reasoning tasks inhibits LLMs' advantages in comprehending and answering questions while disturbing them with the unskilled task of conflict checking. We thus propose a framework, Programmable knowledge editing for Multi-hop Question Answering (PokeMQA), to decouple the jobs. Specifically, we prompt LLMs to decompose knowledge-augmented multi-hop question, while interacting with a detached trainable scope detector to modulate LLMs behavior depending on external conflict signal. The experiments on three LLM backbones and two benchmark datasets validate our superiority in knowledge editing of MQA, outperforming all competitors by a large margin in almost all settings and consistently producing reliable reasoning process.

knowledge editing, multi-hop question, pokemqa, (11 more...)

arXiv.org Artificial Intelligence

2312.15194

Country:

Europe > United Kingdom (0.46)
Oceania > Australia (0.04)
North America > United States > Texas (0.04)
(6 more...)

Genre: Research Report (1.00)

Industry: Government (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

Add feedback

Chain-of-Questions Training with Latent Answers for Robust Multistep Question Answering

Zhu, Wang, Thomason, Jesse, Jia, Robin

arXiv.org Artificial IntelligenceDec-23-2023

We train a language model (LM) to robustly answer multistep questions by generating and answering sub-questions. We propose Chain-of-Questions, a framework that trains a model to generate sub-questions and sub-answers one at a time by leveraging human annotated question decomposition meaning representation (QDMR). The key technical challenge is that QDMR only contains sub-questions but not answers to those sub-questions, so we treat sub-answers as latent variables and optimize them using a novel dynamic mixture of Hard-EM and MAPO. Chain-of-Questions greatly outperforms strong neuro-symbolic methods by 9.0 F1 on DROP contrast set, and outperforms GPT-3.5 by 24.3 F1 on HOTPOTQA adversarial set, thus demonstrating the effectiveness and robustness of our framework.

otpot qa, qdmr, trajectory, (14 more...)

arXiv.org Artificial Intelligence

2305.14901

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
Africa > Togo (0.05)
North America > United States > Montana (0.04)
(12 more...)

Genre: Research Report (0.64)

Industry:

Government (0.47)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback