AITopics

2305.03088

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Italy > Tuscany > Florence (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(10 more...)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.87)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.66)

ANetQA: A Large-scale Benchmark for Fine-grained Compositional Reasoning over Untrimmed Videos

Yu, Zhou, Zheng, Lixiang, Zhao, Zhou, Wu, Fei, Fan, Jianping, Ren, Kui, Yu, Jun

Building benchmarks to systemically analyze different capabilities of video question answering (VideoQA) models is challenging yet crucial. Existing benchmarks often use non-compositional simple questions and suffer from language biases, making it difficult to diagnose model weaknesses incisively. A recent benchmark AGQA poses a promising paradigm to generate QA pairs automatically from pre-annotated scene graphs, enabling it to measure diverse reasoning abilities with granular control. However, its questions have limitations in reasoning about the fine-grained semantics in videos as such information is absent in its scene graphs. To this end, we present ANetQA, a large-scale benchmark that supports fine-grained compositional reasoning over the challenging untrimmed videos from ActivityNet. Similar to AGQA, the QA pairs in ANetQA are automatically generated from annotated video scene graphs. The fine-grained properties of ANetQA are reflected in the following: (i) untrimmed videos with fine-grained semantics; (ii) spatio-temporal scene graphs with fine-grained taxonomies; and (iii) diverse questions generated from fine-grained templates. ANetQA attains 1.4 billion unbalanced and 13.4 million balanced QA pairs, which is an order of magnitude larger than AGQA with a similar number of videos. Comprehensive experiments are performed for state-of-the-art methods. The best model achieves 44.5% accuracy while human performance tops out at 84.5%, leaving sufficient room for improvement.

benchmark, natural language, question answering, (19 more...)

2305.02519

Country: Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Leisure & Entertainment > Sports (0.93)

Technology: Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)

Ivanova, Sardana, Andreassen, Fredrik Aas, Jentoft, Matias, Wold, Sondre, Øvrelid, Lilja

NorQuAD: Norwegian Question Answering Dataset

In the following we detail the dataset creation (section 3), where we describe the passage selection In this paper, we present NorQuAD: and question-answer generation, present the first Norwegian question answering relevant statistics for the dataset and provide an dataset for machine reading comprehension.

artificial intelligence, natural language, question answering, (17 more...)

2305.01957

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Norway > Eastern Norway > Oslo (0.05)
Europe > Sweden > Östergötland County > Linköping (0.04)
(7 more...)

Genre: Research Report (1.00)

Industry: Education (0.67)

Technology: Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)

Product Question Answering in E-Commerce: A Survey

Deng, Yang, Zhang, Wenxuan, Yu, Qian, Lam, Wai

Product question answering (PQA), aiming to automatically provide instant responses to customer's questions in E-Commerce platforms, has drawn increasing attention in recent years. Compared with typical QA problems, PQA exhibits unique challenges such as the subjectivity and reliability of user-generated contents in E-commerce platforms. Therefore, various problem settings and novel methods have been proposed to capture these special characteristics. In this paper, we aim to systematically review existing research efforts on PQA. Specifically, we categorize PQA studies into four problem settings in terms of the form of provided answers. We analyze the pros and cons, as well as present existing datasets and evaluation protocols for each setting. We further summarize the most significant challenges that characterize PQA from general QA applications and discuss their corresponding solutions. Finally, we conclude this paper by providing the prospect on several future directions.

information, machine learning, question answering, (19 more...)

2302.08092

Country:

Asia > China > Hong Kong (0.04)
Asia > Singapore (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report (1.00)
Overview (0.93)

Industry: Information Technology > Services > e-Commerce Services (1.00)

Technology:

Information Technology > e-Commerce (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

AttenWalker: Unsupervised Long-Document Question Answering via Attention-based Graph Walking

Nie, Yuxiang, Huang, Heyan, Wei, Wei, Mao, Xian-Ling

Annotating long-document question answering (long-document QA) pairs is time-consuming and expensive. To alleviate the problem, it might be possible to generate long-document QA pairs via unsupervised question answering (UQA) methods. However, existing UQA tasks are based on short documents, and can hardly incorporate long-range information. To tackle the problem, we propose a new task, named unsupervised long-document question answering (ULQA), aiming to generate high-quality long-document QA instances in an unsupervised manner. Besides, we propose AttenWalker, a novel unsupervised method to aggregate and generate answers with long-range dependency so as to construct long-document QA pairs. Specifically, AttenWalker is composed of three modules, i.e., span collector, span linker and answer aggregator. Firstly, the span collector takes advantage of constituent parsing and reconstruction loss to select informative candidate spans for constructing answers. Secondly, by going through the attention graph of a pre-trained long-document model, potentially interrelated text spans (that might be far apart) could be linked together via an attention-walking algorithm. Thirdly, in the answer aggregator, linked spans are aggregated into the final answer via the mask-filling ability of a pre-trained model. Extensive experiments show that AttenWalker outperforms previous methods on Qasper and NarrativeQA. In addition, AttenWalker also shows strong performance in the few-shot learning setting.

machine learning, natural language, question answering, (16 more...)

2305.02235

Country:

Asia > China > Beijing > Beijing (0.04)
Europe > Italy > Tuscany > Florence (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(3 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)

arXiv.org Artificial IntelligenceMay-2-2023

Huatuo-26M, a Large-scale Chinese Medical QA Dataset

Li, Jianquan, Wang, Xidong, Wu, Xiangbo, Zhang, Zhiyi, Xu, Xiaolong, Fu, Jie, Tiwari, Prayag, Wan, Xiang, Wang, Benyou

In this paper, we release a largest ever medical Question Answering (QA) dataset with 26 million QA pairs. We benchmark many existing approaches in our dataset in terms of both retrieval and generation. Experimental results show that the existing models perform far lower than expected and the released dataset is still challenging in the pre-trained language model era. Moreover, we also experimentally show the benefit of the proposed dataset in many aspects: (i) trained models for other QA datasets in a zero-shot fashion; and (ii) as external knowledge for retrieval-augmented generation (RAG); and (iii) improving existing pre-trained language models by using the QA pairs as a pre-training corpus in continued training manner. We believe that this dataset will not only contribute to medical research but also facilitate both the patients and clinical doctors. See \url{https://github.com/FreedomIntelligence/Huatuo-26M}.

large language model, machine learning, question answering, (21 more...)

2305.01526

Country:

North America > United States (0.28)
Asia > China > Guangdong Province > Shenzhen (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(2 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Health & Medicine > Health Care Providers & Services (0.93)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.68)
Health & Medicine > Therapeutic Area > Immunology (0.46)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.92)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

arXiv.org Artificial IntelligenceMay-1-2023

Conditional Generation with a Question-Answering Blueprint

Narayan, Shashi, Maynez, Joshua, Amplayo, Reinald Kim, Ganchev, Kuzman, Louis, Annie, Huot, Fantine, Sandholm, Anders, Das, Dipanjan, Lapata, Mirella

The ability to convey relevant and faithful information is critical for many tasks in conditional generation and yet remains elusive for neural seq-to-seq models whose outputs often reveal hallucinations and fail to correctly cover important details. In this work, we advocate planning as a useful intermediate representation for rendering conditional generation less opaque and more grounded. Our work proposes a new conceptualization of text plans as a sequence of question-answer (QA) pairs. We enhance existing datasets (e.g., for summarization) with a QA blueprint operating as a proxy for both content selection (i.e.,~what to say) and planning (i.e.,~in what order). We obtain blueprints automatically by exploiting state-of-the-art question generation technology and convert input-output pairs into input-blueprint-output tuples. We develop Transformer-based models, each varying in how they incorporate the blueprint in the generated output (e.g., as a global plan or iteratively). Evaluation across metrics and datasets demonstrates that blueprint models are more factual than alternatives which do not resort to planning and allow tighter control of the generation output.

computational linguistic, machine learning, question answering, (21 more...)

2207.00397

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Africa > Ethiopia (0.05)
Europe > Italy > Tuscany > Florence (0.04)
(29 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.46)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Chen, Yu, Wu, Lingfei, Zaki, Mohammed J.

Toward Subgraph-Guided Knowledge Graph Question Generation with Graph Neural Networks

arXiv.org Artificial IntelligenceApr-30-2023

Knowledge graph (KG) question generation (QG) aims to generate natural language questions from KGs and target answers. Previous works mostly focus on a simple setting which is to generate questions from a single KG triple. In this work, we focus on a more realistic setting where we aim to generate questions from a KG subgraph and target answers. In addition, most of previous works built on either RNN-based or Transformer based models to encode a linearized KG sugraph, which totally discards the explicit structure information of a KG subgraph. To address this issue, we propose to apply a bidirectional Graph2Seq model to encode the KG subgraph. Furthermore, we enhance our RNN decoder with node-level copying mechanism to allow directly copying node attributes from the KG subgraph to the output question. Both automatic and human evaluation results demonstrate that our model achieves new state-of-the-art scores, outperforming existing methods by a significant margin on two QG benchmarks. Experimental results also show that our QG model can consistently benefit the Question Answering (QA) task as a mean of data augmentation.

machine learning, natural language, question answering, (19 more...)

doi: 10.1109/TNNLS.2023.3264519

2004.06015

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Africa > Middle East > Egypt > Giza Governorate > Giza (0.05)
Africa > Middle East > Egypt > Cairo Governorate > Cairo (0.05)
(4 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

He, Xuehai, Wang, Xin Eric

Multimodal Graph Transformer for Multimodal Question Answering

arXiv.org Artificial IntelligenceApr-30-2023

Despite the success of Transformer models in vision and language tasks, they often learn knowledge from enormous data implicitly and cannot utilize structured input data directly. On the other hand, structured learning approaches such as graph neural networks (GNNs) that integrate prior information can barely compete with Transformer models. In this work, we aim to benefit from both worlds and propose a novel Multimodal Graph Transformer for question answering tasks that requires performing reasoning across multiple modalities. We introduce a graph-involved plug-and-play quasi-attention mechanism to incorporate multimodal graph information, acquired from text and visual data, to the vanilla self-attention as effective prior. In particular, we construct the text graph, dense region graph, and semantic graph to generate adjacency matrices, and then compose them with input vision and language features to perform downstream reasoning. Such a way of regularizing self-attention with graph information significantly improves the inferring ability and helps align features from different modalities. We validate the effectiveness of Multimodal Graph Transformer over its Transformer baselines on GQA, VQAv2, and MultiModalQA datasets.

machine learning, natural language, question answering, (19 more...)

2305.00581

Country: North America > United States > New York (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.75)

Urban, Matthias, Binnig, Carsten

Towards Multi-Modal DBMSs for Seamless Querying of Texts and Tables

arXiv.org Artificial IntelligenceApr-28-2023

In this paper, we propose Multi-Modal Databases (MMDBs), which is a new class of database systems that can seamlessly query text and tables using SQL. To enable seamless querying of textual data using SQL in an MMDB, we propose to extend relational databases with so-called multi-modal operators (MMOps) which are based on the advances of recent large language models such as GPT-3. The main idea of MMOps is that they allow text collections to be treated as tables without the need to manually transform the data. As we show in our evaluation, our MMDB prototype can not only outperform state-of-the-art approaches such as text-to-table in terms of accuracy and performance but it also requires significantly less training data to fine-tune the model for an unseen text collection.

large language model, machine learning, question answering, (18 more...)

2304.13559

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)
(16 more...)

Genre:

Research Report (0.70)
Overview > Innovation (0.34)

Industry:

Health & Medicine (1.00)
Leisure & Entertainment > Sports (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.86)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.68)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.68)