Question Answering
Parameter-Efficient Abstractive Question Answering over Tables or Text
Pal, Vaishali, Kanoulas, Evangelos, de Rijke, Maarten
A long-term ambition of information-seeking QA systems is to reason over multi-modal contexts and generate natural answers to user queries. Today, memory-intensive pre-trained language models are adapted to downstream tasks such as QA by fine-tuning the model on QA data in a specific modality like unstructured text or structured tables. To avoid training such memory-hungry models while utilizing a uniform architecture for each modality, parameter-efficient adapters add and train small task-specific bottleneck layers between transformer layers. In this work, we study parameter-efficient abstractive QA in encoder-decoder models over structured tabular data and unstructured textual data using only 1.5% additional parameters for each modality. We also ablate over adapter layers in both encoder and decoder modules to study the efficiency-performance trade-off and demonstrate that reducing additional trainable parameters down to 0.7%-1.0% leads to comparable results. Our models outperform current state-of-the-art models on tabular QA datasets such as Tablesum and FeTaQA, and achieve comparable performance on a textual QA dataset such as NarrativeQA using significantly fewer trainable parameters than fine-tuning.
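The bottleneck adapter idea mentioned in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the ReLU non-linearity, the residual connection, the function names, and all dimensions used below are assumptions chosen for illustration.

```python
def adapter_param_count(d_model: int, bottleneck: int) -> int:
    """Trainable parameters of one adapter: two projections plus their biases."""
    down = d_model * bottleneck + bottleneck  # down-projection weights + bias
    up = bottleneck * d_model + d_model       # up-projection weights + bias
    return down + up


def added_fraction(base_params: int, d_model: int, bottleneck: int, n_adapters: int) -> float:
    """Share of extra trainable parameters relative to the frozen base model."""
    return n_adapters * adapter_param_count(d_model, bottleneck) / base_params


def adapter_forward(h, w_down, b_down, w_up, b_up):
    """Apply one adapter to a hidden vector h (assumed ReLU and residual connection).

    w_down has shape (bottleneck, d_model); w_up has shape (d_model, bottleneck).
    """
    # Project down to the bottleneck and apply the non-linearity.
    z = [max(0.0, sum(w * x for w, x in zip(row, h)) + b)
         for row, b in zip(w_down, b_down)]
    # Project back up to the model dimension.
    out = [sum(w * x for w, x in zip(row, z)) + b
           for row, b in zip(w_up, b_up)]
    # Residual connection: the adapter only adds a correction to h.
    return [x + o for x, o in zip(h, out)]
```

With hypothetical values such as `added_fraction(400_000_000, 1024, 64, 24)`, the sketch shows how the added share shrinks when adapters are removed from some layers, which is the kind of trade-off the ablation in the abstract studies.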
Introducing Voice Search Experience at Booking.com
Communication is a natural part of our everyday lives. People interact using voice and text, forming sentences to express what they desire. And yet, most of the search and discovery patterns out there rely on menu items and filter facets. Building on our mission at Booking.com: "Making it easier for everyone to experience the world", the ML & AI Product teams based in Tel Aviv decided to challenge the conventional search patterns by allowing the most natural way for everyone to communicate: using their voice. This is the story of how we built a native in-app voice assistant at Booking.com, and as far as I know, the first voice search available today by a global online travel company.
Using Interactive Feedback to Improve the Accuracy and Explainability of Question Answering Systems Post-Deployment
Li, Zichao, Sharma, Prakhar, Lu, Xing Han, Cheung, Jackie C. K., Reddy, Siva
Most research on question answering focuses on the pre-deployment stage; i.e., building an accurate model for deployment. In this paper, we ask the question: Can we improve QA systems further post-deployment based on user interactions? We focus on two kinds of improvements: 1) improving the QA system's performance itself, and 2) providing the model with the ability to explain the correctness or incorrectness of an answer. We collect a retrieval-based QA dataset, FeedbackQA, which contains interactive feedback from users. We collect this dataset by deploying a base QA system to crowdworkers who then engage with the system and provide feedback on the quality of its answers. The feedback contains both structured ratings and unstructured natural language explanations. We train a neural model with this feedback data that can generate explanations and re-score answer candidates. We show that feedback data not only improves the accuracy of the deployed QA system but also other stronger non-deployed systems. The generated explanations also help users make informed decisions about the correctness of answers. Project page: https://mcgill-nlp.github.io/feedbackqa/
VLSP 2021 - ViMRC Challenge: Vietnamese Machine Reading Comprehension
Van Nguyen, Kiet, Tran, Son Quoc, Nguyen, Luan Thanh, Van Huynh, Tin, Luu, Son T., Nguyen, Ngan Luu-Thuy
One of the emerging research trends in natural language understanding is machine reading comprehension (MRC), the task of finding answers to human questions based on textual data. Existing Vietnamese datasets for MRC research concentrate solely on answerable questions. In reality, however, a question can be unanswerable when the correct answer is not stated in the given textual data. To address this weakness, we provide the research community with a benchmark dataset named UIT-ViQuAD 2.0 for evaluating the MRC task and question answering systems for the Vietnamese language. We use UIT-ViQuAD 2.0 as a benchmark dataset for the challenge on Vietnamese MRC at the Eighth Workshop on Vietnamese Language and Speech Processing (VLSP 2021). This task attracted 77 participant teams from 34 universities and other organizations. In this article, we present details of the organization of the challenge, an overview of the methods employed by shared-task participants, and the results. The highest performances are 77.24% in F1-score and 67.43% in Exact Match on the private test set. The Vietnamese MRC systems proposed by the top 3 teams use XLM-RoBERTa, a powerful pre-trained language model based on the transformer architecture. The UIT-ViQuAD 2.0 dataset motivates researchers to further explore the Vietnamese machine reading comprehension task and related tasks such as question answering, question generation, and natural language inference.
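The two reported metrics can be sketched as follows. This is a simplified illustration that assumes SQuAD-style scoring with lowercasing and whitespace tokenization, and that an unanswerable question has the empty string as its gold answer; the challenge's official evaluation script may normalize text differently.

```python
from collections import Counter


def exact_match(pred: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(pred.strip().lower() == gold.strip().lower())


def token_f1(pred: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    p, g = pred.lower().split(), gold.lower().split()
    # Unanswerable case: credit only if both prediction and gold are empty.
    if not p or not g:
        return float(p == g)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(p) & Counter(g)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(p)
    recall = overlap / len(g)
    return 2 * precision * recall / (precision + recall)
```

For example, the prediction "Ha Noi city" against the gold answer "Ha Noi" fails Exact Match but still earns partial F1 credit, which is why F1 scores are typically higher than Exact Match scores.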
Towards Differential Relational Privacy and its use in Question Answering
Bombari, Simone, Achille, Alessandro, Wang, Zijian, Wang, Yu-Xiang, Xie, Yusheng, Singh, Kunwar Yashraj, Appalaraju, Srikar, Mahadevan, Vijay, Soatto, Stefano
Memorization of the relation between entities in a dataset can lead to privacy issues when using a trained model for question answering. We introduce Relational Memorization (RM) to understand, quantify and control this phenomenon. While bounding general memorization can have detrimental effects on the performance of a trained model, bounding RM does not prevent effective learning. The difference is most pronounced when the data distribution is long-tailed, with many queries having only a few training examples: impeding general memorization prevents effective learning, while impeding only relational memorization still allows learning general properties of the underlying concepts. We formalize the notion of Relational Privacy (RP) and, inspired by Differential Privacy (DP), we provide a possible definition of Differential Relational Privacy (DrP). These notions can be used to describe and compute bounds on the amount of RM in a trained model. We illustrate Relational Privacy concepts in experiments with large-scale models for Question Answering.
MIT-IBM Watson AI Lab Tackles Power Grid Failures with AI
Next time your power stays on during a severe weather event, you may have a machine learning model to thank. Researchers at the MIT-IBM Watson AI Lab are using artificial intelligence to solve power grid failures. The manager of the MIT-IBM Watson AI Lab, Jie Chen, and his colleagues have developed a machine learning model that works to analyze data collected from hundreds of thousands of sensors located across the U.S. power grid. The sensors, components of what is known as synchrophasor technology, compile vast amounts of real-time data related to electric current and voltage in order to monitor the health of the grid and locate anomalies that could cause outages. Synchrophasor analysis requires intensive computational resources due to the size and real-time nature of the data streams the sensors produce.
Query Answering with Transitive and Linear-Ordered Data
Amarilli, Antoine, Benedikt, Michael, Bourhis, Pierre, Boom, Michael Vanden
We consider entailment problems involving powerful constraint languages such as frontier-guarded existential rules in which we impose additional semantic restrictions on a set of distinguished relations. We consider restricting a relation to be transitive, restricting a relation to be the transitive closure of another relation, and restricting a relation to be a linear order. We give some natural variants of guardedness that allow inference to be decidable in each case, and isolate the complexity of the corresponding decision problems. Finally we show that slight changes in these conditions lead to undecidability.
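To make the semantic restrictions concrete, the sketch below checks transitivity and computes a transitive closure on a small finite relation. It only illustrates what the restrictions mean on a given set of facts; it is not the paper's decision procedure, and the function names are hypothetical.

```python
def is_transitive(pairs) -> bool:
    """True if (a, b) and (b, c) in the relation always imply (a, c)."""
    rel = set(pairs)
    return all((a, d) in rel for (a, b) in rel for (c, d) in rel if b == c)


def transitive_closure(pairs):
    """Smallest transitive relation containing the given pairs (naive fixpoint)."""
    closure = set(pairs)
    while True:
        # Compose the relation with itself to find pairs implied by transitivity.
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new <= closure:
            return closure
        closure |= new
```

For instance, the closure of `{(1, 2), (2, 3)}` adds the pair `(1, 3)`, after which `is_transitive` holds.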
Querying Inconsistent Prioritized Data with ORBITS: Algorithms, Implementation, and Experiments
Bienvenu, Meghyn, Bourgaux, Camille
We investigate practical algorithms for inconsistency-tolerant query answering over prioritized knowledge bases, which consist of a logical theory, a set of facts, and a priority relation between conflicting facts. We consider three well-known semantics (AR, IAR and brave) based upon two notions of optimal repairs (Pareto and completion). Deciding whether a query answer holds under these semantics is (co)NP-complete in data complexity for a large class of logical theories, and SAT-based procedures have been devised for repair-based semantics when there is no priority relation, or the relation has a special structure. The present paper introduces the first SAT encodings for Pareto- and completion-optimal repairs w.r.t. general priority relations and proposes several ways of employing existing and new encodings to compute answers under (optimal) repair-based semantics, by exploiting different reasoning modes of SAT solvers. The comprehensive experimental evaluation of our implementation compares both (i) the impact of adopting semantics based on different kinds of repairs, and (ii) the relative performances of alternative procedures for the same semantics.
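The three semantics named in the abstract can be illustrated by brute force on a toy example. The sketch below makes strong simplifying assumptions: there is no logical theory and no priority relation, conflicts are given directly as pairs of facts, repairs are the maximal conflict-free subsets, and a query is a monotone Python predicate over a set of facts. It does not reproduce the SAT encodings or the Pareto- and completion-optimal repairs studied in the paper.

```python
from itertools import combinations


def repairs(facts, conflicts):
    """All maximal conflict-free subsets of facts (exponential brute force)."""
    facts = list(facts)

    def consistent(subset):
        return not any(a in subset and b in subset for a, b in conflicts)

    candidates = [frozenset(c)
                  for r in range(len(facts) + 1)
                  for c in combinations(facts, r)
                  if consistent(frozenset(c))]
    # Keep only subsets that are not strictly contained in another candidate.
    return [s for s in candidates if not any(s < t for t in candidates)]


def holds_brave(query, reps) -> bool:
    """Brave: the query holds in at least one repair."""
    return any(query(r) for r in reps)


def holds_ar(query, reps) -> bool:
    """AR: the query holds in every repair."""
    return all(query(r) for r in reps)


def holds_iar(query, reps) -> bool:
    """IAR: the query holds in the intersection of all repairs."""
    return query(frozenset.intersection(*reps))
```

With facts `{"a", "b", "c"}` and a single conflict between `"a"` and `"b"`, the repairs are `{"a", "c"}` and `{"b", "c"}`. The query "a or b" then holds under AR and brave semantics but not under IAR, because the intersection of the repairs contains only `"c"`.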