 Large Language Model


Capturing Row and Column Semantics in Transformer Based Question Answering over Tables

arXiv.org Artificial Intelligence

Transformer-based architectures have recently been used for the task of answering questions over tables. To improve accuracy on this task, specialized pre-training techniques have been developed and applied to millions of open-domain web tables. In this paper, we propose two novel approaches demonstrating that one can achieve superior performance on the table QA task without using any of these specialized pre-training techniques. The first model, called RCI interaction, leverages a transformer-based architecture that independently classifies rows and columns to identify relevant cells. While this model yields extremely high accuracy at finding cell values on recent benchmarks, a second model we propose, called RCI representation, provides a significant efficiency advantage for online QA systems over tables by materializing embeddings for existing tables. Experiments on recent benchmarks show that the proposed methods can effectively locate cell values in tables (up to ~98% Hit@1 accuracy on WikiSQL lookup questions). The interaction model also outperforms the state-of-the-art transformer-based approaches pre-trained on very large table corpora (TAPAS and TaBERT), achieving ~3.4% and ~18.86% additional precision improvement on the standard WikiSQL benchmark.
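
The idea of scoring rows and columns independently and taking their intersection as the answer cell can be illustrated with a minimal sketch. This is not the paper's model: the `token_overlap` scorer is a toy stand-in for the transformer classifiers, serializing columns by header only is a simplification made for the toy scorer, and the names `locate_cell`, `hit_at_1`, `HEADER`, and `ROWS` are hypothetical, chosen for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Scorer = Callable[[str, str], float]


def token_overlap(question: str, text: str) -> float:
    """Toy relevance score: fraction of question tokens that occur in text.

    Assumption: stands in for a trained transformer classifier.
    """
    q_tokens = set(question.lower().split())
    t_tokens = set(text.lower().split())
    return len(q_tokens & t_tokens) / len(q_tokens) if q_tokens else 0.0


def locate_cell(
    question: str,
    header: Sequence[str],
    rows: Sequence[Sequence[str]],
    scorer: Scorer = token_overlap,
) -> Tuple[int, int]:
    """Score rows and columns independently; return (row, column) of the best cell."""
    # Each row is serialized as "header value" pairs.
    row_texts = [" ".join(f"{h} {v}" for h, v in zip(header, row)) for row in rows]
    # Simplification for the toy scorer: a column is represented by its header only.
    col_texts = list(header)

    row_scores = [scorer(question, text) for text in row_texts]
    col_scores = [scorer(question, text) for text in col_texts]

    best_row = max(range(len(row_scores)), key=row_scores.__getitem__)
    best_col = max(range(len(col_scores)), key=col_scores.__getitem__)
    return best_row, best_col


def hit_at_1(
    predicted: List[Tuple[int, int]], gold: List[Tuple[int, int]]
) -> float:
    """Fraction of questions whose top-ranked cell matches the gold cell."""
    if not gold:
        return 0.0
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Illustrative table (made-up example data, not from any benchmark).
HEADER = ["country", "capital", "population"]
ROWS = [
    ["France", "Paris", "67"],
    ["Germany", "Berlin", "83"],
    ["Spain", "Madrid", "47"],
]
```

Calling `locate_cell("what is the capital of germany", HEADER, ROWS)` selects the Germany row and the capital column, so the located cell is `ROWS[1][1]`. Because rows and columns are scored separately, the number of scorer calls grows with rows plus columns rather than with the number of cells.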


Scary A.I. more intelligent than you

#artificialintelligence

GPT-3 (Generative Pre-trained Transformer 3) is an artificial intelligence language generator that uses deep learning to produce human-like output. Its text is of such high quality that it is very difficult to distinguish from a human's. Many scientists, researchers and engineers (including Stephen Hawking and Elon Musk) have warned of A.I.'s potential dangers and called for steps to mitigate risk. And deep-learning critic Gary Marcus has said that GPT-3's "comprehension of the world is often seriously off, which means you can never really trust what it says."


LitRPG Adventures: AI RPG Generators + Content Library

#artificialintelligence

If you want to see a sample of output, grab your FREE BOOK of samples today. You can check out some samples or register for a membership to begin using the LitRPG Adventures Workshop tools right away! The LitRPG Adventures Workshop generators are powered by the GPT-3 API from OpenAI, one of the largest language models in the world. Yes, I got access to a supercomputer and decided to teach it D&D. Payment is done through PayPal or Stripe and is completely safe.


Literature review on vulnerability detection using NLP technology

arXiv.org Artificial Intelligence

Vulnerability detection has always been the most important task in the field of software security. As technology develops and the volume of source code grows, automated analysis and detection of vulnerabilities has become a research hotspot. For special text files such as source code, using some of the most popular NLP technologies to build models that automatically analyze source code and detect vulnerabilities has become one of the most anticipated lines of study in the field of vulnerability detection. This article briefly surveys recent papers and technologies, such as CodeBERT, and summarizes earlier technologies.


Attribute-Modulated Generative Meta Learning for Zero-Shot Classification

arXiv.org Artificial Intelligence

Zero-shot learning (ZSL) aims to transfer knowledge from seen classes to semantically related unseen classes, which are absent during training. The promising strategies for ZSL are to synthesize visual features of unseen classes conditioned on semantic side information and to incorporate meta-learning to eliminate the model's inherent bias towards seen classes. Existing meta generative approaches pursue a common model shared across task distributions; in contrast, we aim to construct a generative network adaptive to task characteristics. To this end, we propose the Attribute-Modulated generAtive meta-model for Zero-shot learning (AMAZ). Our model consists of an attribute-aware modulation network and an attribute-augmented generative network. Given unseen classes, the modulation network adaptively modulates the generator by applying task-specific transformations so that the generative network can adapt to highly diverse tasks. Our empirical evaluations on four widely-used benchmarks show that AMAZ improves state-of-the-art methods by 3.8% and 5.1% in ZSL and generalized ZSL settings, respectively, demonstrating the superiority of our method.


Why AI That Teaches Itself to Achieve a Goal Is the Next Big Thing

#artificialintelligence

Lee Sedol, a world-class Go champion, was flummoxed by the 37th move DeepMind's AlphaGo made in the second game of the famous 2016 series. So flummoxed that it took him nearly 15 minutes to formulate a response. The move was strange to other experienced Go players as well, with one commentator suggesting it was a mistake. In fact, it was a canonical example of an artificial intelligence algorithm learning something that seemed to go beyond just pattern recognition in data: learning something strategic and even creative. Indeed, beyond just feeding the algorithm past examples of Go champions playing games, DeepMind developers trained AlphaGo by having it play many millions of matches against itself.


OpenAI GPT leaking your data

#artificialintelligence

In this series on the GPT language model, we will focus on the paper "Extracting Training Data from Large Language Models." The authors want to show that they can extract verbatim data from a language model such as GPT-2. More interestingly, they explain that they can extract verbatim sequences that appeared only a few times in the training data from the model itself. Naturally, that can be very dangerous if you own a company and you are using customers' data to train a language model. In their own words, "the paper demonstrates that (…), an adversary can perform a training data extraction attack to recover individual training examples by querying the language model." Who would want to risk leaking private information?


A Short Survey of Pre-trained Language Models for Conversational AI: A New Age in NLP

arXiv.org Artificial Intelligence

Building a dialogue system that can communicate naturally with humans is a challenging yet interesting problem of agent-based computing. Rapid growth in this area is usually hindered by the long-standing problem of data scarcity, as these systems are expected to learn syntax, grammar, decision making, and reasoning from insufficient amounts of task-specific data. The recently introduced pre-trained language models have the potential to address the issue of data scarcity and bring considerable advantages by generating contextualized word embeddings. These models are considered the counterpart of ImageNet in NLP and have been shown to capture different facets of language such as hierarchical relations, long-term dependency, and sentiment. In this short survey paper, we discuss the recent progress made in the field of pre-trained language models. We also discuss how the strengths of these language models can be leveraged in designing more engaging and more eloquent conversational agents. This paper, therefore, intends to establish whether these pre-trained models can overcome the challenges pertinent to dialogue systems, and how their architecture could be exploited to overcome these challenges. Open challenges in the field of dialogue systems are also discussed.


Revisiting Document Representations for Large-Scale Zero-Shot Learning

arXiv.org Artificial Intelligence

Zero-shot learning aims to recognize unseen objects using their semantic representations. Most existing works use visual attributes labeled by humans, which are not suitable for large-scale applications. In this paper, we revisit the use of documents as semantic representations. We argue that documents like Wikipedia pages contain rich visual information, which, however, can easily be buried by the vast amount of non-visual sentences. To address this issue, we propose a semi-automatic mechanism for visual sentence extraction that leverages the document section headers and the clustering structure of visual sentences. The extracted visual sentences, after a novel weighting scheme to distinguish similar classes, essentially form semantic representations like visual attributes but need much less human effort. On the ImageNet dataset with over 10,000 unseen classes, our representations lead to a 64% relative improvement against the commonly used ones.


Researchers Rank These Artificial Intelligence Labs As The Best In World

#artificialintelligence

Artificial intelligence is one of the most revolutionary technologies of our time, and it advances with each passing day. AI labs contribute to these advancements by housing scientists and researchers under one roof to study this disruptive technology and develop it further. While there are quite a few AI labs across the globe, artificial intelligence researchers are perplexed when people ask them to rate the top labs in the world. And rightfully so, because they're all unique in the way they work. Every lab focuses on different domains of artificial intelligence, and the U.S. Big Tech companies (Google, Facebook, Amazon, Apple, and Microsoft) have set up dedicated commercial AI labs too.