Generation: Overviews


How to generate text: using different decoding methods for language generation with Transformers

#artificialintelligence

In recent years, there has been growing interest in open-ended language generation thanks to the rise of large transformer-based language models trained on millions of webpages, such as OpenAI's well-known GPT2 model. The results on conditioned open-ended language generation are impressive. Besides the improved transformer architecture and massive amounts of unsupervised training data, better decoding methods have also played an important role. This blog post gives a brief overview of different decoding strategies and, more importantly, shows how you can implement them with very little effort using the popular transformers library! All of the following functionalities can be used for auto-regressive language generation.
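For concreteness, here is a minimal sketch of the decoding families the post covers (greedy search, beam search, and top-k/top-p sampling) using the transformers generate() API; the "gpt2" checkpoint, prompt, and generation lengths are illustrative choices, not prescribed by the post.

```python
# A minimal sketch of common decoding methods with the transformers library.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

input_ids = tokenizer("I enjoy walking with my cute dog", return_tensors="pt").input_ids

# Greedy search: always pick the most probable next token.
greedy = model.generate(input_ids, max_length=50)

# Beam search: keep the num_beams most probable sequences at each step.
beam = model.generate(input_ids, max_length=50, num_beams=5, early_stopping=True)

# Top-k / top-p (nucleus) sampling: sample from a truncated distribution.
sampled = model.generate(
    input_ids,
    max_length=50,
    do_sample=True,
    top_k=50,
    top_p=0.95,
)

print(tokenizer.decode(sampled[0], skip_special_tokens=True))
```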


Towards information-rich, logical text generation with knowledge-enhanced neural models

arXiv.org Artificial Intelligence

Text generation systems have made promising progress, driven by deep learning techniques, and are now widely applied in daily life. However, existing end-to-end neural models tend to generate uninformative and generic text because they cannot ground the input context in background knowledge. To address this problem, many researchers have begun to incorporate external knowledge into text generation systems, an approach known as knowledge-enhanced text generation. Its challenges include how to select the appropriate knowledge from large-scale knowledge bases, how to read and understand the extracted knowledge, and how to integrate that knowledge into the generation process. This survey gives a comprehensive review of knowledge-enhanced text generation systems, summarizes research progress on these challenges, and proposes open issues and research directions.
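As a concrete illustration of the select/read/integrate pipeline the survey describes, here is a minimal sketch; the word-overlap selector and toy knowledge base are assumptions for illustration only, not a method from the survey (real systems use large knowledge bases with learned retrievers and readers).

```python
# Toy select-then-integrate pipeline: pick facts relevant to the input
# context, then condition the generator on them.

def select_knowledge(context: str, knowledge_base: list[str], k: int = 2) -> list[str]:
    """Rank facts by word overlap with the context (a toy selector)."""
    ctx = set(context.lower().split())
    scored = sorted(knowledge_base, key=lambda fact: -len(ctx & set(fact.lower().split())))
    return scored[:k]

def generate_with_knowledge(context: str, knowledge_base: list[str]) -> str:
    """Integrate selected facts into the generation prompt."""
    facts = select_knowledge(context, knowledge_base)
    # A real system would feed this grounded prompt to a neural generator;
    # here we just return it to show the integration step.
    return "Facts: " + " ".join(facts) + "\nContext: " + context + "\nResponse:"

kb = [
    "Mount Fuji is the highest mountain in Japan.",
    "The Transformer architecture relies on self-attention.",
    "GPT-2 was released by OpenAI in 2019.",
]
print(generate_with_knowledge("Tell me about the Transformer model.", kb))
```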


Introduction to NLP – Towards Data Science

#artificialintelligence

Natural language processing (NLP) is the intersection of computer science, linguistics, and machine learning, concerned with the interaction between computers and humans in natural language. It is the driving force behind things like virtual assistants, speech recognition, sentiment analysis, automatic text summarization, machine translation, and much more. In this post, you will learn the basics of natural language processing, dive into some of its techniques, and see how NLP has benefited from recent advances in deep learning. In short, NLP is about enabling computers to understand and generate human language, with applications ranging from voice assistants like Alexa and Siri to machine translation and text filtering.


Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation

Journal of Artificial Intelligence Research

This paper surveys the current state of the art in Natural Language Generation (NLG), defined as the task of generating text or speech from non-linguistic input. A survey of NLG is timely in view of the changes that the field has undergone over the past two decades, especially in relation to new (usually data-driven) methods, as well as new applications of NLG technology. This survey therefore aims to (a) give an up-to-date synthesis of research on the core tasks in NLG and the architectures in which such tasks are organised; (b) highlight a number of recent research topics that have arisen partly as a result of growing synergies between NLG and other areas of artificial intelligence; (c) draw attention to the challenges in NLG evaluation, relating them to similar challenges faced in other areas of NLP, with an emphasis on different evaluation methods and the relationships between them.
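To make the task definition concrete, here is a toy, assumed example of generating text from non-linguistic input: a structured weather record realized through a hand-written template. The systems the survey covers learn content selection and realization rather than relying on a fixed template like this.

```python
# Toy data-to-text realization: structured record in, English sentence out.
def realize_weather(record: dict) -> str:
    return (
        f"On {record['day']}, expect {record['condition']} "
        f"with a high of {record['high_c']} degrees Celsius."
    )

print(realize_weather({"day": "Monday", "condition": "light rain", "high_c": 14}))
```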


Long Text Generation via Adversarial Training with Leaked Information

AAAI Conferences

Automatically generating coherent and semantically meaningful text has many applications in machine translation, dialogue systems, image captioning, and more. Recently, Generative Adversarial Nets (GANs) combined with policy gradient, in which a discriminative model guides the training of the generative model as a reinforcement learning policy, have shown promising results in text generation. However, the scalar guiding signal is only available after the entire text has been generated and lacks intermediate information about text structure during the generative process; this limits success when the generated text samples are long (more than 20 words). In this paper, we propose a new framework, called LeakGAN, to address this problem for long text generation. We allow the discriminative net to leak its own high-level extracted features to the generative net to further aid guidance. The generator incorporates these informative signals into all generation steps through an additional MANAGER module, which takes the extracted features of the currently generated words and outputs a latent vector to guide the WORKER module in next-word generation. Our extensive experiments on synthetic data and various real-world tasks with a Turing test demonstrate that LeakGAN is highly effective for long text generation and also improves performance in short text generation scenarios. More importantly, without any supervision, LeakGAN is able to implicitly learn sentence structures purely through the interaction between MANAGER and WORKER.
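A minimal PyTorch sketch of the MANAGER/WORKER interaction described above; this is an illustrative reconstruction, not the authors' code, and all module choices and dimensions are assumptions.

```python
# Sketch (assumed, not LeakGAN's actual code): the discriminator's leaked
# feature drives a MANAGER goal vector that steers the WORKER's next-token logits.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Manager(nn.Module):
    """Turns the leaked discriminator feature into a goal vector."""
    def __init__(self, feat_dim: int, goal_dim: int):
        super().__init__()
        self.rnn = nn.GRUCell(feat_dim, goal_dim)

    def forward(self, leaked_feature, hidden):
        hidden = self.rnn(leaked_feature, hidden)
        goal = F.normalize(hidden, dim=-1)  # unit-length guidance signal
        return goal, hidden

class Worker(nn.Module):
    """Produces next-token logits, steered by the manager's goal."""
    def __init__(self, vocab_size: int, emb_dim: int, goal_dim: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRUCell(emb_dim, goal_dim)
        self.out = nn.Linear(goal_dim, vocab_size)

    def forward(self, token, goal, hidden):
        hidden = self.rnn(self.embed(token), hidden)
        logits = self.out(hidden * goal)  # combine worker state with goal
        return logits, hidden

# One illustrative generation step (batch of 1, made-up sizes).
vocab, feat_dim, goal_dim, emb_dim = 5000, 64, 64, 32
manager, worker = Manager(feat_dim, goal_dim), Worker(vocab, emb_dim, goal_dim)
leaked = torch.zeros(1, feat_dim)  # stand-in for the leaked feature
m_h, w_h = torch.zeros(1, goal_dim), torch.zeros(1, goal_dim)
goal, m_h = manager(leaked, m_h)
logits, w_h = worker(torch.tensor([0]), goal, w_h)
next_token = logits.argmax(dim=-1)  # greedy pick, for illustration only
```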


Kalos - A System for Natural Language Generation with Revision

AAAI Conferences

Using revision to produce extended natural language text through a series of drafts provides three significant advantages over a traditional natural language generation system. First, it reduces complexity through task decomposition. Second, it promotes text polishing techniques that benefit from the ability to examine generated text in the context of the underlying knowledge from which it was generated. Third, it provides a mechanism for the interaction of conceptual and stylistic decisions. Kalos is a natural language generation system that produces advanced draft quality text for a microprocessor users' guide from a knowledge base describing the microprocessor. It uses revision iteratively to polish its initial generation.
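As a toy sketch of the iterative draft-and-revise loop (purely illustrative; Kalos grounds its revisions in the underlying knowledge base rather than in a surface rule like the one below):

```python
# Toy revision pass: repeatedly "polish" a draft until it stops changing.
def revise(draft: str) -> str:
    # Stand-in polishing rule: collapse immediately repeated words.
    out = []
    for word in draft.split():
        if not out or out[-1] != word:
            out.append(word)
    return " ".join(out)

draft = "The the processor supports supports vector instructions."
while (new := revise(draft)) != draft:  # iterate to a fixed point
    draft = new
print(draft)
```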


An Overview of the Penman Text Generation System

AAAI Conferences

The problem of programming computers to produce natural language explanations and other texts on demand is an active research area in artificial intelligence. In the past, research systems designed for this purpose have been limited by the weakness of their linguistic bases, especially their grammars, and their techniques often cannot be transferred to new knowledge domains. A new text generation system, Penman, is designed to overcome these problems and produce fluent multiparagraph text in English in response to a goal presented to the system. Penman consists of four major modules: a knowledge acquisition module which can perform domain-specific searches for knowledge relevant to a given communication goal; a text planning module which can organize the relevant information, decide what portion to present.


Current Issues in Natural Language Generation: An Overview of the AAAI Workshop on Text Planning and Realization

AI Magazine

Text planning is one of the most rapidly growing subfields of language generation. Until the 1988 AAAI conference, no workshop had concentrated on text planning and its relationship to realization. This report is a summary of that workshop.