Text Generation: A Systematic Literature Review of Tasks, Evaluation, and Challenges
Becker, Jonas, Wahle, Jan Philip, Gipp, Bela, Ruas, Terry
arXiv.org Artificial Intelligence
Text generation has become more accessible than ever, and the growing interest in these systems, especially those using large language models, has spurred an increasing number of related publications. We provide a systematic literature review comprising 244 selected papers published between 2017 and 2024. This review categorizes works in text generation into five main tasks: open-ended text generation, summarization, translation, paraphrasing, and question answering. For each task, we review its relevant characteristics, sub-tasks, and specific challenges (e.g., missing datasets for multi-document summarization, coherence in story generation, and complex reasoning for question answering). Additionally, we assess current approaches for evaluating text generation systems and ascertain problems with current metrics. Our investigation shows nine prominent challenges common to all tasks and sub-tasks in recent text generation publications: bias, reasoning, hallucinations, misuse, privacy, interpretability, transparency, datasets, and computing. We provide a detailed analysis of these challenges, their potential solutions, and which gaps still require further engagement from the community. This systematic literature review targets two main audiences: early career researchers in natural language processing looking for an overview of the field and promising research directions, as well as experienced researchers seeking a detailed view of tasks, evaluation methodologies, open challenges, and recent mitigation strategies.
May-24-2024
- Country:
- Oceania > Australia
- North America
- Dominican Republic (0.04)
- United States
- Texas > Travis County
- Austin (0.04)
- Michigan > Washtenaw County
- Ann Arbor (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- Colorado > Denver County
- Denver (0.04)
- Ohio > Franklin County
- Columbus (0.04)
- Massachusetts > Suffolk County
- Boston (0.04)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Pennsylvania > Philadelphia County
- Philadelphia (0.04)
- Washington > King County
- Seattle (0.04)
- Alaska > Anchorage Municipality
- Anchorage (0.04)
- California
- San Diego County > San Diego (0.04)
- Los Angeles County > Long Beach (0.04)
- New York > New York County
- New York City (0.04)
- Canada
- Ontario > Toronto (0.04)
- British Columbia > Metro Vancouver Regional District
- Vancouver (0.14)
- Europe
- Netherlands (0.04)
- United Kingdom (0.04)
- Sweden > Stockholm
- Stockholm (0.04)
- Italy > Tuscany
- Florence (0.04)
- Germany
- Lower Saxony > Gottingen (0.14)
- Berlin (0.04)
- France > Provence-Alpes-Côte d'Azur
- Bouches-du-Rhône > Marseille (0.04)
- Denmark > Capital Region
- Copenhagen (0.04)
- Portugal > Lisbon
- Lisbon (0.04)
- Spain > Catalonia
- Barcelona Province > Barcelona (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- Belgium > Brussels-Capital Region
- Brussels (0.04)
- Asia
- Singapore (0.04)
- Macao (0.04)
- India (0.04)
- Middle East
- Jordan (0.04)
- UAE > Abu Dhabi Emirate
- Abu Dhabi (0.04)
- Japan > Kyūshū & Okinawa
- Kyūshū > Miyazaki Prefecture > Miyazaki (0.04)
- China
- Hong Kong (0.04)
- Jiangsu Province > Yancheng (0.04)
- Genre:
- Overview (1.00)
- Research Report > Promising Solution (0.87)
- Industry:
- Information Technology > Security & Privacy (1.00)
- Education (0.68)
- Media (0.67)
- Technology:
- Information Technology > Artificial Intelligence
- Natural Language
- Text Processing (1.00)
- Large Language Model (1.00)
- Generation (1.00)
- Machine Translation (0.94)
- Chatbot (0.93)
- Question Answering (0.86)
- Machine Learning > Neural Networks
- Deep Learning (1.00)