Synthetically generated text for supervised text analysis
arXiv.org Artificial Intelligence
This article proposes a partial solution to these three issues, in the form of controlled generation of synthetic text with large language models. I provide a conceptual overview of text generation, guidance on when researchers should prefer different techniques for generating synthetic text, a discussion of ethics, and a simple technique for improving the quality of synthetic text. I demonstrate the usefulness of synthetic text with three applications: generating synthetic tweets describing the fighting in Ukraine, synthetic news articles describing specified political events for training an event detection system, and a multilingual corpus of populist manifesto statements for training a sentence-level populism classifier.
Mar-28-2023
- Country:
- South America
- Brazil > São Paulo (0.04)
- Colombia > Bogotá D.C.
- Bogotá (0.04)
- North America
- Europe
- Italy (0.05)
- Austria > Vienna (0.04)
- Germany > Berlin (0.04)
- Poland (0.04)
- Slovakia (0.04)
- Greece (0.04)
- Latvia (0.04)
- Western Europe (0.04)
- Romania (0.04)
- Ireland (0.04)
- Czechia > Prague (0.04)
- Russia > Central Federal District
- Moscow Oblast > Moscow (0.04)
- Croatia > Zagreb County
- Zagreb (0.04)
- Belarus > Minsk Region
- Minsk (0.04)
- United Kingdom > England
- Cambridgeshire > Cambridge (0.04)
- Norway > Eastern Norway
- Oslo (0.04)
- Serbia > Central Serbia
- Belgrade (0.04)
- Ukraine
- Kyiv Oblast > Kyiv (0.04)
- Kherson Oblast > Kherson (0.04)
- Kharkiv Oblast > Kharkiv (0.04)
- Middle East > Republic of Türkiye
- Istanbul Province > Istanbul (0.04)
- Asia
- Russia (0.28)
- India > Gujarat (0.04)
- Thailand > Bangkok
- Bangkok (0.04)
- Middle East
- UAE > Abu Dhabi Emirate
- Abu Dhabi (0.04)
- Syria > Damascus Governorate
- Damascus (0.04)
- Republic of Türkiye > Istanbul Province
- Istanbul (0.04)
- Lebanon > Beirut Governorate
- Beirut (0.04)
- Israel > Jerusalem District
- Jerusalem (0.04)
- Iran > Tehran Province
- Tehran (0.04)
- Japan > Honshū
- Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- China > Beijing
- Beijing (0.04)
- Bangladesh > Dhaka Division
- Dhaka District > Dhaka (0.04)
- Afghanistan > Kabul Province
- Kabul (0.04)
- Africa
- Sudan
- Khartoum State > Khartoum (0.04)
- Khartoum (0.04)
- Nigeria > Federal Capital Territory
- Abuja (0.04)
- Middle East > Egypt
- Cairo Governorate > Cairo (0.04)
- Kenya > Nairobi City County
- Nairobi (0.04)
- Democratic Republic of the Congo > Kinshasa Province
- Kinshasa (0.04)
- Genre:
- Research Report (1.00)
- Industry:
- Information Technology (1.00)
- Health & Medicine > Therapeutic Area (1.00)
- Media > News (0.94)
- Banking & Finance (0.92)
- Law Enforcement & Public Safety
- Terrorism (1.00)
- Crime Prevention & Enforcement (1.00)
- Government
- Military (1.00)
- Regional Government > Europe Government (0.93)
- Immigration & Customs (0.67)
- Technology: