Synthetic Text Generation using Hypergraph Representations
arXiv.org Artificial Intelligence
Synthetic text plays a vital role in data augmentation, model robustness, privacy preservation and scenario analysis. Synthetic text generation is usually formulated as conditional text generation, where a given source document is transformed using substitutions, paraphrasing, back translation, mixups, etc. [1] to obtain a modified document. We argue that conditioning on unstructured text limits the ability to mix text fragments coherently and produces transformations that are not confined to essential information, a critical requirement for long-form text. Furthermore, explaining the generated text, and in particular detecting hallucinations [2], becomes challenging. We propose a decompose-and-expand technique to generate synthetic text: the semantic frames [3] of a source document are first extracted, and this compact intermediate form is then used to generate the transformed text.
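The decompose-and-expand idea can be illustrated with a toy sketch. The abstract does not specify how frames are represented or how text is generated from them, so everything below is an assumption made for illustration: the `Frame` class, the `substitute` and `expand` helpers, the role names `agent` and `theme`, and the example entities are all hypothetical. Each frame is treated as a hyperedge that links one predicate to several role fillers, and a fixed template stands in for a real text generator.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    """Hypothetical semantic frame, viewed as a hyperedge:
    one predicate linking several (role, filler) pairs."""
    predicate: str
    roles: tuple  # tuple of (role_name, filler) pairs


def substitute(frames, old, new):
    """Swap one filler (a node shared across hyperedges) in every frame."""
    return [
        replace(f, roles=tuple((r, new if v == old else v) for r, v in f.roles))
        for f in frames
    ]


def expand(frame):
    """Toy 'expand' step: a fixed template standing in for a generator model."""
    args = dict(frame.roles)
    return f"{args['agent']} {frame.predicate} {args['theme']}."


# Hand-written frames standing in for the 'decompose' step;
# a real system would obtain them from a frame extractor.
frames = [
    Frame("acquired", (("agent", "Acme Bank"), ("theme", "a fintech startup"))),
    Frame("reported", (("agent", "Acme Bank"), ("theme", "higher quarterly profits"))),
]

# Transform the compact form, then expand it back into text.
synthetic = [expand(f) for f in substitute(frames, "Acme Bank", "Borealis Bank")]
print(" ".join(synthetic))
```

Because the edit is applied to the frames rather than to the raw text, the substituted entity changes consistently in every sentence that mentions it, and each output sentence can be traced back to the frame that produced it.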
Dec-2-2023
- Country:
  - Oceania > Australia
  - North America > United States
    - New York (0.04)
  - Europe
    - United Kingdom > England
      - Greater London > London (0.04)
    - Finland > Uusimaa
      - Helsinki (0.04)
- Genre:
  - Research Report (0.64)
- Industry:
  - Banking & Finance (1.00)
  - Law > Litigation (0.46)
- Technology: