Reassessing Graph Linearization for Sequence-to-sequence AMR Parsing: On the Advantages and Limitations of Triple-Based Encoding
Jeongwoo Kang, Maximin Coavoux, Cédric Lopez, Didier Schwab
arXiv.org Artificial Intelligence
Sequence-to-sequence models are widely used to train Abstract Meaning Representation (AMR; Banarescu et al., 2013) parsers. To train such models, AMR graphs have to be linearized into a one-line text format. While Penman encoding is typically used for this purpose, we argue that it has two limitations: (1) for deep graphs, some closely related nodes end up far apart in the linearized text; and (2) Penman's tree-based encoding requires inverse roles to handle node re-entrancy, which doubles the number of relation types to predict. To address these issues, we propose a triple-based linearization method and compare its efficiency with that of Penman linearization. Although triples are well suited to representing a graph, our results suggest that triple encoding still needs improvement to compete with Penman's concise and explicit representation of nested graph structure.
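To make the contrast between the two linearizations concrete, the following minimal sketch encodes one toy graph both ways. It is an illustration under stated assumptions, not the authors' implementation: the example sentence, the helper names `to_penman` and `to_triples`, and the exact surface format of the triples are all hypothetical, and the paper's actual triple encoding may differ.

```python
# Toy AMR for "The boy wants to go" (illustrative; not taken from the paper).
INSTANCES = {"w": "want-01", "b": "boy", "g": "go-02"}
EDGES = [("w", ":ARG0", "b"), ("w", ":ARG1", "g"), ("g", ":ARG0", "b")]


def to_penman(root, instances, edges):
    """Depth-first, parenthesized encoding.

    A node reached a second time (re-entrancy) is written as its bare variable.
    """
    seen = set()

    def visit(var):
        if var in seen:
            return var  # refer back to an already-expanded node
        seen.add(var)
        parts = [f"{var} / {instances[var]}"]
        for src, role, tgt in edges:
            if src == var:
                parts.append(f"{role} {visit(tgt)}")
        return "(" + " ".join(parts) + ")"

    return visit(root)


def to_triples(instances, edges):
    """Flat encoding: one (source role target) triple per concept and per edge.

    The bracketed surface format here is an assumption made for illustration.
    """
    triples = [(var, ":instance", concept) for var, concept in instances.items()]
    triples += edges
    return " ".join(f"({s} {r} {t})" for s, r, t in triples)


penman_line = to_penman("w", INSTANCES, EDGES)
triple_line = to_triples(INSTANCES, EDGES)
print(penman_line)
print(triple_line)
```

In the Penman string the nesting depth grows with the graph, whereas the triple string stays flat but repeats variable names and is longer overall, which is in line with the trade-off the abstract describes.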
May-14-2025
- Genre:
- Research Report > New Finding (0.69)