Improving Factual Accuracy of Neural Table-to-Text Output by Addressing Input Problems in ToTTo

Sundararajan, Barkavi; Sripada, Somayajulu; Reiter, Ehud

Apr-5-2024–arXiv.org Artificial Intelligence

Neural Table-to-Text models tend to hallucinate, producing texts that contain factual errors. We investigate whether such errors in the output can be traced back to problems with the input. We manually annotated 1,837 texts generated by multiple models in the politics domain of the ToTTo dataset. We identify the input problems that are responsible for many output errors and show that fixing these inputs reduces factual errors by between 52% and 76% (depending on the model). In addition, we observe that models struggle to process tabular inputs that are structured in a non-standard way, particularly when the input lacks distinct row and column values or when the column headers are not correctly mapped to their corresponding values.

