Mimicking How Humans Interpret Out-of-Context Sentences Through Controlled Toxicity Decoding

Mar-11-2025–arXiv.org Artificial Intelligence

Interpretations of a single sentence can vary, particularly when its context is lost. This paper aims to simulate how readers perceive content with varying toxicity levels by generating diverse interpretations of out-of-context sentences. By modeling toxicity, we can anticipate misunderstandings and reveal hidden toxic meanings. Our proposed decoding strategy explicitly controls toxicity in the set of generated interpretations by (i) aligning interpretation toxicity with the input, (ii) relaxing toxicity constraints for more toxic input sentences, and (iii) promoting diversity in toxicity levels within the set of generated interpretations. Experimental results show that our method improves alignment with human-written interpretations in both syntax and semantics while reducing model prediction uncertainty.

input sentence, interpretation, toxicity, (14 more...)

arXiv.org Artificial Intelligence

Mar-11-2025

arXiv.org PDF

Add feedback

Country:
- North America
  - Dominican Republic (0.04)
  - United States > Michigan
    - Washtenaw County > Ann Arbor (0.04)
  - Canada > Ontario
    - Toronto (0.04)
- Europe
  - Italy > Calabria
    - Catanzaro Province > Catanzaro (0.04)
  - Ireland > Leinster
    - County Dublin > Dublin (0.04)
  - Croatia > Dubrovnik-Neretva County
    - Dubrovnik (0.04)
  - Belgium > Flanders
    - Flemish Brabant > Leuven (0.04)
- Asia
  - Myanmar > Tanintharyi Region
    - Dawei (0.04)
  - Middle East > UAE
    - Abu Dhabi Emirate > Abu Dhabi (0.04)
- Africa > Ethiopia
  - Addis Ababa > Addis Ababa (0.04)

Genre:
- Research Report > New Finding (0.48)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (1.00)
  - Machine Learning (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found