Simpler becomes Harder: Do LLMs Exhibit a Coherent Behavior on Simplified Corpora?
Miriam Anschütz, Edoardo Mosca, Georg Groh
arXiv.org Artificial Intelligence
Text simplification seeks to improve readability while retaining the original content and meaning. Our study investigates whether pre-trained classifiers also maintain such coherence by comparing their predictions on both original and simplified inputs. We conduct experiments using 11 pre-trained models, including BERT and OpenAI's GPT 3.5, across six datasets spanning three languages. Additionally, we conduct a detailed analysis of the correlation between prediction change rates and simplification types/strengths. Our findings reveal alarming inconsistencies across all languages and models. If not promptly addressed, simplified inputs can be easily exploited to craft zero-iteration model-agnostic adversarial attacks with success rates of up to 50%.
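As a rough illustration of the comparison the abstract describes, the sketch below computes a prediction change rate: the fraction of samples whose predicted label differs between the original and the simplified input. The function name `prediction_change_rate` and the example labels are hypothetical, and the paper's exact metric definition may differ.

```python
from typing import Hashable, Sequence


def prediction_change_rate(
    original_preds: Sequence[Hashable],
    simplified_preds: Sequence[Hashable],
) -> float:
    """Fraction of aligned samples whose predicted label changes.

    Assumes both sequences hold one predicted label per sample, in the
    same order (sample i of the original corpus pairs with sample i of
    the simplified corpus).
    """
    if len(original_preds) != len(simplified_preds):
        raise ValueError("prediction lists must have the same length")
    if len(original_preds) == 0:
        raise ValueError("prediction lists must not be empty")

    # Count pairs where the classifier's label differs after simplification.
    changed = sum(o != s for o, s in zip(original_preds, simplified_preds))
    return changed / len(original_preds)


# Hypothetical labels from one classifier on four sentence pairs.
original = ["pos", "neg", "pos", "neg"]
simplified = ["pos", "pos", "pos", "neg"]
rate = prediction_change_rate(original, simplified)
```

A perfectly coherent classifier would yield a rate of 0.0 under this definition; any sample counted here changed its label even though simplification is meant to preserve meaning.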
Apr-10-2024
- Country:
  - North America
    - Dominican Republic (0.04)
    - United States > Colorado (0.04)
    - Canada > Ontario
      - Toronto (0.05)
  - Europe
    - United Kingdom > England
      - Cambridgeshire > Cambridge (0.04)
    - Ireland > Leinster
      - County Dublin > Dublin (0.04)
    - Germany > Bavaria
      - Upper Bavaria > Munich (0.04)
    - France > Provence-Alpes-Côte d'Azur
      - Bouches-du-Rhône > Marseille (0.04)
    - Denmark > Capital Region
      - Copenhagen (0.04)
  - Asia > Middle East
    - UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
- Genre:
  - Research Report > New Finding (0.67)
- Industry:
  - Information Technology > Security & Privacy (0.35)
- Technology: