Sentence Smith: Formally Controllable Text Transformation and its Application to Evaluation of Text Embedding Models
Li, Hongji, Michail, Andrianos, Gubelmann, Reto, Clematide, Simon, Opitz, Juri
arXiv.org Artificial Intelligence
We propose the Sentence Smith framework that enables controlled and specified manipulation of text meaning. It consists of three main steps: 1. Parsing a sentence into a semantic graph, 2. Applying human-designed semantic manipulation rules, and 3. Generating text from the manipulated graph. A final filtering step (4.) ensures the validity of the applied transformation. To demonstrate the utility of Sentence Smith in an application study, we use it to generate hard negative pairs that challenge text embedding models. Since the controllable generation makes it possible to clearly isolate different types of semantic shifts, we can gain deeper insights into the specific strengths and weaknesses of widely used text embedding models, also addressing an issue in current benchmarking where linguistic phenomena remain opaque. Human validation confirms that the generations produced by Sentence Smith are highly accurate.
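The four steps named in the abstract can be sketched as a small pipeline. This is only an illustrative sketch, not the authors' implementation: the `Pipeline` class and the helpers `toy_parse`, `swap_roles`, `toy_generate`, and `differs` are hypothetical stand-ins, and the "semantic graph" here is a toy set of triples for three-word sentences rather than a real semantic parse.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Optional, Tuple

# Toy "semantic graph": a set of (predicate, role, argument) triples.
# This representation is an assumption made for illustration only.
Graph = FrozenSet[Tuple[str, str, str]]


@dataclass
class Pipeline:
    parse: Callable[[str], Graph]             # step 1: sentence -> graph
    rule: Callable[[Graph], Optional[Graph]]  # step 2: hand-written manipulation
    generate: Callable[[Graph], str]          # step 3: graph -> sentence
    is_valid: Callable[[str, str], bool]      # step 4: filter on (original, candidate)

    def transform(self, sentence: str) -> Optional[str]:
        graph = self.parse(sentence)
        changed = self.rule(graph)
        if changed is None:  # the rule does not apply to this graph
            return None
        candidate = self.generate(changed)
        # Keep the candidate only if the filter accepts it.
        return candidate if self.is_valid(sentence, candidate) else None


def toy_parse(sentence: str) -> Graph:
    """Hypothetical parser: handles only 'subject verb object' sentences."""
    words = sentence.split()
    if len(words) != 3:
        raise ValueError("toy parser expects exactly three words")
    subj, verb, obj = words
    return frozenset({(verb, "agent", subj), (verb, "patient", obj)})


def swap_roles(graph: Graph) -> Optional[Graph]:
    """Hypothetical rule: exchange the agent and patient roles."""
    flip = {"agent": "patient", "patient": "agent"}
    return frozenset((pred, flip[role], arg) for pred, role, arg in graph)


def toy_generate(graph: Graph) -> str:
    """Hypothetical generator: linearise the triples back into a sentence."""
    roles = {role: arg for _, role, arg in graph}
    verb = next(iter(graph))[0]
    return f"{roles['agent']} {verb} {roles['patient']}"


def differs(original: str, candidate: str) -> bool:
    """Hypothetical filter: reject candidates identical to the input."""
    return original != candidate


pipeline = Pipeline(toy_parse, swap_roles, toy_generate, differs)
print(pipeline.transform("dogs chase cats"))
```

Because each step is a separate callable, a real parser, rule set, generator, or validity check could replace the toy versions without changing `transform`. The pair made of the input sentence and the returned sentence is the kind of minimally different pair the abstract describes as a hard negative.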
Feb-25-2025
- Country:
- North America
- Dominican Republic (0.04)
- United States
- Washington > King County
- Seattle (0.04)
- Pennsylvania > Philadelphia County
- Philadelphia (0.04)
- New Mexico > Santa Fe County
- Santa Fe (0.04)
- Florida > Miami-Dade County
- Miami (0.04)
- California > San Diego County
- San Diego (0.04)
- Mexico > Mexico City
- Mexico City (0.04)
- Canada > Ontario
- Toronto (0.04)
- Europe
- United Kingdom (0.14)
- Russia (0.04)
- Switzerland > Zürich
- Zürich (0.04)
- Spain > Catalonia
- Barcelona Province > Barcelona (0.04)
- Middle East > Malta
- Eastern Region > Northern Harbour District > St. Julian's (0.04)
- Italy > Tuscany
- Florence (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- Iceland > Capital Region
- Reykjavik (0.04)
- Genre:
- Research Report (1.00)
- Workflow (0.68)
- Technology: