Confidence Calibration in Large Language Model-Based Entity Matching
Kamsteeg, Iris, Cardenas-Cartagena, Juan, van Beers, Floris, Holt, Gineke ten, Tashu, Tsegaye Misikir, Valdenegro-Toro, Matias
–arXiv.org Artificial Intelligence
This research aims to explore the intersection of Large Language Models and confidence calibration in Entity Matching. To this end, we perform an empirical study to compare baseline RoBERTa confidences for an Entity Matching task against confidences that are calibrated using Temperature Scaling, Monte Carlo Dropout and Ensembles. We use the Abt-Buy, DBLP-ACM, iTunes-Amazon and Company datasets. The findings indicate that the proposed modified RoBERTa model exhibits a slight overconfidence, with Expected Calibration Error scores ranging from 0.0043 to 0.0552 across datasets. We find that this overconfidence can be mitigated using Temperature Scaling, reducing Expected Calibration Error scores by up to 23.83%.
arXiv.org Artificial Intelligence
Oct-17-2025
- Country:
- Asia > Middle East
- UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
- Europe
- Denmark > Capital Region
- Copenhagen (0.04)
- Netherlands (0.04)
- Switzerland (0.04)
- Denmark > Capital Region
- North America > United States
- Hawaii > Honolulu County
- Honolulu (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- Hawaii > Honolulu County
- Oceania > Australia
- New South Wales > Sydney (0.04)
- Asia > Middle East
- Genre:
- Research Report (1.00)
- Industry:
- Health & Medicine (0.67)
- Technology: