Context-Aware Testing: A New Paradigm for Model Testing with Large Language Models
–Neural Information Processing Systems
The predominant de facto paradigm of testing ML models relies either on using only held-out data to compute aggregate evaluation metrics or on assessing performance on different subgroups.
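The two evaluation styles named above can be illustrated with a minimal sketch. It is not taken from the paper: the labels, predictions, and subgroup names are hypothetical toy data, and accuracy stands in for whatever aggregate metric is actually used.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    if not y_true:
        raise ValueError("need at least one example")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def subgroup_accuracy(y_true, y_pred, groups):
    """Accuracy computed separately for each subgroup label."""
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    return {g: accuracy(ts, ps) for g, (ts, ps) in buckets.items()}


# Hypothetical held-out labels, predictions, and subgroup membership.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "a", "b", "b", "b"]

overall = accuracy(y_true, y_pred)                      # aggregate metric
per_group = subgroup_accuracy(y_true, y_pred, groups)   # subgroup metrics
```

On this toy data the aggregate accuracy is 0.75, while subgroup "b" scores only one in three, which is the kind of gap a single aggregate number hides.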
Nov-20-2025, 04:08:30 GMT
- Country:
- Europe > United Kingdom
- England
- Cambridgeshire > Cambridge (0.04)
- East Midlands (0.04)
- West Midlands (0.04)
- Scotland (0.04)
- Wales (0.04)
- North America > United States (0.04)
- Genre:
- Research Report
- Experimental Study (1.00)
- New Finding (0.67)
- Industry:
- Education (1.00)
- Health & Medicine
- Consumer Health (0.92)
- Diagnostic Medicine (0.68)
- Therapeutic Area > Oncology (1.00)
- Technology:
- Information Technology > Artificial Intelligence
- Cognitive Science (1.00)
- Machine Learning
- Neural Networks > Deep Learning (1.00)
- Performance Analysis > Accuracy (1.00)
- Statistical Learning (0.93)
- Natural Language > Large Language Model (1.00)
- Representation & Reasoning (1.00)