On the Reliability of Sampling Strategies in Offline Recommender Evaluation
Pereira, Bruno L.; Said, Alan; Santos, Rodrygo L. T.
arXiv.org Artificial Intelligence
Offline evaluation plays a central role in benchmarking recommender systems when online testing is impractical or risky. However, it is susceptible to two key sources of bias: exposure bias, where users only interact with items they are shown, and sampling bias, introduced when evaluation is performed on a subset of logged items rather than the full catalog. While prior work has proposed methods to mitigate sampling bias, these are typically assessed on fixed logged datasets rather than for their ability to support reliable model comparisons under varying exposure conditions or relative to true user preferences. In this paper, we investigate how different combinations of logging and sampling choices affect the reliability of offline evaluation. Using a fully observed dataset as ground truth, we systematically simulate diverse exposure biases and assess the reliability of common sampling strategies along four dimensions: sampling resolution (recommender model separability), fidelity (agreement with full evaluation), robustness (stability under exposure bias), and predictive power (alignment with ground truth). Our findings highlight when and how sampling distorts evaluation outcomes and offer practical guidance for selecting strategies that yield faithful and robust offline comparisons.
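As a rough illustration of the fidelity dimension (agreement between sampled and full evaluation), the sketch below compares how two evaluation setups rank the same set of recommender models. Kendall's tau is used here as an assumed agreement measure; the abstract does not say which statistic the authors use, and the model names and scores are hypothetical placeholders, not results from the paper.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two model rankings given as {model: score} dicts.

    Assumption: rank agreement stands in for "fidelity"; the paper may
    define it differently.
    """
    models = sorted(scores_a)
    concordant = 0
    discordant = 0
    for m1, m2 in combinations(models, 2):
        diff_a = scores_a[m1] - scores_a[m2]
        diff_b = scores_b[m1] - scores_b[m2]
        if diff_a * diff_b > 0:
            concordant += 1  # both evaluations order the pair the same way
        elif diff_a * diff_b < 0:
            discordant += 1  # the evaluations disagree on this pair
    n_pairs = len(models) * (len(models) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical metric values for three models (illustrative only).
full_eval = {"model_a": 0.30, "model_b": 0.25, "model_c": 0.10}
sampled_eval = {"model_a": 0.60, "model_b": 0.65, "model_c": 0.40}

tau = kendall_tau(full_eval, sampled_eval)
```

A value of 1 means the sampled evaluation preserves the full-evaluation ordering of models, while lower values indicate that sampling has swapped some pairwise comparisons.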
Aug-12-2025
- Country:
  - Asia
    - China > Hong Kong (0.04)
    - Singapore > Central Region
      - Singapore (0.04)
  - Europe
    - Switzerland > Geneva
      - Geneva (0.04)
    - Ireland > Leinster
      - County Dublin > Dublin (0.04)
    - Czechia
      - Prague (0.05)
      - South Moravian Region > Brno (0.04)
    - Sweden > Vaestra Goetaland
      - Gothenburg (0.04)
    - France > Auvergne-Rhône-Alpes
    - Greece > Attica
      - Athens (0.04)
    - Denmark > Capital Region
      - Copenhagen (0.04)
    - Spain
      - Catalonia > Barcelona Province
        - Barcelona (0.04)
      - Galicia > Madrid (0.04)
    - Netherlands > North Holland
      - Amsterdam (0.04)
    - Austria > Vienna (0.14)
  - North America
    - Canada
      - British Columbia > Metro Vancouver Regional District
        - Vancouver (0.14)
      - Quebec > Montreal (0.04)
    - United States
      - New York > New York County
        - New York City (0.17)
      - District of Columbia > Washington (0.04)
      - Washington > King County
        - Seattle (0.04)
      - Georgia > Fulton County
        - Atlanta (0.14)
      - Virginia > Arlington County
        - Arlington (0.04)
      - Massachusetts > Suffolk County
        - Boston (0.04)
      - Nevada > Clark County
        - Las Vegas (0.04)
      - Michigan > Washtenaw County
        - Ann Arbor (0.04)
      - Texas > Harris County
        - Houston (0.04)
  - South America > Brazil
    - Minas Gerais > Belo Horizonte (0.04)
- Genre:
  - Research Report > New Finding (1.00)
- Technology: