Cross-replication Reliability -- An Empirical Approach to Interpreting Inter-rater Reliability