ConStat: Performance-Based Contamination Detection in Large Language Models