Benchmarking Bias in Large Language Models during Role-Playing