Investigating Test-Time Scaling with Reranking for Machine Translation