B4: Towards Optimal Assessment of Plausible Code Solutions with Plausible Tests