Researchers propose game-based benchmark for AI's commonsense reasoning