Can semi-supervised learning use all the data effectively? A lower bound perspective