Tighter Bounds on the Information Bottleneck with Application to Deep Learning