VLSI Mask Optimization: From Shallow To Deep Learning