An Introduction to Bi-level Optimization: Foundations and Applications in Signal Processing and Machine Learning