Compilation and Optimizations for Efficient Machine Learning on Embedded Systems