Solving Machine Learning Performance Anti-Patterns: A Systematic Approach

These principles are in rough order of priority, and like all guidelines, there are times when they should be broken. Next we'll take a tour through some major patterns of suboptimal performance -- many of which map directly to violations of these principles. Machine learning systems show distinct patterns of resource consumption, and each pattern requires a different approach to improving performance. Real-world systems usually exhibit several patterns in different parts of the inference pipeline, so we'll often need to apply more than one of the approaches below. For example, post-processing logic is highly prone to being CPU compute bound or synchronization bound, whereas the backbone of a vision model is often GPU compute bound.
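
Before picking an approach, it helps to know which pattern each stage of the pipeline actually shows. The sketch below is one rough way to get a first signal, not a substitute for a real profiler: it times each stage with both a wall clock and the process CPU clock and compares the two. The names `StageProfiler`, `busy_postprocess`, and `blocked_wait`, as well as the 0.8 threshold, are hypothetical and chosen only for illustration. A stage whose CPU time is close to its wall time is probably CPU compute bound; a stage with little CPU time is mostly waiting on something else (a lock, I/O, or a device), and it takes further investigation to say which.

```python
import time
from contextlib import contextmanager


class StageProfiler:
    """Accumulates wall-clock and process CPU time per named stage."""

    def __init__(self):
        self.wall = {}  # stage name -> total wall-clock seconds
        self.cpu = {}   # stage name -> total process CPU seconds

    @contextmanager
    def stage(self, name):
        wall_start = time.perf_counter()
        cpu_start = time.process_time()
        try:
            yield
        finally:
            wall_elapsed = time.perf_counter() - wall_start
            cpu_elapsed = time.process_time() - cpu_start
            self.wall[name] = self.wall.get(name, 0.0) + wall_elapsed
            self.cpu[name] = self.cpu.get(name, 0.0) + cpu_elapsed

    def classify(self, name, busy_threshold=0.8):
        """Crude heuristic: compare CPU time to wall time for a stage.

        The threshold is an arbitrary assumption. process_time() sums CPU
        time over all threads in the process, so the ratio can exceed 1.0
        when other threads are busy during the stage.
        """
        wall = self.wall[name]
        ratio = self.cpu[name] / wall if wall > 0 else 0.0
        return "cpu-bound" if ratio >= busy_threshold else "waiting"


def busy_postprocess(n):
    """Stand-in for CPU-heavy post-processing (pure Python arithmetic)."""
    return sum(i * i for i in range(n))


def blocked_wait(seconds):
    """Stand-in for a stage that blocks on a lock, I/O, or a device."""
    time.sleep(seconds)
```

To use it, wrap each stage of the pipeline in `with profiler.stage("name"):` and call `profiler.classify("name")` after a representative run. The "waiting" label deliberately lumps synchronization, I/O, and device waits together, since host-side timers alone cannot tell them apart. A host thread that spin-waits on a device can also look CPU-bound under this heuristic, so treat the result as a hint about where to look rather than a diagnosis.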