Knowledge boosting during low-latency inference