Understanding temperature tuning in energy-based models

Peter W. Fields, Vudtiwat Ngampruetikorn, David J. Schwab, Stephanie E. Palmer

arXiv.org Artificial Intelligence 

Energy-based models trained on evolutionary data can now generate novel protein sequences with custom functions [38]. A crucial, yet poorly understood, step in these successes is the use of an artificially low sampling "temperature" to produce functional sequences from the trained model. This adjustment is often the deciding factor between generating functional enzymes and inert polypeptides. A fundamental question thus arises: what necessitates temperature tuning, and what does it reveal about the space of functional proteins and the limits of models trained on finite data? Temperature tuning is a heuristic used broadly across machine learning to improve training [16, 33, 34], generalization and generative performance [14, 45, 47, 48], and energy-landscape dynamics for memory retrieval [35]. It follows the basic intuition that temperature navigates the trade-off between fidelity (producing believable, high-probability outputs at low temperature) and diversity (exploring a wide range of novel outputs at high temperature). Despite its widespread use, this practice lacks a principled, quantitative explanation and has not been systematically connected to known issues of the fitting procedure, in particular fundamental limits of the learning process such as the biases introduced by training on finite data [5, 9, 10, 21, 22, 41].
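To make the role of the sampling temperature concrete, the sketch below draws sequences from p_T(x) ∝ exp(-E(x)/T) via Metropolis sampling at several temperatures. The Potts-like energy, random parameters, alphabet size, and step counts are illustrative assumptions for a toy model, not the trained models or protein data from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

A, L = 20, 50  # amino-acid alphabet size and toy sequence length (assumptions)
h = rng.normal(size=(L, A))                # per-site fields (random stand-ins for a fit model)
J = 0.5 * rng.normal(size=(L - 1, A, A))   # nearest-neighbour couplings (stand-ins)

def energy(x):
    """Potts-like energy: lower energy corresponds to higher model probability."""
    return -(h[np.arange(L), x].sum() + J[np.arange(L - 1), x[:-1], x[1:]].sum())

def sample(T, n_steps=20_000):
    """Metropolis sampling from p_T(x) proportional to exp(-E(x)/T)."""
    x = rng.integers(A, size=L)
    E = energy(x)
    for _ in range(n_steps):
        i, a = rng.integers(L), rng.integers(A)    # propose a single-site mutation
        y = x.copy()
        y[i] = a
        E_new = energy(y)
        # Boltzmann acceptance rule; lowering T suppresses energy-raising moves
        if E_new <= E or rng.random() < np.exp(-(E_new - E) / T):
            x, E = y, E_new
    return E

# Fidelity-diversity trade-off: low T concentrates samples on low-energy
# (high-probability) configurations; high T trades probability for diversity.
for T in (0.5, 1.0, 2.0):
    print(f"T = {T}: final energy {sample(T):.1f}")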