Prioritizing High-Consequence Biological Capabilities in Evaluations of Artificial Intelligence Models