Forecasting Rare Language Model Behaviors