Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs
