Deep Value Benchmark: Measuring Whether Models Generalize Deep Values or Shallow Preferences
