CLAVE: An Adaptive Framework for Evaluating Values of LLM Generated Responses

Neural Information Processing Systems 

The rapid progress of Large Language Models (LLMs) poses potential risks, such as the generation of unethical content. Assessing the values embedded in LLM-generated responses can help expose misalignment, but this relies on reference-free value evaluators, e.g.