Measuring Value Understanding in Language Models through Discriminator-Critique Gap