Measuring the Inconsistency of Large Language Models in Preferential Ranking
