A Collaboratively Built Benchmark for Measuring Legal Reasoning in Large Language Models
