The Cambridge Law Corpus: A Dataset for Legal AI Research

Jan-19-2025, 11:20:04 GMT–Neural Information Processing Systems

We introduce the Cambridge Law Corpus (CLC), a corpus for legal AI research. It consists of over 250 000 court cases from the UK. Most cases are from the 21st century, but the corpus includes cases as old as the 16th century. This paper presents the first release of the corpus, containing the raw text and metadata. Together with the corpus, we provide annotations on case outcomes for 638 cases, made by legal experts.

cambridge law corpus, dataset, legal ai research

Industry:
- Law (1.00)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (0.32)