From Dissonance to Insights: Dissecting Disagreements in Rationale Construction for Case Outcome Classification
Xu, Shanshan, Santosh, T. Y. S. S, Ichim, Oana, Risini, Isabella, Plank, Barbara, Grabmair, Matthias
arXiv.org Artificial Intelligence
In legal NLP, Case Outcome Classification (COC) must not only be accurate but also trustworthy and explainable. Existing work in explainable COC has been limited to annotations by a single expert. However, it is well known that lawyers may disagree in their assessment of case facts. We hence collect a novel dataset, RAVE: Rationale Variation in ECHR, which is obtained from two experts in the domain of international human rights law, for whom we observe weak agreement. We study their disagreements and build a two-level task-independent taxonomy, supplemented with COC-specific subcategories. To our knowledge, this is the first work in legal NLP that focuses on human label variation. We quantitatively assess different taxonomy categories and find that disagreements mainly stem from underspecification of the legal context, which poses challenges given the typically limited granularity and noise in COC metadata. We further assess the explainability of SOTA COC models on RAVE and observe limited agreement between models and experts. Overall, our case study reveals hitherto underappreciated complexities in creating benchmark datasets in legal NLP that revolve around identifying aspects of a case's facts supposedly relevant to its outcome.
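The abstract reports weak agreement between the two experts but does not say here which agreement measure was used. As a rough illustration only, the sketch below computes Cohen's kappa, one common chance-corrected agreement statistic, over binary rationale marks. The function name `cohen_kappa`, the paragraph-level granularity, and the toy labels are all assumptions made for this example and are not taken from RAVE.

```python
def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length label sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(a)
    # Observed agreement: fraction of items labelled identically.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Expected agreement under independence, from each annotator's label rates.
    labels = set(a) | set(b)
    expected = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical marks: 1 = paragraph selected as rationale, 0 = not selected.
expert_1 = [1, 1, 0, 0, 1, 0, 0, 0]
expert_2 = [1, 0, 0, 0, 1, 1, 0, 0]

kappa = cohen_kappa(expert_1, expert_2)
print(f"kappa = {kappa:.3f}")
```

The same function could in principle compare a model's highlighted paragraphs against one expert's marks, which is the kind of model-versus-expert comparison the abstract describes.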
Oct-24-2023
- Country:
- Asia > Middle East
- UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
- Europe
- Croatia (0.04)
- Denmark > Capital Region
- Copenhagen (0.04)
- Germany > Bavaria
- Upper Bavaria > Munich (0.04)
- Switzerland (0.04)
- North America > United States
- Maryland > Baltimore (0.04)
- Washington > King County
- Seattle (0.04)
- Genre:
- Research Report > New Finding (1.00)
- Industry:
- Law > Civil Rights & Constitutional Law (1.00)
- Technology: