Case-Based Reasoning
A Modal Logic for Temporal and Jurisdictional Classifier Models
Di Florio, Cecilia, Dong, Huimin, Rotolo, Antonino
Logic-based models can be used to build verification tools for machine learning classifiers employed in the legal field. ML classifiers predict the outcomes of new cases based on previous ones, thereby performing a form of case-based reasoning (CBR). In this paper, we introduce a modal logic of classifiers designed to formally capture legal CBR. We incorporate principles for resolving conflicts between precedents, by introducing into the logic the temporal dimension of cases and the hierarchy of courts within the legal system.
- North America > Canada > Manitoba (0.04)
- Europe > United Kingdom > Wales (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
LegalSearchLM: Rethinking Legal Case Retrieval as Legal Elements Generation
Kim, Chaeeun, Lee, Jinu, Hwang, Wonseok
Legal Case Retrieval (LCR), which retrieves relevant cases from a query case, is a fundamental task for legal professionals in research and decision-making. However, existing studies on LCR face two major limitations. First, they are evaluated on relatively small-scale retrieval corpora (e.g., 100-55K cases) and use a narrow range of criminal query types, which cannot sufficiently reflect the complexity of real-world legal retrieval scenarios. Second, their reliance on embedding-based or lexical matching methods often results in limited representations and legally irrelevant matches. To address these issues, we present: (1) LEGAR BENCH, the first large-scale Korean LCR benchmark, covering 411 diverse crime types in queries over 1.2M candidate cases; and (2) LegalSearchLM, a retrieval model that performs legal element reasoning over the query case and directly generates content containing those elements, grounded in the target cases through constrained decoding. Experimental results show that LegalSearchLM outperforms baselines by 6-20% on LEGAR BENCH, achieving state-of-the-art performance. It also demonstrates strong generalization to out-of-domain cases, outperforming naive generative models trained on in-domain data by 15%.
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- Africa > Eswatini > Manzini > Manzini (0.04)
- (4 more...)
- Law > Criminal Law (1.00)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.85)
- Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (0.85)
Scalable Disk-Based Approximate Nearest Neighbor Search with Page-Aligned Graph
Kang, Dingyi, Jiang, Dongming, Yang, Hanshen, Liu, Hang, Li, Bingzhe
Approximate Nearest Neighbor Search (ANNS), as the core of vector databases (VectorDBs), has become widely used in modern AI and ML systems, powering applications from information retrieval to bio-informatics. While graph-based ANNS methods achieve high query efficiency, their scalability is constrained by the available host memory. Recent disk-based ANNS approaches mitigate memory usage by offloading data to Solid-State Drives (SSDs). However, they still suffer from issues such as long I/O traversal path, misalignment with storage I/O granularity, and high in-memory indexing overhead, leading to significant I/O latency and ultimately limiting scalability for large-scale vector search. In this paper, we propose PageANN, a disk-based approximate nearest neighbor search (ANNS) framework designed for high performance and scalability. PageANN introduces a page-node graph structure that aligns logical graph nodes with physical SSD pages, thereby shortening I/O traversal paths and reducing I/O operations. Specifically, similar vectors are clustered into page nodes, and a co-designed disk data layout leverages this structure with a merging technique to store only representative vectors and topology information, avoiding unnecessary reads. To further improve efficiency, we design a memory management strategy that combines lightweight indexing with coordinated memory-disk data allocation, maximizing host memory utilization while minimizing query latency and storage overhead. Experimental results show that PageANN significantly outperforms state-of-the-art (SOTA) disk-based ANNS methods, achieving 1.85x-10.83x higher throughput and 51.7%-91.9% lower latency across different datasets and memory budgets, while maintaining comparable high recall accuracy.
- Asia > Middle East > Jordan (0.04)
- North America > United States > Texas (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.84)
- North America > United States > Wisconsin > Dane County > Madison (0.14)
- North America > Canada (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.46)
- Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.34)
Export Reviews, Discussions, Author Feedback and Meta-Reviews
First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. This paper proposes the Latent Case Model (LCM), a Bayesian approach to clustering in which clusters are represented by a prototype (a specific sample from the data) and feature subspaces (a binary subset of the variables signifying those features that are relevant to the class). The approach is presented as being a Bayesian, trainable version of the Case-Based Reasoning approach popular in AI, and is motivated by the ways such models have proved highly effective in explaining human decision making. The generative model (Figure 1) represents each item as coming from a mixture of S clusters, where each cluster is represented by a prototype and subspace (as above) and a function \phi which generates features matching those of the prototype with high probability for features in the subspace, and uniform features outside it. The model is thus similar in functionality to LDA but quite different in terms of its representation.
- Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.75)
- Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (0.60)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
- (2 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.85)
- Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (0.85)
Object-Centric Case-Based Reasoning via Argumentation
Gaul, Gabriel de Olim, Gould, Adam, Kori, Avinash, Toni, Francesca
We introduce Slot Attention Argumentation for Case-Based Reasoning (SAA-CBR), a novel neuro-symbolic pipeline for image classification that integrates object-centric learning via a neural Slot Attention (SA) component with symbolic reasoning conducted by Abstract Argumentation for Case-Based Reasoning (AA-CBR). We explore novel integrations of AA-CBR with the neural component, including feature combination strategies, casebase reduction via representative samples, novel count-based partial orders, a One-Vs-Rest strategy for extending AA-CBR to multi-class classification, and an application of Supported AA-CBR, a bipolar variant of AA-CBR. We demonstrate that SAA-CBR is an effective classifier on the CLEVR-Hans datasets, showing competitive performance against baseline models.
- Europe > United Kingdom > England > Greater London > London (0.14)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Europe > Italy (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
ALARB: An Arabic Legal Argument Reasoning Benchmark
Shairah, Harethah Abu, AlHarbi, Somayah, AlHussein, Abdulaziz, Alsabea, Sameer, Shaqaqi, Omar, AlShamlan, Hebah, Knio, Omar, Turkiyyah, George
We introduce ALARB, a dataset and suite of tasks designed to evaluate the reasoning capabilities of large language models (LLMs) within the Arabic legal domain. While existing Arabic benchmarks cover some knowledge-intensive tasks such as retrieval and understanding, substantial datasets focusing specifically on multistep reasoning for Arabic LLMs, especially in open-ended contexts, are lacking. The dataset comprises over 13K commercial court cases from Saudi Arabia, with each case including the facts presented, the reasoning of the court, the verdict, as well as the cited clauses extracted from the regulatory documents. We define a set of challenging tasks leveraging this dataset and reflecting the complexity of real-world legal reasoning, including verdict prediction, completion of reasoning chains in multistep legal arguments, and identification of relevant regulations based on case facts. We benchmark a representative selection of current open and closed Arabic LLMs on these tasks and demonstrate the dataset's utility for instruction tuning. Notably, we show that instruction-tuning a modest 12B parameter model using ALARB significantly enhances its performance in verdict prediction and Arabic verdict generation, reaching a level comparable to that of GPT-4o.
- Asia > Middle East > Saudi Arabia (0.49)
- North America > United States > Florida > Miami-Dade County > Miami (0.14)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- (10 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)