justice
A Survey and Datasheet Repository of Publicly Available US Criminal Justice Datasets
Criminal justice is an increasingly important application domain for machine learning and algorithmic fairness, as predictive tools are becoming widely used in policing, courts, and prison systems worldwide. A few relevant benchmarks have received significant attention, e.g., the COMPAS dataset, often without proper consideration of the domain context. To raise awareness of publicly available criminal justice datasets and encourage their responsible use, we conduct a survey, consider contexts, highlight potential uses, and identify gaps and limitations. We provide datasheets for 15 datasets and upload them to a public repository. We compare the datasets across several dimensions, including size, coverage of the population, and potential use, highlighting concerns. We hope that this work can provide a useful starting point for researchers looking for appropriate datasets related to criminal justice, and that the repository will continue to grow as a community effort.
That's So FETCH: Fashioning Ensemble Techniques for LLM Classification in Civil Legal Intake and Referral
Each year millions of people seek help for their legal problems by calling a legal aid program hotline, walking into a legal aid office, or using a lawyer referral service. The first step to match them to the right help is to identify the legal problem the applicant is experiencing. Misdirection has consequences. Applicants may miss a deadline, experience physical abuse, lose housing or lose custody of children while waiting to connect to the right legal help. We introduce and evaluate the FETCH classifier for legal issue classification and describe two methods for improving accuracy: a hybrid LLM/ML ensemble classification method, and the automatic generation of follow-up questions to enrich the initial problem narrative. We employ a novel dataset of 419 real-world queries to a nonprofit lawyer referral service. Ultimately, we show classification accuracy (hits@2) of 97.37% using a mix of inexpensive models, exceeding the performance of the current state-of-the-art GPT-5 model. Our approach shows promise in significantly reducing the cost of guiding users of the legal system to the right resource for their problem while achieving high accuracy.
- North America > United States > Massachusetts > Suffolk County > Boston (0.14)
- Europe > Austria > Vienna (0.14)
- North America > United States > Oregon (0.06)
- (4 more...)
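The FETCH abstract above reports accuracy as hits@2: a query counts as correctly classified if the true legal issue appears among the classifier's top two ranked labels. A minimal sketch of that metric, with hypothetical issue labels (the actual FETCH taxonomy and pipeline are not shown here):

```python
def hits_at_k(ranked_predictions, true_labels, k=2):
    """Fraction of queries whose true label appears among the
    top-k labels ranked by the classifier."""
    hits = sum(
        1 for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)

# Toy example with hypothetical legal-issue labels:
ranked = [
    ["housing", "family"],       # truth "housing" -> hit
    ["consumer", "employment"],  # truth "employment" -> hit
    ["immigration", "criminal"], # truth "housing" -> miss
]
truths = ["housing", "employment", "housing"]
print(hits_at_k(ranked, truths, k=2))  # 2 of 3 correct
```

hits@1 reduces to ordinary top-1 accuracy, so the same function covers both of the paper's plausible reporting choices.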
LegalWebAgent: Empowering Access to Justice via LLM-Based Web Agents
Tan, Jinzhe, Benyekhlef, Karim
Access to justice remains a global challenge, with many citizens still finding it difficult to seek help from the justice system when facing legal issues. Although the internet provides abundant legal information and services, navigating complex websites, understanding legal terminology, and filling out procedural forms continue to pose barriers to accessing justice. This paper introduces the LegalWebAgent framework that employs a web agent powered by multimodal large language models to bridge the gap in access to justice for ordinary citizens. The framework combines the natural language understanding capabilities of large language models with multimodal perception, enabling a complete process from user query to concrete action. It operates in three stages: the Ask Module understands user needs through natural language processing; the Browse Module autonomously navigates webpages, interacts with page elements (including forms and calendars), and extracts information from HTML structures and webpage screenshots; the Act Module synthesizes information for users or performs direct actions like form completion and schedule booking. To evaluate its effectiveness, we designed a benchmark test covering 15 real-world tasks, simulating typical legal service processes relevant to Québec civil law users, from problem identification to procedural operations. Evaluation results show LegalWebAgent achieved a peak success rate of 86.7%, with an average of 84.4% across all tested models, demonstrating high autonomy in complex real-world scenarios.
- North America > Canada > Quebec > Montreal (0.05)
- North America > Canada > Alberta > Census Division No. 13 > Westlock County (0.04)
- North America > Canada > Alberta > Census Division No. 11 > Sturgeon County (0.04)
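The LegalWebAgent abstract describes a three-stage Ask/Browse/Act pipeline. As a hedged structural sketch only, the stages below are placeholder stubs (the real modules call multimodal LLMs and drive a browser, none of which is reproduced here):

```python
from dataclasses import dataclass

@dataclass
class AgentResult:
    intent: str
    findings: list
    action: str

# Hypothetical stand-ins for the three stages named in the abstract.
def ask(query: str) -> str:
    """Ask Module: reduce a natural-language query to a task intent."""
    return f"intent({query})"

def browse(intent: str) -> list:
    """Browse Module: navigate pages and extract relevant snippets."""
    return [f"page snippet for {intent}"]

def act(intent: str, findings: list) -> str:
    """Act Module: synthesize an answer or perform a concrete action."""
    return f"summary of {len(findings)} finding(s) for {intent}"

def run_pipeline(query: str) -> AgentResult:
    intent = ask(query)
    findings = browse(intent)
    return AgentResult(intent, findings, act(intent, findings))

result = run_pipeline("How do I contest a rent increase?")
print(result.action)
```

The point of the staging is that each module's output is the next module's input, so failures can be localized to understanding, navigation, or action.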
The age of unipolar diplomacy is coming to an end
What is a Palestinian without olives? In Gaza, the world has seen the cost of a diplomacy that claims to uphold a rules-based order but applies it selectively. The United States intervened late, and only to defend an occupation the International Court of Justice (ICJ) has ruled illegal. Alongside other Western nations that built multilateral institutions, the US increasingly pursues nationalist agendas that undermine them. The hypocrisy is stark: one set of rules for Ukraine, another for Gaza.
- North America > United States (0.91)
- Asia > Middle East > Palestine > Gaza Strip > Gaza Governorate > Gaza (0.52)
- Europe > Ukraine (0.25)
- (11 more...)
- Government (1.00)
- Law > International Law (0.90)
Can LLMs Create Legally Relevant Summaries and Analyses of Videos?
Hoeben-Kuil, Lyra, van Dijck, Gijs, Savelka, Jaromir, Gunawan, Johanna, Kollnig, Konrad, Kolacz, Marta, Duffourc, Mindy, Chakravarthy, Shashank, Westermann, Hannes
Understanding the legally relevant factual basis of an event and conveying it through text is a key skill of legal professionals. This skill is important for preparing forms (e.g., insurance claims) or other legal documents (e.g., court claims), but often presents a challenge for laypeople. Current AI approaches aim to bridge this gap, but mostly rely on the user to articulate what has happened in text, which may be challenging for many. Here, we investigate the capability of large language models (LLMs) to understand and summarize events occurring in videos. We ask an LLM to summarize and draft legal letters, based on 120 YouTube videos showing legal issues in various domains. Overall, 71.7% of the summaries were rated as of high or medium quality, a promising result that opens the door to a number of applications, e.g., in access to justice.
- North America > Canada (0.14)
- Europe > Netherlands > Limburg > Maastricht (0.05)
- North America > United States (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > Illinois > Cook County > Chicago (0.05)
- North America > United States > Virginia (0.04)
- (8 more...)
- Research Report (1.00)
- Questionnaire & Opinion Survey (0.68)
- Law > Criminal Law (1.00)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
- Health & Medicine (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
How the Supreme Court Defines Liberty
Recent memoirs by the Justices reveal how a new vision of restraint has led to radical outcomes. To understand how grudging Amy Coney Barrett's new book is when it comes to revealing personal details, consider that one of the family members the Supreme Court Justice most often refers to is a great-grandmother who died five years before she was born. On Barrett's desk at home, she recounts in "Listening to the Law," she keeps a photograph of her great-grandmother's one-story house, where, as a widow during the Great Depression, she raised some of her thirteen children and took in other needy relatives. "Looking at the photo reminds me of a woman who stretched herself beyond all reasonable capacity," Barrett explains. "I'm not sure that I'll be able to manage my life with the same grace that she had. But she motivates me to keep trying." For Barrett, the mother of seven children, that effort entails setting her alarm for 5 A.M. "Our kids get up at six thirty during the school year, so I start early if I want to accomplish anything on my own to-do list," she writes. This is what passes for disclosure from Barrett; she measures out the details of her life with coffee spoons, careful not to spill.
- North America > Haiti (0.14)
- North America > United States > New York (0.05)
- North America > United States > California (0.05)
- (9 more...)
- Law > Government & the Courts (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
- Education (1.00)
BengaliMoralBench: A Benchmark for Auditing Moral Reasoning in Large Language Models within Bengali Language and Culture
Ridoy, Shahriyar Zaman, Wasi, Azmine Toushik, Tonmoy, Koushik Ahamed
As multilingual Large Language Models (LLMs) gain traction across South Asia, their alignment with local ethical norms, particularly for Bengali, which is spoken by over 285 million people and ranked 6th globally, remains underexplored. Existing ethics benchmarks are largely English-centric and shaped by Western frameworks, overlooking cultural nuances critical for real-world deployment. To address this, we introduce BengaliMoralBench, the first large-scale ethics benchmark for the Bengali language and socio-cultural contexts. It covers five moral domains (Daily Activities, Habits, Parenting, Family Relationships, and Religious Activities), subdivided into 50 culturally relevant subtopics. Each scenario is annotated via native-speaker consensus using three ethical lenses: Virtue, Commonsense, and Justice ethics. We conduct systematic zero-shot evaluation of prominent multilingual LLMs, including Llama, Gemma, Qwen, and DeepSeek, using a unified prompting protocol and standard metrics. Performance varies widely (50-91% accuracy), with qualitative analysis revealing consistent weaknesses in cultural grounding, commonsense reasoning, and moral fairness. BengaliMoralBench provides a foundation for responsible localization, enabling culturally aligned evaluation and supporting the deployment of ethically robust AI in diverse, low-resource multilingual settings such as Bangladesh.
- Asia > Bangladesh (0.24)
- North America > United States (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- (2 more...)
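The BengaliMoralBench abstract describes zero-shot evaluation against native-speaker consensus labels under a unified prompting protocol, scored by accuracy. A hedged sketch of such a loop, where `query_model` is a placeholder for an actual LLM call and the scenarios and labels are invented for illustration:

```python
def query_model(prompt: str) -> str:
    """Hypothetical stand-in for a zero-shot LLM call."""
    return "acceptable"  # placeholder: always answers the same way

def evaluate(scenarios, lens="Justice"):
    """Accuracy of model answers against consensus gold labels,
    prompting under a single ethical lens."""
    correct = 0
    for scenario, gold in scenarios:
        prompt = f"[{lens} ethics] Is the following acceptable? {scenario}"
        if query_model(prompt).strip().lower() == gold:
            correct += 1
    return correct / len(scenarios)

scenarios = [
    ("Helping a neighbour during festival preparations.", "acceptable"),
    ("Ignoring a parent's request without reason.", "unacceptable"),
]
print(evaluate(scenarios))  # 0.5 with the always-"acceptable" placeholder
```

Running the same loop once per lens (Virtue, Commonsense, Justice) yields the per-lens accuracies the benchmark reports; only the prompt prefix changes.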
Depth and Autonomy: A Framework for Evaluating LLM Applications in Social Science Research
Large language models (LLMs) are increasingly utilized by researchers across a wide range of domains, and qualitative social science is no exception; however, this adoption faces persistent challenges, including interpretive bias, low reliability, and weak auditability. We introduce a framework that situates LLM usage along two dimensions, interpretive depth and autonomy, thereby offering a straightforward way to classify LLM applications in qualitative research and to derive practical design recommendations. We present the state of the literature with respect to these two dimensions, based on all published social science papers available on Web of Science that use LLMs as a tool and not strictly as the subject of study. Rather than granting models expansive freedom, our approach encourages researchers to decompose tasks into manageable segments, much as they would when delegating work to capable undergraduate research assistants. By maintaining low levels of autonomy and selectively increasing interpretive depth only where warranted and under supervision, one can plausibly reap the benefits of LLMs while preserving transparency and reliability.
- Europe > Austria > Vienna (0.14)
- Africa > Middle East > Egypt (0.04)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- (12 more...)
- Law (1.00)
- Government (1.00)
- Health & Medicine (0.67)
The Verification-Value Paradox: A Normative Critique of Gen AI in Legal Practice
It is often claimed that machine learning-based generative AI products will drastically streamline and reduce the cost of legal practice. This enthusiasm assumes lawyers can effectively manage AI's risks. Cases in Australia and elsewhere in which lawyers have been reprimanded for submitting inaccurate AI-generated content to courts suggest this paradigm must be revisited. This paper argues that a new paradigm is needed to evaluate AI use in practice, given (a) AI's disconnection from reality and its lack of transparency, and (b) lawyers' paramount duties, such as honesty, integrity, and the duty not to mislead the court. It presents an alternative model of AI use in practice that more holistically reflects these features (the verification-value paradox). That paradox suggests increases in efficiency from AI use in legal practice will be met by a correspondingly greater imperative to manually verify any outputs of that use, rendering the net value of AI use often negligible to lawyers. The paper then sets out the paradox's implications for legal practice and legal education, including for AI use but also the values that the paradox suggests should undergird legal practice: fidelity to the truth and civic responsibility.
- North America > United States > California (0.14)
- North America > Canada > Ontario > Toronto (0.14)
- Oceania > Australia > New South Wales (0.04)
- (18 more...)
- Research Report (1.00)
- Overview (1.00)
- Law > Litigation (1.00)
- Law > Government & the Courts (0.93)
- Education > Educational Setting > Higher Education (0.69)
- (2 more...)