Law


When is using AI the rational choice? The importance of counterfactuals in AI deployment decisions

arXiv.org Artificial Intelligence

Decisions to deploy AI capabilities are often driven by counterfactuals: comparisons of decisions made using AI with decisions that would have been made had the AI not been used. Counterfactual misses, which are poor decisions attributable to using AI, may carry disproportionate disutility for AI deployment decision makers. Counterfactual hits, which are good decisions attributable to AI usage, may provide little benefit beyond the benefit of better decisions. This paper explores how to incorporate counterfactual outcomes into expected utility assessments of usage decisions. Several properties emerge when counterfactuals are explicitly included. First, there are many contexts where the expected utility of AI usage is positive for intended beneficiaries and strongly negative for stakeholders and deployment decision makers. Second, high levels of complementarity, where differing AI and user assessments are merged beneficially, often lead to substantial disutility for stakeholders. Third, apparently small changes in how users interact with an AI capability can substantially impact stakeholder utility. Fourth, cognitive biases such as expert overconfidence and hindsight bias exacerbate the perceived frequency of costly counterfactual misses. The expected utility assessment approach presented here is intended to help AI developers and deployment decision makers navigate the subtle but substantial impact of counterfactuals, and so help ensure that beneficial AI capabilities are used.
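As a rough illustration of the definitions in this abstract, the sketch below counts counterfactual hits (the AI-assisted decision is right where the unaided one would have been wrong) and counterfactual misses (the reverse), then applies an asymmetric weighting. The function names, the toy data, and the weights `hit_value` and `miss_cost` are hypothetical choices made for this example; they are not taken from the paper and do not reproduce its expected utility model.

```python
def counterfactual_counts(truth, with_ai, without_ai):
    """Count decisions whose outcome changed because the AI was used."""
    hits = misses = 0
    for t, aided, unaided in zip(truth, with_ai, without_ai):
        if aided == t and unaided != t:
            hits += 1      # good decision attributable to AI usage
        elif aided != t and unaided == t:
            misses += 1    # poor decision attributable to AI usage
    return hits, misses


def stakeholder_utility(hits, misses, hit_value=1.0, miss_cost=5.0):
    """Hypothetical weighting: a miss is assumed to cost more than a hit earns."""
    return hits * hit_value - misses * miss_cost


# Toy data (invented for illustration): 1 and 0 are the two possible decisions.
truth      = [1, 0, 1, 1, 0]
with_ai    = [1, 0, 1, 0, 0]   # 4 of 5 correct
without_ai = [0, 0, 0, 1, 0]   # 3 of 5 correct

hits, misses = counterfactual_counts(truth, with_ai, without_ai)
utility = stakeholder_utility(hits, misses)
```

In this toy case the AI-assisted decisions are more accurate overall, yet the assumed penalty on the single counterfactual miss makes the weighted total negative, which is the kind of divergence between beneficiaries and decision makers that the abstract describes.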


Constructing the Truth: Text Mining and Linguistic Networks in Public Hearings of Case 03 of the Special Jurisdiction for Peace (JEP)

arXiv.org Artificial Intelligence

Case 03 of the Special Jurisdiction for Peace (JEP), focused on the so-called false positives in Colombia, represents one of the most harrowing episodes of the Colombian armed conflict. This article proposes an innovative methodology based on natural language analysis and semantic co-occurrence models to explore, systematize, and visualize narrative patterns present in the public hearings of victims and appearing parties. By constructing skipgram networks and analyzing their modularity, the study identifies thematic clusters that reveal regional and procedural status differences, providing empirical evidence on dynamics of victimization, responsibility, and acknowledgment in this case. This computational approach contributes to the collective construction of both judicial and extrajudicial truth, offering replicable tools for other transitional justice cases. The work is grounded in the pillars of truth, justice, reparation, and non-repetition, proposing a critical and in-depth reading of contested memories.


Do Chinese models speak Chinese languages?

arXiv.org Artificial Intelligence

The release of top-performing open-weight LLMs has cemented China's role as a leading force in AI development. Do these models support languages spoken in China? Or do they speak the same languages as Western models? Comparing multilingual capabilities is important for two reasons. First, language ability provides insights into pre-training data curation, and thus into resource allocation and development priorities. Second, China has a long history of explicit language policy, varying between inclusivity of minority languages and a Mandarin-first policy. To test whether Chinese LLMs today reflect an agenda about China's languages, we test performance of Chinese and Western open-source LLMs on Asian regional and Chinese minority languages. Our experiments on Information Parity and reading comprehension show Chinese models' performance across these languages correlates strongly (r=0.93) with Western models', with the sole exception being better Mandarin. Sometimes, Chinese models cannot identify languages spoken by Chinese minorities such as Kazakh and Uyghur, even though they are good at French and German. These results provide a window into current development priorities, suggest options for future development, and indicate guidance for end users.


UK is going full Minority Report with 'murder prediction' research

Engadget

The Guardian reported that the UK's Ministry of Justice has been developing an algorithm designed to identify people who could become killers. Initially dubbed the "homicide prediction project," this tool used data from UK police forces, possibly including victims and witnesses as well as suspects. Civil liberty watchdog Statewatch discovered the program through Freedom of Information Act requests. Based on the documents acquired by the group, Statewatch claimed that the program built its prediction tool from police data on between 100,000 and 500,000 people. The categories of information shared with the Ministry of Justice also appeared to cover sensitive topics such as mental health, addiction, suicide and disability.


ChatGPT's Studio Ghibli-style images show its creative power – but raise new copyright problems

AIHub

Social media has recently been flooded with images that look like they belong in a Studio Ghibli film. Selfies, family photos and even memes have been re-imagined with the soft pastel palette characteristic of the Japanese animation company founded by Hayao Miyazaki. The update significantly improved ChatGPT's image generation capabilities, allowing users to create convincing Ghibli-style images in mere seconds. It has been enormously popular – so much so, in fact, that the system crashed due to user demand. Generative artificial intelligence (AI) systems such as ChatGPT are best understood as "style engines".


Deep Fair Learning: A Unified Framework for Fine-tuning Representations with Sufficient Networks

arXiv.org Machine Learning

Ensuring fairness in machine learning is a critical and challenging task, as biased data representations often lead to unfair predictions. To address this, we propose Deep Fair Learning, a framework that integrates nonlinear sufficient dimension reduction with deep learning to construct fair and informative representations. By introducing a novel penalty term during fine-tuning, our method enforces conditional independence between sensitive attributes and learned representations, addressing bias at its source while preserving predictive performance. Unlike prior methods, it supports diverse sensitive attributes, including continuous, discrete, binary, or multi-group types. Experiments on various types of data structure show that our approach achieves a superior balance between fairness and utility, significantly outperforming state-of-the-art baselines.


'Sound of Freedom' producer says AI tools helped nab child trafficker that eluded FBI for 10 years

FOX News

Editor's Note: This article contains discussions related to child sexual abuse and pornography. Child predators are on high alert as organizations around the globe have begun rolling out artificial intelligence (AI) tools to bring sex traffickers to justice and rescue young victims, according to "Sound of Freedom" executive producer Paul Hutchinson. Hutchinson, who has led 70 undercover rescue missions across 15 countries, told Fox News Digital on Wednesday that he has worked with "black hat" hackers to help identify child predators and bring them to justice. These guys are some of the best hackers anywhere. Some of them do highly illegal things for the right reasons, right?


A Consequentialist Critique of Binary Classification Evaluation Practices

arXiv.org Machine Learning

ML-supported decisions, such as ordering tests or determining preventive custody, often involve binary classification based on probabilistic forecasts. Evaluation frameworks for such forecasts typically consider whether to prioritize independent-decision metrics (e.g., Accuracy) or top-K metrics (e.g., Precision@K), and whether to focus on fixed thresholds or threshold-agnostic measures like AUC-ROC. We highlight that a consequentialist perspective, long advocated by decision theorists, should naturally favor evaluations that support independent decisions using a mixture of thresholds given their prevalence, such as Brier scores and log loss. However, our empirical analysis reveals a strong preference for top-K metrics or fixed thresholds in evaluations at major conferences like ICML, FAccT, and CHIL. To address this gap, we use this decision-theoretic framework to map evaluation metrics to their optimal use cases, and we provide a Python package, briertools, to promote the broader adoption of Brier scores. In doing so, we also uncover new theoretical connections, including a reconciliation between the Brier Score and Decision Curve Analysis, which clarifies and responds to a longstanding critique by Assel et al. (2017) regarding the clinical utility of proper scoring rules.
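For readers unfamiliar with the two scoring rules named in this abstract, here is a minimal plain-Python sketch of the standard Brier score and log loss for binary outcomes. It does not use or imitate the briertools package, whose API is not described here; the function names and the toy forecasts are assumptions made for illustration.

```python
import math


def brier_score(y_true, p_pred):
    """Mean squared difference between forecast probability and 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def log_loss(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


# Toy forecasts (invented): outcomes and predicted probabilities of the positive class.
y_true = [1, 0, 1, 0]
p_pred = [0.9, 0.2, 0.6, 0.4]

bs = brier_score(y_true, p_pred)
ll = log_loss(y_true, p_pred)
```

Both functions score the probabilities themselves rather than decisions at a single fixed threshold, and lower values are better for each.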


The Man Out to Prove How Dumb AI Still Is

The Atlantic - Technology

They want to build AI models that achieve "artificial general intelligence," or AGI--matching or exceeding the capabilities of the human mind. The difference between these two men is that Altman has suggested that his company, OpenAI, has practically built the technology already. Chollet, a French computer scientist and one of the industry's sharpest skeptics, has said that notion is "absolutely clown shoes." When I spoke with him earlier this year, Chollet told me that AI companies have long been "intellectually lazy" in suggesting that their machines are on the path to a kind of supreme knowledge. At this point, those claims are based largely on the programs' ability to pass specific tests (such as the LSAT, Advanced Placement Biology, and even an introductory sommelier exam).


US feds say AI-generated prompt outputs can't be copyrighted

PCWorld

If you use an AI image or text generator to make a work of "art," does it belong to you? That's a huge question hanging over the heads of anyone tempted to use AI tools for commercial products. Crucially, simply plugging prompts into an AI image generator or text generator does NOT meet this burden. Because the author (or artist, or other relevant creative term) of a work is defined as "the person who translates an idea into a fixed, tangible expression," an AI system cannot meet this burden, even though it's using input from a human to generate its output. Commenting on established case law, the report says that "…the Supreme Court has made clear that originality is required, not just time and effort."