AITopics | Law

Collaborating Authors

Law

Interpretable Deep Regression Models with Interval-Censored Failure Time Data

Yuan, Changhui, Zhao, Shishun, Li, Shuwei, Song, Xinyuan, Chen, Zhao

arXiv.org Machine LearningMar-25-2025

Deep neural networks (DNNs) have become powerful tools for modeling complex data structures through sequentially integrating simple functions in each hidden layer. In survival analysis, recent advances of DNNs primarily focus on enhancing model capabilities, especially in exploring nonlinear covariate effects under right censoring. However, deep learning methods for interval-censored data, where the unobservable failure time is only known to lie in an interval, remain underexplored and limited to specific data type or model. This work proposes a general regression framework for interval-censored data with a broad class of partially linear transformation models, where key covariate effects are modeled parametrically while nonlinear effects of nuisance multi-modal covariates are approximated via DNNs, balancing interpretability and flexibility. We employ sieve maximum likelihood estimation by leveraging monotone splines to approximate the cumulative baseline hazard function. To ensure reliable and tractable estimation, we develop an EM algorithm incorporating stochastic gradient descent. We establish the asymptotic properties of parameter estimators and show that the DNN estimator achieves minimax-optimal convergence. Extensive simulations demonstrate superior estimation and prediction accuracy over state-of-the-art methods. Applying our method to the Alzheimer's Disease Neuroimaging Initiative dataset yields novel insights and improved predictive performance compared to traditional approaches.

artificial intelligence, interpretable deep regression model, machine learning, (3 more...)

arXiv.org Machine Learning

2503.19763

Genre: Research Report (0.69)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.53)

Add feedback

Video games can't escape their role in the radicalisation of young men Keith Stuart

The GuardianMar-24-2025, 06:48:00 GMT

There is a lot of attention on young men and toxic masculinity at the moment. The devastating Netflix drama Adolescence, about a 13-year-old boy accused of murdering a girl after being radicalised by the online manosphere, has drawn attention to the problem through the sheer force of its brilliant writing and a blistering lead performance from teenager Owen Cooper. Recently, former England football manager Gareth Southgate gave a speech about the state of boyhood in the UK, specifically about how young men, lacking moral mentors, are turning to gambling and video gaming, thereby disconnecting from society and immersing themselves in predominantly male online communities where misogyny and racism are often rife. There has been some kickback in the gaming press to the idea that games have provided a less-than-ideal environment for boys, but even those of us who have played and enjoyed games all our lives need to face up to the fact that gaming forums, message boards, streaming platforms and social media groups are awash with disturbing hate speech and violent rhetoric. Honestly, we have known this for a while.

artificial intelligence, radicalisation, social media, (7 more...)

The Guardian

Country: Europe > United Kingdom > England (0.25)

Industry:

Media (1.00)
Leisure & Entertainment > Games > Computer Games (1.00)
Law Enforcement & Public Safety (1.00)
(2 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (1.00)

Add feedback

Evidencing Unauthorized Training Data from AI Generated Content using Information Isotopes

Tao, Qi, Jinhua, Yin, Dongqi, Cai, Yueqi, Xie, Huili, Wang, Zhiyang, Hu, Peiru, Yang, Guoshun, Nan, Zhili, Zhou, Shangguang, Wang, Lingjuan, Lyu, Yongfeng, Huang, Nicholas, Lane

arXiv.org Artificial IntelligenceMar-24-2025

In light of scaling laws, many AI institutions are intensifying efforts to construct advanced AIs on extensive collections of high-quality human data. However, in a rush to stay competitive, some institutions may inadvertently or even deliberately include unauthorized data (like privacy- or intellectual property-sensitive content) for AI training, which infringes on the rights of data owners. Compounding this issue, these advanced AI services are typically built on opaque cloud platforms, which restricts access to internal information during AI training and inference, leaving only the generated outputs available for forensics. Thus, despite the introduction of legal frameworks by various countries to safeguard data rights, uncovering evidence of data misuse in modern opaque AI applications remains a significant challenge. In this paper, inspired by the ability of isotopes to trace elements within chemical reactions, we introduce the concept of information isotopes and elucidate their properties in tracing training data within opaque AI systems. Furthermore, we propose an information isotope tracing method designed to identify and provide evidence of unauthorized data usage by detecting the presence of target information isotopes in AI generations. We conduct experiments on ten AI models (including GPT-4o, Claude-3.5, and DeepSeek) and four benchmark datasets in critical domains (medical data, copyrighted books, and news). Results show that our method can distinguish training datasets from non-training datasets with 99\% accuracy and significant evidence (p-value$<0.001$) by examining a data entry equivalent in length to a research paper. The findings show the potential of our work as an inclusive tool for empowering individuals, including those without expertise in AI, to safeguard their data rights in the rapidly evolving era of AI advancements and applications.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2503.208

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Machine-assisted writing evaluation: Exploring pre-trained language models in analyzing argumentative moves

Qin, Wenjuan, Wang, Weiran, Yang, Yuming, Gui, Tao

arXiv.org Artificial IntelligenceMar-24-2025

The study investigates the efficacy of pre-trained language models (PLMs) in analyzing argumentative moves in a longitudinal learner corpus. Prior studies on argumentative moves often rely on qualitative analysis and manual coding, limiting their efficiency and generalizability. The study aims to: 1) to assess the reliability of PLMs in analyzing argumentative moves; 2) to utilize PLM-generated annotations to illustrate developmental patterns and predict writing quality. A longitudinal corpus of 1643 argumentative texts from 235 English learners in China is collected and annotated into six move types: claim, data, counter-claim, counter-data, rebuttal, and non-argument. The corpus is divided into training, validation, and application sets annotated by human experts and PLMs. We use BERT as one of the implementations of PLMs. The results indicate a robust reliability of PLMs in analyzing argumentative moves, with an overall F1 score of 0.743, surpassing existing models in the field. Additionally, PLM-labeled argumentative moves effectively capture developmental patterns and predict writing quality. Over time, students exhibit an increase in the use of data and counter-claims and a decrease in non-argument moves. While low-quality texts are characterized by a predominant use of claims and data supporting only oneside position, mid- and high-quality texts demonstrate an integrative perspective with a higher ratio of counter-claims, counter-data, and rebuttals. This study underscores the transformative potential of integrating artificial intelligence into language education, enhancing the efficiency and accuracy of evaluating students' writing. The successful application of PLMs can catalyze the development of educational technology, promoting a more data-driven and personalized learning environment that supports diverse educational needs.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2503.19279

Country:

North America > United States (0.28)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Oceania > Palau (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Law (1.00)
Education > Curriculum > Subject-Specific Education (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.66)
Education > Educational Setting > K-12 Education > Secondary School (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.68)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Add feedback

Evaluating Bias in LLMs for Job-Resume Matching: Gender, Race, and Education

Iso, Hayate, Pezeshkpour, Pouya, Bhutani, Nikita, Hruschka, Estevam

arXiv.org Artificial IntelligenceMar-24-2025

Large Language Models (LLMs) offer the potential to automate hiring by matching job descriptions with candidate resumes, streamlining recruitment processes, and reducing operational costs. However, biases inherent in these models may lead to unfair hiring practices, reinforcing societal prejudices and undermining workplace diversity. This study examines the performance and fairness of LLMs in job-resume matching tasks within the English language and U.S. context. It evaluates how factors such as gender, race, and educational background influence model decisions, providing critical insights into the fairness and reliability of LLMs in HR applications. Our findings indicate that while recent models have reduced biases related to explicit attributes like gender and race, implicit biases concerning educational background remain significant. These results highlight the need for ongoing evaluation and the development of advanced bias mitigation strategies to ensure equitable hiring practices when using LLMs in industry settings.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2503.19182

Country:

Oceania > Palau (0.04)
North America > United States > North Carolina (0.04)
North America > United States > Michigan (0.04)
(13 more...)

Genre: Research Report > New Finding (1.00)

Industry: Law > Labor & Employment Law (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)

Add feedback

Statistically Testing Training Data for Unwanted Error Patterns using Rule-Oriented Regression

Rass, Stefan, Dallinger, Martin

arXiv.org Artificial IntelligenceMar-24-2025

Artificial intelligence models trained from data can only be as good as the underlying data is. Biases in training data propagating through to the output of a machine learning model are a well-documented and well-understood phenomenon, but the machinery to prevent these undesired effects is much less developed. Efforts to ensure data is clean during collection, such as using bias-aware sampling, are most effective when the entity controlling data collection also trains the AI. In cases where the data is already available, how do we find out if the data was already manipulated, i.e., ``poisoned'', so that an undesired behavior would be trained into a machine learning model? This is a challenge fundamentally different to (just) improving approximation accuracy or efficiency, and we provide a method to test training data for flaws, to establish a trustworthy ground-truth for a subsequent training of machine learning models (of any kind). Unlike the well-studied problem of approximating data using fuzzy rules that are generated from the data, our method hinges on a prior definition of rules to happen before seeing the data to be tested. Therefore, the proposed method can also discover hidden error patterns, which may also have substantial influence. Our approach extends the abilities of conventional statistical testing by letting the ``test-condition'' be any Boolean condition to describe a pattern in the data, whose presence we wish to determine. The method puts fuzzy inference into a regression model, to get the best of the two: explainability from fuzzy logic with statistical properties and diagnostics from the regression, and finally also being applicable to ``small data'', hence not requiring large datasets as deep learning methods do. We provide an open source implementation for demonstration and experiments.

artificial intelligence, expert system, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2503.18497

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > Austria > Upper Austria > Linz (0.04)
(3 more...)

Genre: Research Report > Experimental Study (0.68)

Industry: Law (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

J&H: Evaluating the Robustness of Large Language Models Under Knowledge-Injection Attacks in Legal Domain

Hu, Yiran, Liu, Huanghai, Chen, Qingjing, Zheng, Ning, Wang, Chong, Liu, Yun, Clarke, Charles L. A., Shen, Weixing

arXiv.org Artificial IntelligenceMar-24-2025

As the scale and capabilities of Large Language Models (LLMs) increase, their applications in knowledge-intensive fields such as legal domain have garnered widespread attention. However, it remains doubtful whether these LLMs make judgments based on domain knowledge for reasoning. If LLMs base their judgments solely on specific words or patterns, rather than on the underlying logic of the language, the ''LLM-as-judges'' paradigm poses substantial risks in the real-world applications. To address this question, we propose a method of legal knowledge injection attacks for robustness testing, thereby inferring whether LLMs have learned legal knowledge and reasoning logic. In this paper, we propose J&H: an evaluation framework for detecting the robustness of LLMs under knowledge injection attacks in the legal domain. The aim of the framework is to explore whether LLMs perform deductive reasoning when accomplishing legal tasks. To further this aim, we have attacked each part of the reasoning logic underlying these tasks (major premise, minor premise, and conclusion generation). We have collected mistakes that legal experts might make in judicial decisions in the real world, such as typos, legal synonyms, inaccurate external legal statutes retrieval. However, in real legal practice, legal experts tend to overlook these mistakes and make judgments based on logic. However, when faced with these errors, LLMs are likely to be misled by typographical errors and may not utilize logic in their judgments. We conducted knowledge injection attacks on existing general and domain-specific LLMs. Current LLMs are not robust against the attacks employed in our experiments. In addition we propose and compare several methods to enhance the knowledge robustness of LLMs.

crime, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2503.1836

Country: Asia > China (0.04)

Genre: Research Report > New Finding (0.67)

Industry:

Law > Criminal Law (0.47)
Education > Educational Setting > Higher Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

Agent-based Modeling meets the Capability Approach for Human Development: Simulating Homelessness Policy-making

Aguilera, Alba, Osman, Nardine, Curto, Georgina

arXiv.org Artificial IntelligenceMar-24-2025

The global rise in homelessness calls for urgent and alternative policy solutions. Non-profits and governmental organizations alert about the many challenges faced by people experiencing homelessness (PEH), which include not only the lack of shelter but also the lack of opportunities for personal development. In this context, the capability approach (CA), which underpins the United Nations Sustainable Development Goals (SDGs), provides a comprehensive framework to assess inequity in terms of real opportunities. This paper explores how the CA can be combined with agent-based modelling and reinforcement learning. The goals are: (1) implementing the CA as a Markov Decision Process (MDP), (2) building on such MDP to develop a rich decision-making model that accounts for more complex motivators of behaviour, such as values and needs, and (3) developing an agent-based simulation framework that allows to assess alternative policies aiming to expand or restore people's capabilities. The framework is developed in a real case study of health inequity and homelessness, working in collaboration with stakeholders, non-profits and domain experts. The ultimate goal of the project is to develop a novel agent-based simulation framework, rooted in the CA, which can be replicated in a diversity of social contexts to assess policies in a non-invasive way.

agent, artificial intelligence, choice factor, (17 more...)

arXiv.org Artificial Intelligence

2503.18389

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > California > San Francisco County > San Francisco (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
(13 more...)

Genre: Research Report (0.82)

Industry:

Government (1.00)
Energy (1.00)
Health & Medicine > Consumer Health (0.93)
Law > Statutes (0.68)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Add feedback

PM4Bench: A Parallel Multilingual Multi-Modal Multi-task Benchmark for Large Vision Language Model

Gao, Junyuan, Song, Jiahe, Wu, Jiang, Zhu, Runchuan, Shen, Guanlin, Wang, Shasha, Wei, Xingjian, Yang, Haote, Zhang, Songyang, Li, Weijia, Wang, Bin, Lin, Dahua, Wu, Lijun, He, Conghui

arXiv.org Artificial IntelligenceMar-24-2025

Existing multilingual benchmarks for Large Vision Language Models (LVLMs) suffer from limitations including language-specific content biases, disjointed multimodal input formats, and a lack of safety evaluation. To address these gaps, we propose PM4Bench, the first Parallel Multilingual Multi-Modal Multi-task Benchmark for LVLMs. PM4Bench features a parallel corpus design across 10 languages, enabling fair and accurate cross-lingual comparisons. It includes the vision setting where text and queries are embedded in images, requiring LVLMs to simultaneously "see", "read", and "think", aligning with real-world applications. Additionally, PM\textsuperscript{4}Bench incorporates safety evaluations, addressing critical oversight in existing multilingual benchmarks. Using PM4Bench, we evaluate 11 mainstream LVLMs, revealing significant cross-linguistic performance disparities, particularly in vision settings, and identifying OCR capability as a key determinant of these imbalances. We will release PM4Bench at https://github.com/opendatalab/PM4Bench .

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2503.18484

Country:

Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report > New Finding (0.93)

Industry:

Law (0.93)
Health & Medicine (0.92)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Manipulation and the AI Act: Large Language Model Chatbots and the Danger of Mirrors

Krook, Joshua

arXiv.org Artificial IntelligenceMar-24-2025

Large Language Model chatbots are increasingly taking the form and visage of human beings, adapting human faces, names, voices, personalities, and quirks, including those of celebrities and well-known political figures. Personifying AI chatbots could foreseeably increase their trust with users. However, it could also make them more capable of manipulation, by creating the illusion of a close and intimate relationship with an artificial entity. The European Commission has finalized the AI Act, with the EU Parliament making amendments banning manipulative and deceptive AI systems that cause significant harm to users. Although the AI Act covers harms that accumulate over time, it is unlikely to prevent harms associated with prolonged discussions with AI chatbots. Specifically, a chatbot could reinforce a person's negative emotional state over weeks, months, or years through negative feedback loops, prolonged conversations, or harmful recommendations, contributing to a user's deteriorating mental health.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2503.18387

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England (0.04)
Europe > Serbia > Central Serbia > Belgrade (0.04)
(13 more...)

Genre: Research Report (1.00)

Industry:

Media (1.00)
Law > Statutes (1.00)
Information Technology > Security & Privacy (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback