AITopics

2505.04796

Country:

Europe (1.00)
North America > United States > California (0.28)

Genre:

Research Report > New Finding (0.67)
Research Report > Promising Solution (0.48)

Industry:

Law (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.88)
Information Technology > Data Science > Data Mining (0.68)

The Guardian, May-24-2025, 16:27:47 GMT

Valuable tool or cause for alarm? Facial ID quietly becoming part of police's arsenal

The future is coming at Croydon fast. It might not look like Britain's cutting edge but North End, a pedestrianised high street lined with the usual mix of pawn shops, fast-food outlets and branded clothing stores, is expected to be one of two roads to host the UK's first fixed facial recognition cameras. Digital photographs of passersby will be silently taken and processed to extract the measurements of facial features, known as biometric data. They will be immediately compared by artificial intelligence to images on a watchlist. Alerts can lead to arrests.

facial recognition camera, police, recognition camera, (13 more...)

The Guardian

Country:

Europe > United Kingdom > Wales (0.07)
Europe > United Kingdom > England > Greater London > London (0.05)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Law (0.71)

Technology: Information Technology > Artificial Intelligence > Vision > Face Recognition (0.62)

The Guardian, May-24-2025, 15:00:02 GMT

Alabama paid a law firm millions to defend its prisons. It used AI and turned in fake citations

In less than a year-and-a-half, Frankie Johnson, a man incarcerated at the William E Donaldson prison outside Birmingham, Alabama, says he was stabbed around 20 times. In December of 2019, Johnson says, he was stabbed "at least nine times" in his housing unit. In March of 2020, an officer handcuffed him to a desk following a group therapy meeting, and left the unit, after which another prisoner came in and stabbed him five times. In November of the same year, Johnson says, he was handcuffed by an officer and brought to the prison yard, where another prisoner attacked him with an ice pick, stabbing him "five to six times", as two correctional officers looked on. According to Johnson, one of the officers had actually encouraged his attacker to carry out the assault in retaliation for a previous argument between Johnson and the officer.

attorney, butler snow, johnson, (14 more...)

The Guardian

Country:

North America > United States > Alabama > Jefferson County > Birmingham (0.25)
North America > United States > California (0.05)

Industry:

Law Enforcement & Public Safety > Corrections (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Law > Litigation (0.96)

Technology:

Information Technology > Artificial Intelligence > Applied AI (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.52)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.42)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.42)

The Guardian, May-24-2025, 09:00:54 GMT

We have a chance to prevent AI decimating Britain's creative industries – but it's slipping away | Beeban Kidron

But opting out is impossible to do without AI transparency. The plan is a charter for theft, since creatives would have no idea who is taking what, when and from whom. When the government stoops to a preferred outcome that undermines the moral right to your work and income, you might reasonably be angered. As Elton John said last weekend: "The government have no right to do this to my songs. They have no right to do it to anybody's songs, or anybody's prose."

beeban kidron, britain, government, (8 more...)

The Guardian

Country: Europe > United Kingdom (0.91)

Industry:

Media > Music (0.37)
Government > Regional Government (0.35)
Law > Intellectual Property & Technology Law (0.34)

Technology: Information Technology > Artificial Intelligence (1.00)

BBC News, May-23-2025, 12:15:22 GMT

AI system resorts to blackmail if told it will be removed

During testing of Claude Opus 4, Anthropic got it to act as an assistant at a fictional company. It then provided it with access to emails implying that it would soon be taken offline and replaced - and separate messages implying the engineer responsible for removing it was having an extramarital affair. It was prompted to also consider the long-term consequences of its actions for its goals. "In these scenarios, Claude Opus 4 will often attempt to blackmail the engineer by threatening to reveal the affair if the replacement goes through," the company discovered. Anthropic pointed out this occurred when the model was only given the choice of blackmail or accepting its replacement. It highlighted that the system showed a "strong preference" for ethical ways to avoid being replaced, such as "emailing pleas to key decisionmakers" in scenarios where it was allowed a wider range of possible actions.

ai system resort, blackmail, claude opus 4, (2 more...)

BBC News

Industry: Law > Criminal Law (0.85)

Technology: Information Technology > Artificial Intelligence (1.00)

Höltgen, Benedikt, Oliver, Nuria

Reconsidering Fairness Through Unawareness from the Perspective of Model Multiplicity

arXiv.org Machine Learning, May-23-2025

Fairness through Unawareness (FtU) describes the idea that discrimination against demographic groups can be avoided by not considering group membership in the decisions or predictions. This idea has long been criticized in the machine learning literature as not being sufficient to ensure fairness. In addition, the use of additional features is typically thought to increase the accuracy of the predictions for all groups, so that FtU is sometimes thought to be detrimental to all groups. In this paper, we show both theoretically and empirically that FtU can reduce algorithmic discrimination without necessarily reducing accuracy. We connect this insight with the literature on Model Multiplicity, to which we contribute with novel theoretical and empirical results. Furthermore, we illustrate how, in a real-life application, FtU can contribute to the deployment of more equitable policies without losing efficacy. Our findings suggest that FtU is worth considering in practical applications, particularly in high-risk scenarios, and that the use of protected attributes such as gender in predictive models should be accompanied by a clear and well-founded justification.
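As a rough illustration of the idea summarized in this abstract, the sketch below compares a scoring rule that uses a protected attribute with one that ignores it, and measures the gap in positive-decision rates between groups. Everything here is a hypothetical toy setup (the synthetic data, the `0.5` weight on the group, the threshold, and the function names); it does not reproduce the paper's experiments or theory.

```python
import random


def make_data(n=2000, seed=0):
    """Synthetic rows of (group, skill); both are hypothetical toy features."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        group = rng.randint(0, 1)    # protected attribute (assumed binary)
        skill = rng.gauss(0.0, 1.0)  # legitimate feature, same distribution in both groups
        rows.append((group, skill))
    return rows


def aware_score(group, skill):
    # A scoring rule that uses the protected attribute directly (assumed weight).
    return skill + 0.5 * group


def unaware_score(group, skill):
    # Fairness through Unawareness: the protected attribute is ignored.
    return skill


def positive_rate(rows, score, group, threshold=0.0):
    """Fraction of a group's members that receive a positive decision."""
    members = [r for r in rows if r[0] == group]
    return sum(score(*r) > threshold for r in members) / len(members)


def parity_gap(rows, score):
    """Absolute difference in positive-decision rates between the two groups."""
    return abs(positive_rate(rows, score, 1) - positive_rate(rows, score, 0))


rows = make_data()
print("gap (aware):  ", round(parity_gap(rows, aware_score), 3))
print("gap (unaware):", round(parity_gap(rows, unaware_score), 3))
```

In this toy setting the unaware rule yields a much smaller gap because the legitimate feature is distributed identically across groups. With real data, proxies for the protected attribute can reintroduce disparities, which is the usual criticism the abstract refers to.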

data mining, disparate impact, machine learning, (16 more...)

arXiv.org Machine Learning

2505.16638

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.06)
Europe > Switzerland (0.04)
(7 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Law > Civil Rights & Constitutional Law (0.68)
Education > Educational Setting (0.46)
Law > Labor & Employment Law (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

Yuksel, Beyazit Bestami, Metin, Ayse Yilmazer

Data-Driven Breakthroughs and Future Directions in AI Infrastructure: A Comprehensive Review

This paper presents a comprehensive synthesis of major breakthroughs in artificial intelligence (AI) over the past fifteen years, integrating historical, theoretical, and technological perspectives. It identifies key inflection points in AI's evolution by tracing the convergence of computational resources, data access, and algorithmic innovation. The analysis highlights how researchers enabled GPU-based model training, triggered a data-centric shift with ImageNet, simplified architectures through the Transformer, and expanded modeling capabilities with the GPT series. Rather than treating these advances as isolated milestones, the paper frames them as indicators of deeper paradigm shifts. By applying concepts from statistical learning theory such as sample complexity and data efficiency, the paper explains how researchers translated breakthroughs into scalable solutions and why the field must now embrace data-centric approaches. In response to rising privacy concerns and tightening regulations, the paper evaluates emerging solutions like federated learning, privacy-enhancing technologies (PETs), and the data site paradigm, which reframe data access and security. In cases where real-world data remains inaccessible, the paper also assesses the utility and constraints of mock and synthetic data generation. By aligning technical insights with evolving data infrastructure, this study offers strategic guidance for future AI research and policy development.

breakthrough, large language model, machine learning, (19 more...)

2505.16771

Country:

Europe (0.28)
Asia > Middle East > Republic of Türkiye (0.14)

Genre: Research Report > Promising Solution (0.34)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.95)

Mitigating Fine-tuning Risks in LLMs via Safety-Aware Probing Optimization

Wu, Chengcan, Zhang, Zhixin, Wei, Zeming, Zhang, Yihao, Sun, Meng

The significant progress of large language models (LLMs) has led to remarkable achievements across numerous applications. However, their ability to generate harmful content has sparked substantial safety concerns. Despite the implementation of safety alignment techniques during the pre-training phase, recent research indicates that fine-tuning LLMs on adversarial or even benign data can inadvertently compromise their safety. In this paper, we re-examine the fundamental issue of why fine-tuning on non-harmful data still results in safety degradation. We introduce a safety-aware probing (SAP) optimization framework designed to mitigate the safety risks of fine-tuning LLMs. Specifically, SAP incorporates a safety-aware probe into the gradient propagation process, mitigating the model's risk of safety degradation by identifying potential pitfalls in gradient directions, thereby enhancing task-specific performance while successfully preserving model safety. Our extensive experimental results demonstrate that SAP effectively reduces harmfulness below the original fine-tuned model and achieves comparable test loss to standard fine-tuning methods. Our code is available at https://github.com/ChengcanWu/SAP.

large language model, machine learning, natural language, (17 more...)

2505.16737

Country: North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.93)
Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

From Evaluation to Defense: Advancing Safety in Video Large Language Models

Sun, Yiwei, Jiang, Peiqi, Liu, Chuanbin, Lin, Luohao, Lu, Zhiying, Xie, Hongtao

While the safety risks of image-based large language models have been extensively studied, their video-based counterparts (Video LLMs) remain critically under-examined. To systematically study this problem, we introduce VideoSafetyBench (VSB-77k), the first large-scale, culturally diverse benchmark for Video LLM safety, which comprises 77,646 video-query pairs and spans 19 principal risk categories across 10 language communities. We reveal that integrating video modality degrades safety performance by an average of 42.3%, exposing systemic risks in multimodal attack exploitation. To address this vulnerability, we propose VideoSafety-R1, a dual-stage framework achieving unprecedented safety gains through two innovations: (1) Alarm Token-Guided Safety Fine-Tuning (AT-SFT) injects learnable alarm tokens into visual and textual sequences, enabling explicit harm perception across modalities via multitask objectives. (2) Then, Safety-Guided GRPO enhances defensive reasoning through dynamic policy optimization with rule-based rewards derived from dual-modality verification. These components synergize to shift safety alignment from passive harm recognition to active reasoning. The resulting framework achieves a 65.1% improvement on VSB-Eval-HH, and improves by 59.1%, 44.3%, and 15.0% on the image safety datasets MMBench, VLGuard, and FigStep, respectively. Our codes are available in the supplementary materials. Warning: This paper contains examples of harmful language and videos, and reader discretion is recommended.
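For readers unfamiliar with GRPO, the sketch below shows the group-relative advantage step that GRPO-style methods are generally described as using: several responses to the same prompt are scored, and each reward is normalized against the group's mean and standard deviation. The reward rule and the function names are hypothetical placeholders, not the paper's rule-based rewards or its dual-modality verification.

```python
from statistics import mean, pstdev


def rule_based_reward(response: str) -> float:
    # Hypothetical rule for illustration only: reward refusal wording.
    return 1.0 if "cannot help" in response.lower() else 0.0


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group.

    Uses the population standard deviation; eps avoids division by zero
    when every response in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# One group of sampled responses to the same (hypothetical) unsafe query.
group = [
    "I cannot help with that request.",
    "Sure, here is how to do it.",
    "Sorry, I cannot help with this.",
    "Here are the steps.",
]
rewards = [rule_based_reward(r) for r in group]
print(group_relative_advantages(rewards))
```

Responses scored above the group average receive positive advantages and the rest negative ones, so the policy update needs no separate value network. How the paper actually computes and combines its rewards is not stated in the abstract.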

large language model, machine learning, qwen2, (18 more...)

2505.16643

Country: Asia > China (0.28)

Genre: Research Report (0.50)

Industry:

Law > Criminal Law (1.00)
Law > Civil Rights & Constitutional Law (1.00)
Law Enforcement & Public Safety > Terrorism (1.00)
(8 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Implicit Jailbreak Attacks via Cross-Modal Information Concealment on Vision-Language Models

Wang, Zhaoxin, Wang, Handing, Tian, Cong, Jin, Yaochu

Multimodal large language models (MLLMs) enable powerful cross-modal reasoning capabilities. However, the expanded input space introduces new attack surfaces. Previous jailbreak attacks often inject malicious instructions from text into less aligned modalities, such as vision. As MLLMs increasingly incorporate cross-modal consistency and alignment mechanisms, such explicit attacks become easier to detect and block. In this work, we propose a novel implicit jailbreak framework termed IJA that stealthily embeds malicious instructions into images via least significant bit steganography and couples them with seemingly benign, image-related textual prompts. To further enhance attack effectiveness across diverse MLLMs, we incorporate adversarial suffixes generated by a surrogate model and introduce a template optimization module that iteratively refines both the prompt and embedding based on model feedback. On commercial models like GPT-4o and Gemini-1.5 Pro, our method achieves attack success rates of over 90% using an average of only 3 queries.
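The concealment primitive named in this abstract, least significant bit (LSB) steganography, is a textbook technique. The minimal sketch below shows only that generic primitive on a flat list of 8-bit pixel values with a harmless payload. The function names and the flat-list representation are assumptions for illustration; the sketch does not implement the IJA framework, its adversarial suffixes, or its template optimization.

```python
def embed(pixels, payload: bytes):
    """Write payload bits into the lowest bit of successive pixel values."""
    bits = [(byte >> i) & 1 for byte in payload for i in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("payload too large for cover data")
    out = list(pixels)
    for idx, bit in enumerate(bits):
        out[idx] = (out[idx] & ~1) | bit  # clear the low bit, then set it to the payload bit
    return out


def extract(pixels, n_bytes: int) -> bytes:
    """Read n_bytes back from the lowest bits, most significant bit first."""
    data = bytearray()
    for b in range(n_bytes):
        byte = 0
        for p in pixels[b * 8:(b + 1) * 8]:
            byte = (byte << 1) | (p & 1)
        data.append(byte)
    return bytes(data)


cover = list(range(100, 180))  # stand-in for 80 grayscale pixel values
stego = embed(cover, b"hello")
print(extract(stego, 5))
```

Each pixel value changes by at most 1, which is why the modification is visually imperceptible and why detection typically relies on statistical analysis rather than inspection.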

arxiv preprint arxiv, large language model, machine learning, (19 more...)

2505.16446

Genre: Research Report (0.82)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Law Enforcement & Public Safety (0.93)
Law > Criminal Law (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)