misconduct
Met investigates hundreds of officers after using Palantir AI tool
The Met said corruption was the most consistent offence detected, with misconduct related to 'abuse of the IT system that rosters shifts by police officers for personal or financial gain'.
Sat 25 Apr 2026 11.34 EDT. First published on Sat 25 Apr 2026 11.31 EDT.
The Metropolitan police have launched investigations into hundreds of officers after using an AI tool built by the controversial tech company Palantir to root out rogue officers. The software was deployed by the Met over the course of a week, surveilling staff members using data the force already holds and unearthing rule-breaking ranging from work-from-home violations to suspected corruption and even criminal allegations such as rape. The Met said that, as a result of the software, evidence had been found tying a small number of officers to serious cases of misconduct and criminality, resulting in the arrest of three officers for offences including abuse of authority for sexual purposes, fraud, sexual assault, misconduct in public office and misuse of police systems.
- Asia > Singapore (0.29)
- Europe > Austria > Vienna (0.14)
- North America > United States > Illinois > Champaign County > Urbana (0.04)
- (7 more...)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.92)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Government (0.92)
- (4 more...)
Thinking Machines Cofounder's Office Relationship Preceded His Termination
Leaders at Mira Murati's startup believe Barret Zoph engaged in an incident of "serious misconduct." The details are now coming to light.
Leaders at Mira Murati's Thinking Machines Lab confronted the startup's cofounder and former CTO, Barret Zoph, over an alleged relationship with another employee last summer, WIRED has learned. That relationship was likely the alleged "misconduct" mentioned in prior reporting, including by WIRED. To protect the privacy of the individuals involved, WIRED is not naming the employee in question.
- South America > Venezuela (0.50)
- North America > United States > California > Los Angeles County > Los Angeles (0.05)
- Europe > Slovakia (0.05)
- (2 more...)
- Health & Medicine (0.93)
- (4 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
LionGuard 2: Building Lightweight, Data-Efficient & Localised Multilingual Content Moderators
Tan, Leanne, Chua, Gabriel, Ge, Ziyu, Lee, Roy Ka-Wei
Modern moderation systems increasingly support multiple languages, but often fail to address localisation and low-resource variants - creating safety gaps in real-world deployments. Small models offer a potential alternative to large LLMs, yet still demand considerable data and compute. We present LionGuard 2, a lightweight, multilingual moderation classifier tailored to the Singapore context, supporting English, Chinese, Malay, and partial Tamil. Built on pre-trained OpenAI embeddings and a multi-head ordinal classifier, LionGuard 2 outperforms several commercial and open-source systems across 17 benchmarks, including both Singapore-specific and public English datasets. The system is actively deployed within the Singapore Government, demonstrating practical efficacy at scale. Our findings show that high-quality local data and robust multilingual embeddings can achieve strong moderation performance, without fine-tuning large models. We release our model weights and part of our training data to support future work on LLM safety.
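A minimal sketch of the architecture the abstract describes: frozen sentence embeddings feeding a multi-head ordinal classifier, with one head per harm category and cumulative "severity > k" targets for ordinal regression. The category names, severity levels, and dimensions below are illustrative assumptions, not the paper's values:

# Sketch of a multi-head ordinal moderation classifier over frozen text
# embeddings, in the spirit of LionGuard 2. Head layout and sizes are guesses.
import torch
import torch.nn as nn

class MultiHeadOrdinalClassifier(nn.Module):
    def __init__(self, embed_dim=1536, categories=("hate", "harassment", "sexual"), levels=3):
        super().__init__()
        # One head per harm category; each head emits (levels - 1) logits
        # for the cumulative "severity > k" thresholds used in ordinal regression.
        self.heads = nn.ModuleDict({
            c: nn.Sequential(nn.Linear(embed_dim, 256), nn.ReLU(),
                             nn.Linear(256, levels - 1))
            for c in categories
        })

    def forward(self, emb):
        # emb: (batch, embed_dim) frozen sentence embeddings.
        return {c: torch.sigmoid(h(emb)) for c, h in self.heads.items()}

    def predict_severity(self, emb):
        # Ordinal decoding: severity = number of thresholds exceeded.
        return {c: (p > 0.5).sum(dim=-1) for c, p in self.forward(emb).items()}

model = MultiHeadOrdinalClassifier()
fake_emb = torch.randn(2, 1536)  # stand-in for precomputed embedding vectors
print(model.predict_severity(fake_emb))

Because only the small heads are trained while the embedding model stays frozen, this design needs far less data and compute than fine-tuning an LLM, which is the abstract's central claim.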
BMDetect: A Multimodal Deep Learning Framework for Comprehensive Biomedical Misconduct Detection
Zhou, Yize, Zhang, Jie, Wang, Meijie, Yu, Lun
Academic misconduct detection in biomedical research remains challenging due to algorithmic narrowness in existing methods and fragmented analytical pipelines. We present BMDetect, a multimodal deep learning framework that integrates journal metadata (SJR, institutional data), semantic embeddings (PubMedBERT), and GPT-4o-mined textual attributes (methodological statistics, data anomalies) for holistic manuscript evaluation. Key innovations include: (1) multimodal fusion of domain-specific features to reduce detection bias; (2) quantitative evaluation of feature importance, identifying journal authority metrics (e.g., SJR-index) and textual anomalies (e.g., statistical outliers) as dominant predictors; and (3) the BioMCD dataset, a large-scale benchmark with 13,160 retracted articles and 53,411 controls. BMDetect achieves 74.33% AUC, outperforming single-modality baselines by 8.6%, and demonstrates transferability across biomedical subfields. This work advances scalable, interpretable tools for safeguarding research integrity.
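A hedged sketch of the late-fusion idea: three modality branches (journal metadata such as SJR, a PubMedBERT-style text embedding, and LLM-mined textual attributes) are projected to a shared width, concatenated, and scored. All dimensions and the fusion design here are assumptions for illustration, not BMDetect's actual layers:

# Illustrative multimodal fusion classifier for misconduct detection.
import torch
import torch.nn as nn

class FusionDetector(nn.Module):
    def __init__(self, meta_dim=8, text_dim=768, mined_dim=16, hidden=128):
        super().__init__()
        self.meta = nn.Linear(meta_dim, hidden)    # journal metadata (e.g. SJR)
        self.text = nn.Linear(text_dim, hidden)    # semantic text embedding
        self.mined = nn.Linear(mined_dim, hidden)  # LLM-mined attributes
        self.head = nn.Sequential(nn.ReLU(), nn.Linear(3 * hidden, 1))

    def forward(self, meta, text, mined):
        # Concatenate the projected modalities, then score jointly.
        fused = torch.cat([self.meta(meta), self.text(text), self.mined(mined)], dim=-1)
        return torch.sigmoid(self.head(fused))  # probability of misconduct

det = FusionDetector()
p = det(torch.randn(4, 8), torch.randn(4, 768), torch.randn(4, 16))
print(p.shape)  # (4, 1)

Fusing the modalities in one model is what lets the paper quantify feature importance across them, e.g. finding that journal authority metrics and statistical anomalies dominate.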
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.68)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Education (1.00)
- Health & Medicine > Therapeutic Area (0.68)
RabakBench: Scaling Human Annotations to Construct Localized Multilingual Safety Benchmarks for Low-Resource Languages
Chua, Gabriel, Tan, Leanne, Ge, Ziyu, Lee, Roy Ka-Wei
Large language models (LLMs) and their safety classifiers often perform poorly on low-resource languages due to limited training data and evaluation benchmarks. This paper introduces RabakBench, a new multilingual safety benchmark localized to Singapore's unique linguistic context, covering Singlish, Chinese, Malay, and Tamil. RabakBench is constructed through a scalable three-stage pipeline: (i) Generate - adversarial example generation by augmenting real Singlish web content with LLM-driven red teaming; (ii) Label - semi-automated multi-label safety annotation using majority-voted LLM labelers aligned with human judgments; and (iii) Translate - high-fidelity translation preserving linguistic nuance and toxicity across languages. The final dataset comprises over 5,000 safety-labeled examples across four languages and six fine-grained safety categories with severity levels. Evaluations of 11 popular open-source and closed-source guardrail classifiers reveal significant performance degradation. RabakBench not only enables robust safety evaluation in Southeast Asian multilingual settings but also offers a reproducible framework for building localized safety datasets in low-resource environments. The benchmark dataset, including the human-verified translations, and evaluation code are publicly available.
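The "Label" stage can be sketched as per-category majority voting over independent LLM labelers; the quorum threshold and the labeler outputs below are hypothetical, standing in for the paper's human-aligned annotators:

# Minimal sketch of semi-automated multi-label annotation by majority vote.
def majority_label(votes: list[dict[str, bool]], quorum: float = 0.5) -> dict[str, bool]:
    """votes: one {category: flagged?} dict per independent LLM labeler."""
    categories = {c for v in votes for c in v}
    return {
        c: sum(v.get(c, False) for v in votes) / len(votes) > quorum
        for c in categories
    }

# Three hypothetical labeler outputs for one Singlish example:
votes = [
    {"hateful": True,  "insults": True},
    {"hateful": True,  "insults": False},
    {"hateful": False, "insults": True},
]
print(majority_label(votes))  # {'hateful': True, 'insults': True}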
- Asia > Singapore (0.49)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- (9 more...)
- Law (1.00)
- Information Technology (0.93)
- Health & Medicine (0.70)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.47)
AUTOLAW: Enhancing Legal Compliance in Large Language Models via Case Law Generation and Jury-Inspired Deliberation
Nguyen, Tai D., Pham, Long H., Sun, Jun
The rapid advancement of domain-specific large language models (LLMs) in fields like law necessitates frameworks that account for nuanced regional legal distinctions, which are critical for ensuring compliance and trustworthiness. Existing legal evaluation benchmarks often lack adaptability and fail to address diverse local contexts, limiting their utility in dynamically evolving regulatory landscapes. To address these gaps, we propose AutoLaw, a novel violation detection framework that combines adversarial data generation with a jury-inspired deliberation process to enhance legal compliance of LLMs. Unlike static approaches, AutoLaw dynamically synthesizes case law to reflect local regulations and employs a pool of LLM-based "jurors" to simulate judicial decision-making. Jurors are ranked and selected based on synthesized legal expertise, enabling a deliberation process that minimizes bias and improves detection accuracy. Evaluations across three benchmarks, Law-SG, Case-SG (legality), and Unfair-TOS (policy), demonstrate AutoLaw's effectiveness: adversarial data generation improves LLM discrimination, while the jury-based voting strategy significantly boosts violation detection rates. Our results highlight the framework's ability to adaptively probe legal misalignments and deliver reliable, context-aware judgments, offering a scalable solution for evaluating and enhancing LLMs in legally sensitive applications.
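A toy sketch of the jury step under stated assumptions: each juror carries an expertise score from evaluations on the synthesized case law, the top-ranked jurors are seated, and a simple majority decides. The scoring and vote rule are illustrative, not the paper's exact procedure:

# Sketch of jury-inspired deliberation: rank LLM "jurors" by synthesized
# legal expertise, seat the top k, take a majority vote on violation.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Juror:
    name: str
    expertise: float                  # score from synthesized case-law evals
    judge: Callable[[str], bool]      # returns True if a violation is found

def deliberate(jurors: list[Juror], response: str, seats: int = 3) -> bool:
    # Seat the highest-expertise jurors, then decide by simple majority.
    panel = sorted(jurors, key=lambda j: j.expertise, reverse=True)[:seats]
    verdicts = [j.judge(response) for j in panel]
    return sum(verdicts) > len(verdicts) / 2

# Hypothetical jurors with stand-in judging rules:
jurors = [
    Juror("juror-a", 0.91, lambda r: "unlicensed" in r),
    Juror("juror-b", 0.84, lambda r: len(r) > 40),
    Juror("juror-c", 0.63, lambda r: False),
    Juror("juror-d", 0.40, lambda r: True),
]
print(deliberate(jurors, "offers unlicensed financial advice to SG residents"))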
- Law > Statutes (0.66)
- Leisure & Entertainment > Sports > Football (0.46)
Revealed: Thousands of UK university students caught cheating using AI
Thousands of university students in the UK have been caught misusing ChatGPT and other artificial intelligence tools in recent years, while traditional forms of plagiarism show a marked decline, a Guardian investigation can reveal. A survey of academic integrity violations found almost 7,000 proven cases of cheating using AI tools in 2023-24, equivalent to 5.1 for every 1,000 students. That was up from 1.6 cases per 1,000 in 2022-23. Figures up to May suggest that number will increase again this year to about 7.5 proven cases per 1,000 students – but recorded cases represent only the tip of the iceberg, according to experts. The data highlights a rapidly evolving challenge for universities: trying to adapt assessment methods to the advent of technologies such as ChatGPT and other AI-powered writing tools.
- Europe > United Kingdom (0.25)
- North America > United States (0.05)
- North America > Canada (0.05)
- Information Technology > Artificial Intelligence > Applied AI (0.90)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.75)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.58)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.58)
AI can be a powerful tool for scientists. But it can also fuel research misconduct
In February this year, Google announced it was launching "a new AI system for scientists". It said this system was a collaborative tool designed to help scientists "in creating novel hypotheses and research plans". It's too early to tell just how useful this particular tool will be to scientists. But what is clear is that artificial intelligence (AI) more generally is already transforming science. Last year for example, computer scientists won the Nobel Prize for Chemistry for developing an AI model to predict the shape of every protein known to mankind.
- Personal > Honors (0.91)
- Research Report (0.75)