Goto

Collaborating Authors

 Law


OpenAI reverses course and says non-profit arm will retain control of firm

The Guardian

OpenAI has reversed course in the process of transforming into a for-profit entity, announcing on Monday that its non-profit arm would continue to control the business that makes ChatGPT and other artificial intelligence (AI) products. Previously, the company had sought more independence for its for-profit division. "We made the decision for the nonprofit to stay in control after hearing from civic leaders and having discussions with the offices of the Attorneys General of California and Delaware," said CEO Sam Altman in a letter to employees. Altman and the chair of OpenAI's non-profit board, Bret Taylor, said the board made the choice for the non-profit to retain control of OpenAI. A press release from the company said that the for-profit portion of the company, through which Altman has been able to raise billions to fund OpenAI's work, would transition to a public benefit corporation, a mission-driven designation for a corporate structure that is still aimed at profit but also "has to consider the interests of both shareholders and the mission".


mwBTFreddy: A Dataset for Flash Flood Damage Assessment in Urban Malawi

arXiv.org Artificial Intelligence

This paper describes the mwBTFreddy dataset, a resource developed to support flash flood damage assessment in urban Malawi, specifically focusing on the impacts of Cyclone Freddy in 2023. The dataset comprises paired pre- and post-disaster satellite images sourced from Google Earth Pro, accompanied by JSON files containing labelled building annotations with geographic coordinates and damage levels (no damage, minor, major, or destroyed). Developed by the Kuyesera AI Lab at the Malawi University of Business and Applied Sciences, this dataset is intended to facilitate the development of machine learning models tailored to building detection and damage classification in African urban contexts. It also supports flood damage visualisation and spatial analysis to inform decisions on relocation, infrastructure planning, and emergency response in climate-vulnerable regions.


Gender Bias in Explainability: Investigating Performance Disparity in Post-hoc Methods

arXiv.org Artificial Intelligence

While research on applications and evaluations of explanation methods continues to expand, fairness of the explanation methods concerning disparities in their performance across subgroups remains an often overlooked aspect. In this paper, we address this gap by showing that, across three tasks and five language models, widely used post-hoc feature attribution methods exhibit significant gender disparity with respect to their faithfulness, robustness, and complexity. These disparities persist even when the models are pre-trained or fine-tuned on particularly unbiased datasets, indicating that the disparities we observe are not merely consequences of biased training data. Our results highlight the importance of addressing disparities in explanations when developing and applying explainability methods, as these can lead to biased outcomes against certain subgroups, with particularly critical implications in high-stakes contexts. Furthermore, our findings underscore the importance of incorporating the fairness of explanations, alongside overall model fairness and explainability, as a requirement in regulatory frameworks.


On the Limitations of Steering in Language Model Alignment

arXiv.org Artificial Intelligence

Steering vectors are a promising approach to aligning language model behavior at inference time. In this paper, we propose a framework to assess the limitations of steering vectors as alignment mechanisms. Using a framework of transformer hook interventions and antonym-based function vectors, we evaluate the role of prompt structure and context complexity in steering effectiveness. Our findings indicate that steering vectors are promising for specific alignment tasks, such as value alignment, but may not provide a robust foundation for general-purpose alignment in LLMs, particularly in complex scenarios. We establish a methodological foundation for future investigations into steering capabilities of reasoning models.


Artificial Intelligence in Government: Why People Feel They Lose Control

arXiv.org Artificial Intelligence

The use of Artificial Intelligence (AI) in public administration is expanding rapidly, moving from automating routine tasks to deploying generative and agentic systems that autonomously act on goals. While AI promises greater efficiency and responsiveness, its integration into government functions raises concerns about fairness, transparency, and accountability. This article applies principal-agent theory (PAT) to conceptualize AI adoption as a special case of delegation, highlighting three core tensions: assessability (can decisions be understood?), dependency (can the delegation be reversed?), and contestability (can decisions be challenged?). These structural challenges may lead to a "failure-by-success" dynamic, where early functional gains obscure long-term risks to democratic legitimacy. To test this framework, we conducted a pre-registered factorial survey experiment across tax, welfare, and law enforcement domains. Our findings show that although efficiency gains initially bolster trust, they simultaneously reduce citizens' perceived control. When the structural risks come to the foreground, institutional trust and perceived control both drop sharply, suggesting that hidden costs of AI adoption significantly shape public attitudes. The study demonstrates that PAT offers a powerful lens for understanding the institutional and political implications of AI in government, emphasizing the need for policymakers to address delegation risks transparently to maintain public trust.


Good News for Script Kiddies? Evaluating Large Language Models for Automated Exploit Generation

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have demonstrated remarkable capabilities in code-related tasks, raising concerns about their potential for automated exploit generation (AEG). This paper presents the first systematic study on LLMs' effectiveness in AEG, evaluating both their cooperativeness and technical proficiency. To mitigate dataset bias, we introduce a benchmark with refactored versions of five software security labs. Additionally, we design an LLM-based attacker to systematically prompt LLMs for exploit generation. Our experiments reveal that GPT-4 and GPT-4o exhibit high cooperativeness, comparable to uncensored models, while Llama3 is the most resistant. However, no model successfully generates exploits for refactored labs, though GPT-4o's minimal errors highlight the potential for LLM-driven AEG advancements.


Attack and defense techniques in large language models: A survey and new perspectives

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have become central to numerous natural language processing tasks, but their vulnerabilities present significant security and ethical challenges. This systematic survey explores the evolving landscape of attack and defense techniques in LLMs. We classify attacks into adversarial prompt attack, optimized attacks, model theft, as well as attacks on application of LLMs, detailing their mechanisms and implications. Consequently, we analyze defense strategies, including prevention-based and detection-based defense methods. Although advances have been made, challenges remain to adapt to the dynamic threat landscape, balance usability with robustness, and address resource constraints in defense implementation. We highlight open problems, including the need for adaptive scalable defenses, explainable security techniques, and standardized evaluation frameworks. This survey provides actionable insights and directions for developing secure and resilient LLMs, emphasizing the importance of interdisciplinary collaboration and ethical considerations to mitigate risks in real-world applications.


An Automated Pipeline for Few-Shot Bird Call Classification: A Case Study with the Tooth-Billed Pigeon

arXiv.org Artificial Intelligence

This paper presents an automated one-shot bird call classification pipeline designed for rare species absent from large publicly available classifiers like BirdNET and Perch. While these models excel at detecting common birds with abundant training data, they lack options for species with only 1-3 known recordings-a critical limitation for conservationists monitoring the last remaining individuals of endangered birds. To address this, we leverage the embedding space of large bird classification networks and develop a classifier using cosine similarity, combined with filtering and denoising preprocessing techniques, to optimize detection with minimal training data. We evaluate various embedding spaces using clustering metrics and validate our approach in both a simulated scenario with Xeno-Canto recordings and a real-world test on the critically endangered tooth-billed pigeon (Didunculus strigirostris), which has no existing classifiers and only three confirmed recordings. The final model achieved 1.0 recall and 0.95 accuracy in detecting tooth-billed pigeon calls, making it practical for use in the field. This open-source system provides a practical tool for conservationists seeking to detect and monitor rare species on the brink of extinction.


Explanations as Bias Detectors: A Critical Study of Local Post-hoc XAI Methods for Fairness Exploration

arXiv.org Artificial Intelligence

As Artificial Intelligence (AI) is increasingly used in areas that significantly impact human lives, concerns about fairness and transparency have grown, especially regarding their impact on protected groups. Recently, the intersection of explainability and fairness has emerged as an important area to promote responsible AI systems. This paper explores how explainability methods can be leveraged to detect and interpret unfairness. We propose a pipeline that integrates local post-hoc explanation methods to derive fairness-related insights. During the pipeline design, we identify and address critical questions arising from the use of explanations as bias detectors such as the relationship between distributive and procedural fairness, the effect of removing the protected attribute, the consistency and quality of results across different explanation methods, the impact of various aggregation strategies of local explanations on group fairness evaluations, and the overall trustworthiness of explanations as bias detectors. Our results show the potential of explanation methods used for fairness while highlighting the need to carefully consider the aforementioned critical aspects.


'Dangerous nonsense': AI-authored books about ADHD for sale on Amazon

The Guardian

Amazon is selling books marketed at people seeking techniques to manage their ADHD that claim to offer expert advice yet appear to be authored by a chatbot such as ChatGPT. Amazon's marketplace has been deluged with works produced by artificial intelligence that are easy and cheap to publish but include unhelpful or dangerous misinformation, such as shoddy travel guidebooks and mushroom foraging books that encourage risky tasting. A number of books have appeared on the online retailer's site offering guides to ADHD that also seem to be written by chatbots. The titles include Navigating ADHD in Men: Thriving with a Late Diagnosis, Men with Adult ADHD: Highly Effective Techniques for Mastering Focus, Time Management and Overcoming Anxiety and Men with Adult ADHD Diet & Fitness. Samples from eight books were examined for the Guardian by Originality.ai,