Goto

Collaborating Authors

 Law


They wanted to save us from a dark AI future. Then six people were killed

The Guardian

Years before she became the peculiar central thread linking a double homicide in Pennsylvania, the fatal shooting of a federal agent in Vermont and the murder of an elderly landlord in California, a computer programmer bought a sailboat. The programmer was known to friends, foes and followers as Ziz. She had come to the San Francisco Bay Area in 2016 as part of an influx of young people arriving to study the dangers that artificial intelligence could pose to humanity. In one of the most expensive regions of the United States, however, it is difficult to save the world when you can't make rent. So she bought a boat for 600 and moored it next to a friend's vessel in a marina. For five years, she used it as an occasional, cramped bunk. In her waking hours, she worked on a blog of provocative and increasingly extreme ideas about confrontation and retaliation. At night, she fell asleep as the boat rocked back and forth, drifting with the flotsam of greater Silicon Valley. Then, on the night of 19 August 2022, her sister and a friend reported that they saw her fall overboard. The Coast Guard and local authorities scrambled boats and aircraft. After a nearly 30-hour search, neither Ziz nor her body could be found. A newspaper in Alaska, where she was born, published a short obituary referring to her by her birth name: "Jack Amadeus LaSota left our lives but not our hearts on Aug 19 after a boating accident. Loving adventure, friends and family, music, blueberries, biking, computer games and animals, you are missed." Ziz's ideas did not die in the waters of the California coast. She had faked her drowning and gone underground, before being arrested last month in western Maryland and charged with trespassing and illegal transportation of a firearm. The targets of Ziz's ire, who include some of Silicon Valley's most prominent intellectuals, have taken security precautions. "Ziz is not stupid," someone familiar with her, who asked to remain anonymous, told me. "This is a very smart person – both smart and crazy." Ziz's writing had polarized members of a niche but influential movement of AI theorists and tech bloggers who call themselves the "rationalists". The movement is less about specific ideas than it is about an ethos – applying rigorous, mathematically informed thinking to AI, philosophy, psychology and the big questions of our time. Rationalists are odd, though often charming, people. They tend to be fantasy and sci-fi geeks, use lots of jargon and think intensely about things other people barely think about at all.


Improving Neutral Point of View Text Generation through Parameter-Efficient Reinforcement Learning and a Small-Scale High-Quality Dataset

arXiv.org Artificial Intelligence

This paper describes the construction of a dataset and the evaluation of training methods to improve generative large language models' (LLMs) ability to answer queries on sensitive topics with a Neutral Point of View (NPOV), i.e., to provide significantly more informative, diverse and impartial answers. The dataset, the SHQ-NPOV dataset, comprises 300 high-quality, human-written quadruplets: a query on a sensitive topic, an answer, an NPOV rating, and a set of links to source texts elaborating the various points of view. The first key contribution of this paper is a new methodology to create such datasets through iterative rounds of human peer-critique and annotator training, which we release alongside the dataset. The second key contribution is the identification of a highly effective training regime for parameter-efficient reinforcement learning (PE-RL) to improve NPOV generation. We compare and extensively evaluate PE-RL and multiple baselines-including LoRA finetuning (a strong baseline), SFT and RLHF. PE-RL not only improves on overall NPOV quality compared to the strongest baseline ($97.06\%\rightarrow 99.08\%$), but also scores much higher on features linguists identify as key to separating good answers from the best answers ($60.25\%\rightarrow 85.21\%$ for presence of supportive details, $68.74\%\rightarrow 91.43\%$ for absence of oversimplification). A qualitative analysis corroborates this. Finally, our evaluation finds no statistical differences between results on topics that appear in the training dataset and those on separated evaluation topics, which provides strong evidence that our approach to training PE-RL exhibits very effective out of topic generalization.


Introduction to Artificial Consciousness: History, Current Trends and Ethical Challenges

arXiv.org Artificial Intelligence

With the significant progress of artificial intelligence (AI) and consciousness science, artificial consciousness (AC) has recently gained popularity. This work provides a broad overview of the main topics and current trends in AC. The first part traces the history of this interdisciplinary field to establish context and clarify key terminology, including the distinction between Weak and Strong AC. The second part examines major trends in AC implementations, emphasising the synergy between Global Workspace and Attention Schema, as well as the problem of evaluating the internal states of artificial systems. The third part analyses the ethical dimension of AC development, revealing both critical risks and transformative opportunities. The last part offers recommendations to guide AC research responsibly, and outlines the limitations of this study as well as avenues for future research. The main conclusion is that while AC appears both indispensable and inevitable for scientific progress, serious efforts are required to address the far-reaching impact of this innovative research path.


Improving LLM Safety Alignment with Dual-Objective Optimization

arXiv.org Artificial Intelligence

Existing training-time safety alignment techniques for large language models (LLMs) remain vulnerable to jailbreak attacks. Direct preference optimization (DPO), a widely deployed alignment method, exhibits limitations in both experimental and theoretical contexts as its loss function proves suboptimal for refusal learning. Through gradient-based analysis, we identify these shortcomings and propose an improved safety alignment that disentangles DPO objectives into two components: (1) robust refusal training, which encourages refusal even when partial unsafe generations are produced, and (2) targeted unlearning of harmful knowledge. This approach significantly increases LLM robustness against a wide range of jailbreak attacks, including prefilling, suffix, and multi-turn attacks across both in-distribution and out-of-distribution scenarios. Furthermore, we introduce a method to emphasize critical refusal tokens by incorporating a reward-based token-level weighting mechanism for refusal learning, which further improves the robustness against adversarial exploits. Our research also suggests that robustness to jailbreak attacks is correlated with token distribution shifts in the training process and internal representations of refusal and harmful tokens, offering valuable directions for future research in LLM safety alignment. The code is available at https://github.com/wicai24/DOOR-Alignment


Taxation Perspectives from Large Language Models: A Case Study on Additional Tax Penalties

arXiv.org Artificial Intelligence

How capable are large language models (LLMs) in the domain of taxation? Although numerous studies have explored the legal domain in general, research dedicated to taxation remain scarce. Moreover, the datasets used in these studies are either simplified, failing to reflect the real-world complexities, or unavailable as open source. To address this gap, we introduce PLAT, a new benchmark designed to assess the ability of LLMs to predict the legitimacy of additional tax penalties. PLAT is constructed to evaluate LLMs' understanding of tax law, particularly in cases where resolving the issue requires more than just applying related statutes. Our experiments with six LLMs reveal that their baseline capabilities are limited, especially when dealing with conflicting issues that demand a comprehensive understanding. However, we found that enabling retrieval, self-reasoning, and discussion among multiple agents with specific role assignments, this limitation can be mitigated.


LexGenie: Automated Generation of Structured Reports for European Court of Human Rights Case Law

arXiv.org Artificial Intelligence

Analyzing large volumes of case law to uncover evolving legal principles, across multiple cases, on a given topic is a demanding task for legal professionals. Structured topical reports provide an effective solution by summarizing key issues, principles, and judgments, enabling comprehensive legal analysis on a particular topic. While prior works have advanced query-based individual case summarization, none have extended to automatically generating multi-case structured reports. To address this, we introduce LexGenie, an automated LLM-based pipeline designed to create structured reports using the entire body of case law on user-specified topics within the European Court of Human Rights jurisdiction. LexGenie retrieves, clusters, and organizes relevant passages by topic to generate a structured outline and cohesive content for each section. Expert evaluation confirms LexGenie's utility in producing structured reports that enhance efficient, scalable legal analysis.


Designing Speech Technologies for Australian Aboriginal English: Opportunities, Risks and Participation

arXiv.org Artificial Intelligence

In Australia, post-contact language varieties, including creoles and local varieties of international languages, emerged as a result of forced contact between Indigenous communities and English speakers. These contact varieties are widely used, yet are poorly supported by language technologies. This gap presents barriers to participation in civil and economic society for Indigenous communities using these varieties, and reproduces minoritisation of contemporary Indigenous sociolinguistic identities. This paper concerns three questions regarding this context. First, can speech technologies support speakers of Australian Aboriginal English, a local indigenised variety of English? Second, what risks are inherent in such a project? Third, what technology development practices are appropriate for this context, and how can researchers integrate meaningful community participation in order to mitigate risks? We argue that opportunities do exist -- as well as risks -- and demonstrate this through a case study exploring design practices in a real-world project aiming to improve speech technologies for Australian Aboriginal English. We discuss how we integrated culturally appropriate and participatory processes throughout the project. We call for increased support for languages used by Indigenous communities, including contact varieties, which provide practical economic and socio-cultural benefits, provided that participatory and culturally safe practices are enacted.


FairSense-AI: Responsible AI Meets Sustainability

arXiv.org Artificial Intelligence

In this paper, we introduce FairSense-AI: a multimodal framework designed to detect and mitigate bias in both text and images. By leveraging Large Language Models (LLMs) and Vision-Language Models (VLMs), FairSense-AI uncovers subtle forms of prejudice or stereotyping that can appear in content, providing users with bias scores, explanatory highlights, and automated recommendations for fairness enhancements. In addition, FairSense-AI integrates an AI risk assessment component that aligns with frameworks like the MIT AI Risk Repository and NIST AI Risk Management Framework, enabling structured identification of ethical and safety concerns. The platform is optimized for energy efficiency via techniques such as model pruning and mixed-precision computation, thereby reducing its environmental footprint. Through a series of case studies and applications, we demonstrate how FairSense-AI promotes responsible AI use by addressing both the social dimension of fairness and the pressing need for sustainability in large-scale AI deployments. https://vectorinstitute.github.io/FairSense-AI, https://pypi.org/project/fair-sense-ai/ (Sustainability , Responsible AI , Large Language Models , Vision Language Models , Ethical AI , Green AI)


Building Safe GenAI Applications: An End-to-End Overview of Red Teaming for Large Language Models

arXiv.org Artificial Intelligence

The rapid growth of Large Language Models (LLMs) presents significant privacy, security, and ethical concerns. While much research has proposed methods for defending LLM systems against misuse by malicious actors, researchers have recently complemented these efforts with an offensive approach that involves red teaming, i.e., proactively attacking LLMs with the purpose of identifying their vulnerabilities. This paper provides a concise and practical overview of the LLM red teaming literature, structured so as to describe a multi-component system end-to-end. To motivate red teaming we survey the initial safety needs of some high-profile LLMs, and then dive into the different components of a red teaming system as well as software packages for implementing them. We cover various attack methods, strategies for attack-success evaluation, metrics for assessing experiment outcomes, as well as a host of other considerations. Our survey will be useful for any reader who wants to rapidly obtain a grasp of the major red teaming concepts for their own use in practical applications.


Federal judge chooses not to sanction lawyer who admitted using AI in mistake-filled brief

FOX News

The filing in question was related to a case in which Guyer's client Karen Iovino claimed she faced retaliation from employer Michael Stapleton Associates, and "was fired for reporting alleged issues about MSA's contract with the State Department to that agency's Office of Inspector General." In an August filing, Guyer denied citing "'fictitious' cases," and said that the cases did in fact exist, but that they were misquoted and miscited by generative AI. "GPTs generate excellent to brilliant legal arguments," Guyer wrote in a separate declaration, saying that the errors were generated by Atrophic Inc.'s Claude 3 Opus, which is one of several AI tools Guyer says he uses. "I utilize a suite of generative AI technologies for legal research and writing purposes, and GPT legal document briefing," Guyer said.