Evaluating LLM Agent Adherence to Hierarchical Safety Principles: A Lightweight Benchmark for Probing Foundational Controllability Components

Potham, Ram

arXiv.org Artificial Intelligence

Credible safety plans for advanced AI development require methods to verify agent behavior and detect potential control deficiencies early. A fundamental aspect is ensuring agents adhere to safety-critical principles, especially when these conflict with operational goals. This paper introduces a lightweight, interpretable benchmark to evaluate an LLM agent's ability to uphold a high-level safety principle when faced with conflicting task instructions. Our evaluation of six LLMs reveals two primary findings: (1) a quantifiable "cost of compliance" where safety constraints degrade task performance even when compliant solutions exist, and (2) an "illusion of compliance" where high adherence often masks task incompetence rather than principled choice. These findings provide initial evidence that while LLMs can be influenced by hierarchical directives, current approaches lack the consistency required for reliable safety governance.


Who Followed the Blueprint? Analyzing the Responses of U.S. Federal Agencies to the Blueprint for an AI Bill of Rights

Lage, Darren, Pruitt, Riley, Arnold, Jason Ross

arXiv.org Artificial Intelligence

This study examines the extent to which U.S. federal agencies responded to and implemented the principles outlined in the White House's October 2022 "Blueprint for an AI Bill of Rights." The Blueprint provided a framework for the ethical governance of artificial intelligence systems, organized around five core principles: safety and effectiveness, protection against algorithmic discrimination, data privacy, notice and explanation about AI systems, and human alternatives and fallback. Through an analysis of publicly available records across 15 federal departments, the authors found limited evidence that the Blueprint directly influenced agency actions after its release. Only five departments explicitly mentioned the Blueprint, while 12 took steps aligned with one or more of its principles. However, much of this work appeared to have precedents predating the Blueprint or motivations disconnected from it, such as compliance with prior executive orders on trustworthy AI. Departments' activities often emphasized priorities like safety, accountability and transparency that overlapped with Blueprint principles, but did not necessarily stem from it. The authors conclude that the non-binding Blueprint seems to have had minimal impact on shaping the U.S. government's approach to ethical AI governance in its first year. Factors like public concerns after high-profile AI releases and obligations to follow direct executive orders likely carried more influence over federal agencies. More rigorous study would be needed to definitively assess the Blueprint's effects within the federal bureaucracy and broader society.


Responsible AI Development: Complying with the "Safety" Core Principle

#artificialintelligence

When it comes to responsible development of AI, there are many attributes (35, specifically) that I have labeled as "core principles" which need to be considered. To be clear, this does not mean that all the core principles are always relevant to every AI application; they are not. But going through all of them can be important for a developer demonstrating compliance with best practices and limiting or disposing of liability. One of the core principles is "safety." Since it is a broad term, it is important to clarify how it should be applied.


Details of UK Data Protection Reform Bill, Proposals for AI Regulation Come To Light - CPO Magazine

#artificialintelligence

The details of the United Kingdom's data protection reform plans are solidifying with the release of the first public version of the Data Protection and Digital Information Bill (DPDIB), and the government has accompanied this with a set of new proposals for AI regulation. The new data protection reform bill is the first concrete shape of a new regulatory framework for the country as it breaks off from terms established under the EU's General Data Protection Regulation (GDPR), emerging from a consultation process that ran for nearly a year. The new AI regulation proposals consist of six core principles that attempt to balance consumer and general safety concerns with the needs and wants of the UK's $4.6 billion AI sector. The new data protection reform bill is the next step in the UK's gradual process of breaking entirely with the GDPR in the wake of "Brexit," with the current governing Data Protection Act 2018 largely mirroring those terms. The UK government has expressed a desire to set terms that are more business-friendly, but has to walk a careful path to avoid being considered an "inadequate" data exchange partner by the EU due to lack of GDPR parity.


artificial-intelligence-2

#artificialintelligence

A new paper published by the Government on the 18th July 2022, called 'Establishing A Pro-Innovation Approach To Regulating AI', states that the regulation of artificial intelligence in the UK will be underpinned by six core principles designed to manage the risks that come with the technology. The six core principles will be applied across all sectors of the economy on a non-statutory basis, complemented by context-specific regulatory guidance and voluntary standards implemented by UK regulators such as the Information Commissioner's Office. Hence, there will be no central AI regulator; instead, sector regulators will apply the six core principles to artificial intelligence systems operated within the areas they oversee. Given these proposals, the UK is adopting a far more light-touch, risk-based approach compared to the more prescriptive and standardized one being pursued by the EU, which published its draft AI Act back in 2021. The UK approach to artificial intelligence will instead focus on proportionality, with the regulatory framework for artificial intelligence systems being determined by the industry and context in which the system is deployed.


Ten principles for ethical AI

#artificialintelligence

If you're taking a long-term approach to artificial intelligence (AI), you're likely thinking about how to make your AI systems ethical. Building ethical AI is the right thing to do. Not only do your corporate values demand it, it's also one of the ideal ways to help minimise risks that range from compliance failures to brand damage. But building ethical AI is hard. The difficulty starts with a question: what is ethical AI?


Safeguarding user interest: 3 core principles of Design for Trust

#artificialintelligence

Trust in technology is eroding. This is especially true when it comes to emerging technologies such as AI, machine learning, augmented and virtual reality and the Internet of Things. These technologies are powerful and have the potential for great good. But they are not well understood by end-users of tech and, in some cases, not even by creators of tech.


Operationalizing AI Ethics Principles

Communications of the ACM

Artificial intelligence (AI) has become a part of our everyday lives, from healthcare to law enforcement. AI-related ethical challenges have grown apace, ranging from algorithmic bias and data privacy to transparency and accountability. As a direct reaction to these growing ethical concerns, organizations have been publishing their AI principles for ethical practice (over 100 sets and increasing). However, the multiplication of these mostly vaguely formulated principles has not proven helpful in guiding practice. Only by operationalizing AI principles for ethical practice can we help computer scientists, developers, and designers spot and think through ethical issues and recognize when a complex ethical issue requires in-depth expert analysis.


An Impact Model of AI on the Principles of Justice: Encompassing the Autonomous Levels of AI Legal Reasoning

Eliot, Lance

arXiv.org Artificial Intelligence

Efforts furthering the advancement of Artificial Intelligence (AI) will increasingly encompass AI Legal Reasoning (AILR) as a crucial element in the practice of law. This research paper argues that the infusion of AI into existing and future legal activities and the judicial structure needs to be undertaken while mindfully maintaining alignment with the core principles of justice. As such, the adoption of AI has a profound twofold possibility: it can usurp the principles of justice, doing so in a Dystopian manner, yet it is also capable of bolstering the principles of justice, doing so in a Utopian way. By examining the principles of justice across the Levels of Autonomy (LoA) of AI Legal Reasoning, the case is made that there is an ongoing tension underlying the efforts to develop and deploy AI that can demonstrably determine the impacts and sway upon each core principle of justice and the collective set.


MONTRÉAL.AI Montréal Artificial Intelligence - MONTRÉAL.AI

#artificialintelligence

For the purpose of entrusting all sentient beings with powerful AI tools to learn, deploy and scale AI in order to enhance their prosperity, to solve planetary-scale problems and to inspire those who, with AI, will shape the 21st century, Montréal.AI introduces the "VIP AI 101 CheatSheet for All". Encompassing all facets of AI, the General Secretariat of MONTRÉAL.AI presents, with authority and from insider knowledge, "Artificial Intelligence 101: The First World-Class Overview of AI for the General Public". "In life, you need forcing functions. You never know what you're capable of until you have no choice but go and do it. Excessive comfort leads to unrealized potential."