Pre-Deployment Information Sharing: A Zoning Taxonomy for Precursory Capabilities
Pistillo, Matteo, Stix, Charlotte
There is a growing consensus that information is the "lifeblood of good governance" (Kolt et al., 2024) and that information sharing should be one of the "natural initial target[s]" of AI governance (Bommasani et al., 2024). Up-to-date and reliable information about AI systems' capabilities, and about how those capabilities will develop in the future, can help developers, governments, and researchers advance safety evaluations (Frontier Model Forum, 2024), develop best practices (UK DSIT, 2023), and respond effectively to the new risks posed by frontier AI (Kolt et al., 2024). Information sharing also supports regulatory visibility (Anderljung et al., 2023) and can thus enable better-informed AI governance (O'Brien et al., 2024). Further, access to knowledge about AI systems' potential risks allows claims about AI systems to be scrutinized more effectively (Brundage et al., 2020). By contrast, information asymmetries could lead regulators to miscalibrated over-regulation or under-regulation of AI (Ball & Kokotajlo, 2024) and could contribute to the "pacing problem," a situation in which government oversight consistently lags behind technology development (Marchant et al., 2011). In short, there is a strong case for information sharing being one "key to making AI go well" (Ball & Kokotajlo, 2024). The Frontier AI Safety Commitments ("FAISC") are an important step towards more comprehensive information sharing by AI developers.
Towards evaluations-based safety cases for AI scheming
Balesni, Mikita, Hobbhahn, Marius, Lindner, David, Meinke, Alexander, Korbak, Tomek, Clymer, Joshua, Shlegeris, Buck, Scheurer, Jérémy, Stix, Charlotte, Shah, Rusheb, Goldowsky-Dill, Nicholas, Braun, Dan, Chughtai, Bilal, Evans, Owain, Kokotajlo, Daniel, Bushnaq, Lucius
We sketch how developers of frontier AI systems could construct a structured rationale, a 'safety case', that an AI system is unlikely to cause catastrophic outcomes through scheming. Scheming is a potential threat model in which AI systems covertly pursue misaligned goals, hiding their true capabilities and objectives. In this report, we propose three arguments that safety cases could use in relation to scheming. For each argument, we sketch how evidence could be gathered from empirical evaluations and what assumptions would need to be met to provide strong assurance. First, developers of frontier AI systems could argue that AI systems are not capable of scheming (Scheming Inability). Second, one could argue that AI systems are not capable of causing harm through scheming (Harm Inability). Third, one could argue that control measures around the AI systems would prevent unacceptable outcomes even if the AI systems intentionally attempted to subvert them (Harm Control). Additionally, we discuss how safety cases might be supported by evidence that an AI system is reasonably aligned with its developers (Alignment). Finally, we point out that many of the assumptions required to make these safety arguments have not been confidently satisfied to date, and that satisfying them requires progress on multiple open research problems.
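To make the structure of such a safety case more concrete, the sketch below represents the report's three argument types (Scheming Inability, Harm Inability, Harm Control) plus supporting Alignment evidence as a simple data structure. This is a hypothetical illustration, not a formalism from the paper: all class names, fields, and the aggregation rule are assumptions introduced here for clarity.

```python
# Hypothetical sketch of a scheming safety case as a data structure.
# Names and logic are illustrative assumptions, not the authors' method.
from dataclasses import dataclass, field
from enum import Enum


class ArgumentType(Enum):
    SCHEMING_INABILITY = "the AI system is not capable of scheming"
    HARM_INABILITY = "the AI system is not capable of causing harm through scheming"
    HARM_CONTROL = "control measures prevent unacceptable outcomes despite scheming attempts"
    ALIGNMENT = "supporting evidence that the system is reasonably aligned with its developers"


@dataclass
class EvaluationEvidence:
    """One empirical evaluation result offered in support of an argument."""
    name: str
    passed: bool
    assumptions: list[str] = field(default_factory=list)  # assumptions the evaluation relies on


@dataclass
class SafetyCaseArgument:
    argument_type: ArgumentType
    evidence: list[EvaluationEvidence] = field(default_factory=list)

    def is_supported(self) -> bool:
        # Illustrative rule: an argument counts as supported only if it has
        # evidence and every evaluation passed; unmet assumptions still need
        # separate expert review.
        return bool(self.evidence) and all(e.passed for e in self.evidence)


@dataclass
class SchemingSafetyCase:
    """Top-level claim: the system is unlikely to cause catastrophic outcomes through scheming."""
    system_name: str
    arguments: list[SafetyCaseArgument]

    def supported_arguments(self) -> list[ArgumentType]:
        return [a.argument_type for a in self.arguments if a.is_supported()]
```

In practice, a single case would likely combine several argument types (e.g., Harm Control plus Alignment evidence) rather than rely on any one of them alone, which is why the sketch keeps them as a list under one top-level claim.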