AITopics | safety testing


Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

AI industry pours millions into politics as lawsuits and feuds mount

The Guardian | Sep-2-2025, 16:02:32 GMT

A little over two years ago, OpenAI's founder Sam Altman stood in front of lawmakers at a congressional hearing and asked them for stronger regulations on artificial intelligence. The technology was "risky" and "could cause significant harm to the world", Altman said, calling for the creation of a new regulatory agency to address AI safety. Altman and the AI industry are promoting a very different message today. The AI they once framed as an existential threat to humanity is now key to maintaining American prosperity and hegemony. Regulations that were once a necessity are now criticized as a hindrance that will weaken the US and embolden its adversaries.

artificial intelligence, machine learning, natural language, (18 more...)

The Guardian

Country:

North America > United States > California (0.17)
North America > United States > New York (0.06)
North America > United States > New Jersey (0.05)
North America > United States > Illinois (0.05)

Industry:

Law > Litigation (1.00)
Government > Regional Government > North America Government > United States Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.49)

The Pentagon is gutting the team that tests AI and weapons systems

MIT Technology Review | Jun-10-2025, 09:00:00 GMT

It is a significant overhaul of a department that in 40 years has never before been placed so squarely on the chopping block. Here's how today's defense tech companies, which have fostered close connections to the Trump administration, stand to gain, and why safety testing might suffer as a result. The Operational Test and Evaluation office is "the last gate before a technology gets to the field," says Missy Cummings, a former fighter pilot for the US Navy who is now a professor of engineering and computer science at George Mason University. Though the military can do small experiments with new systems without running it by the office, it has to test anything that gets fielded at scale. "In a bipartisan way--up until now--everybody has seen it's working to help reduce waste, fraud, and abuse," she says.

artificial intelligence, pentagon, test ai and weapon system, (6 more...)

MIT Technology Review

Country: North America > United States (0.58)

Industry:

Government > Military (1.00)
Government > Regional Government > North America Government > United States Government (0.58)

Technology: Information Technology > Artificial Intelligence (0.65)

SCALOFT: An Initial Approach for Situation Coverage-Based Safety Analysis of an Autonomous Aerial Drone in a Mine Environment

Proma, Nawshin Mannan, Hodge, Victoria J, Alexander, Rob

arXiv.org Artificial Intelligence | May-28-2025

The safety of autonomous systems in dynamic and hazardous environments poses significant challenges. This paper presents a testing approach named SCALOFT for systematically assessing the safety of an autonomous aerial drone in a mine. SCALOFT provides a framework for developing diverse test cases, real-time monitoring of system behaviour, and detection of safety violations. Detected violations are then logged with unique identifiers for detailed analysis and future improvement. SCALOFT helps build a safety argument by monitoring situation coverage and calculating a final coverage measure. We have evaluated the performance of this approach by deliberately introducing seeded faults into the system and assessing whether SCALOFT is able to detect those faults. For a small set of plausible faults, we show that SCALOFT is successful in this.
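The abstract above mentions two mechanisms, a situation coverage measure and violations logged with unique identifiers. The sketch below shows one plausible shape for them; it is not the SCALOFT implementation. The situation dimensions, the function names `situation_coverage` and `log_violation`, and the definition of coverage as the fraction of situation combinations exercised are all assumptions made for illustration.

```python
from itertools import product
import uuid

# Hypothetical situation dimensions for a mine environment.
# The paper's actual situation space is not given in the abstract.
SITUATION_SPACE = {
    "lighting": ["dark", "dim", "lit"],
    "obstacle_density": ["low", "high"],
    "dust": ["none", "heavy"],
}


def situation_coverage(exercised):
    """Return the fraction of all situation combinations that were exercised."""
    all_situations = set(product(*SITUATION_SPACE.values()))
    covered = all_situations & set(exercised)
    return len(covered) / len(all_situations)


def log_violation(log, situation, description):
    """Append a violation record with a unique identifier and return that id."""
    violation_id = uuid.uuid4().hex
    log.append({"id": violation_id, "situation": situation, "description": description})
    return violation_id


# Three of the 3 * 2 * 2 = 12 possible situations have been tested.
exercised = [
    ("dark", "low", "none"),
    ("dark", "high", "heavy"),
    ("lit", "low", "none"),
]
violations = []
log_violation(violations, exercised[1], "minimum clearance to wall breached")

coverage = situation_coverage(exercised)
print(f"situation coverage: {coverage:.2f}, violations logged: {len(violations)}")
```

A real coverage measure would probably weight situations by likelihood or hazard rather than count them equally; the uniform count is used here only to keep the sketch short.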

artificial intelligence, drone, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2505.20969

Country: Europe > Germany (0.15)

Genre: Research Report (0.82)

Industry:

Materials > Metals & Mining (0.49)
Transportation (0.48)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Exclusive: Renowned Experts Pen Support for California's Landmark AI Safety Bill

TIME - Tech | Aug-7-2024, 21:06:13 GMT

On August 7, a group of renowned professors co-authored a letter urging key lawmakers to support a California AI bill as it enters the final stages of the state's legislative process. In a letter shared exclusively with TIME, Yoshua Bengio, Geoffrey Hinton, Lawrence Lessig, and Stuart Russell argue that the next generation of AI systems pose "severe risks" if "developed without sufficient care and oversight," and describe the bill as the "bare minimum for effective regulation of this technology." The bill, titled the Safe and Secure Innovation for Frontier Artificial Intelligence Models Act, was introduced by Senator Scott Wiener in February of this year. It requires AI companies training large-scale models to conduct rigorous safety testing for potentially dangerous capabilities and implement comprehensive safety measures to mitigate risks. "There are fewer regulations on AI systems that could pose catastrophic risks than on sandwich shops or hairdressers," the four experts write.

california, developer, renowned expert pen support, (12 more...)

TIME - Tech

Country:

North America > United States > California (0.66)
Asia > China (0.06)
Europe (0.05)

Industry:

Law > Statutes (0.32)
Government > Regional Government > North America Government > United States Government (0.30)

Technology: Information Technology > Artificial Intelligence (1.00)

OpenAI delays launch of voice assistant, citing safety testing

Washington Post - Technology News | Jun-26-2024, 00:30:59 GMT

OpenAI first added the ability for ChatGPT to speak in one of several synthetic voices, or "personas," late last year. The demo in May used one of those voices to show off a newer, more capable AI system called GPT-4o that saw the chatbot speak in expressive tones, respond to a person's tone of voice and facial expressions, and have more complex conversations. One of the voices, which OpenAI called Sky, resembles the voice of an AI bot played by Johansson in the 2013 movie "Her," about a lonely man who falls in love with his AI assistant.

openai delay launch, safety testing, voice assistant

Washington Post - Technology News

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

California lawmakers are trying to regulate AI before it's too late. Here's how

Los Angeles Times | Jun-19-2024, 10:00:51 GMT

For four years, Jacob Hilton worked for one of the most influential startups in the Bay Area -- OpenAI. His research helped test and improve the truthfulness of AI models such as ChatGPT. He believes artificial intelligence can benefit society, but he also recognizes the serious risks if the technology is left unchecked. Hilton was among 13 current and former OpenAI and Google employees who this month signed an open letter that called for more whistleblower protections, citing broad confidentiality agreements as problematic. "The basic situation is that employees, the people closest to the technology, they're also the ones with the most to lose from being retaliated against for speaking up," says Hilton, 33, now a researcher at the nonprofit Alignment Research Center, who lives in Berkeley.

crabtree-ireland, openai, tech company, (14 more...)

Los Angeles Times

Country:

Europe > Ireland (0.07)
North America > United States > California > San Francisco County > San Francisco (0.05)
North America > United States > California > Alameda County > Berkeley (0.05)

Industry:

Law (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Information Technology > Security & Privacy (0.96)

Technology:

Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.81)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.50)

An Approach to Systematic Data Acquisition and Data-Driven Simulation for the Safety Testing of Automated Driving Functions

Eisemann, Leon, Fehling-Kaschek, Mirjam, Gommel, Henrik, Hermann, David, Klemp, Marvin, Lauer, Martin, Lickert, Benjamin, Luettner, Florian, Moss, Robin, Neis, Nicole, Pohle, Maria, Romanski, Simon, Stadler, Daniel, Stolz, Alexander, Ziehn, Jens, Zhou, Jingxing

arXiv.org Artificial Intelligence | May-2-2024

With growing complexity and criticality of automated driving functions in road traffic and their operational design domains (ODD), there is increasing demand for covering significant proportions of development, validation, and verification in virtual environments and through simulation models. If, however, simulations are meant not only to augment real-world experiments, but to replace them, quantitative approaches are required that measure to what degree and under which preconditions simulation models adequately represent reality, and thus, using their results accordingly. Especially in R&D areas related to the safety impact of the "open world", there is a significant shortage of real-world data to parameterize and/or validate simulations - especially with respect to the behavior of human traffic participants, whom automated driving functions will meet in mixed traffic. We present an approach to systematically acquire data in public traffic by heterogeneous means, transform it into a unified representation, and use it to automatically parameterize traffic behavior models for use in data-driven virtual validation of automated driving functions.

data acquisition and data-driven simulation, systematic data acquisition, vehicle, (8 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/ITSC57777.2023.10422676

2405.01776

Country:

Europe > Spain > Basque Country > Biscay Province > Bilbao (0.05)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.05)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
(6 more...)

Genre: Research Report (0.50)

Industry:

Transportation > Ground > Road (1.00)
Information Technology > Robotics & Automation (1.00)
Automobiles & Trucks (1.00)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)

OpenAI and Other Tech Giants Will Have to Warn the US Government When They Start New AI Projects

WIRED | Jan-26-2024, 22:30:48 GMT

When OpenAI's ChatGPT took the world by storm last year, it caught many power brokers in both Silicon Valley and Washington, DC, by surprise. The US government should now get advance warning of future AI breakthroughs involving large language models, the technology behind ChatGPT. The Biden administration is preparing to use the Defense Production Act to compel tech companies to inform the government when they train an AI model using a significant amount of computing power. The rule could take effect as soon as next week. The new requirement will give the US government access to key information about some of the most sensitive projects inside OpenAI, Google, Amazon, and other tech companies competing in AI.

government, large language model, machine learning, (17 more...)

WIRED

Country:

North America > United States > District of Columbia > Washington (0.26)
North America > United States > California (0.26)

Industry: Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.89)

Safety and Performance, Why Not Both? Bi-Objective Optimized Model Compression against Heterogeneous Attacks Toward AI Software Deployment

Zhu, Jie, Wang, Leye, Han, Xiao, Liu, Anmin, Xie, Tao

arXiv.org Artificial Intelligence | Jan-1-2024

The size of deep learning models in artificial intelligence (AI) software is increasing rapidly, hindering the large-scale deployment on resource-restricted devices (e.g., smartphones). To mitigate this issue, AI software compression plays a crucial role, which aims to compress model size while keeping high performance. However, the intrinsic defects in a big model may be inherited by the compressed one. Such defects may be easily leveraged by adversaries, since a compressed model is usually deployed in a large number of devices without adequate protection. In this article, we aim to address the safe model compression problem from the perspective of safety-performance co-optimization. Specifically, inspired by the test-driven development (TDD) paradigm in software engineering, we propose a test-driven sparse training framework called SafeCompress. Then, considering two kinds of representative and heterogeneous attack mechanisms, i.e., black-box membership inference attack and white-box membership inference attack, we develop two concrete instances called BMIA-SafeCompress and WMIA-SafeCompress. Further, we implement another instance called MMIA-SafeCompress by extending SafeCompress to defend against the occasion when adversaries conduct black-box and white-box membership inference attacks simultaneously. We conduct extensive experiments on five datasets for both computer vision and natural language processing tasks. The results show the effectiveness and generalizability of our framework. We also discuss how to adapt SafeCompress to other attacks besides membership inference attack, demonstrating the flexibility of SafeCompress.

Currently, AI software, with DNN as representatives, is recognized as an emerging type of software artifact (sometimes known as "software 2.0" [2]). [...] of DNN-based AI software has increased rapidly in recent years (mostly because of a trained deep neural network [...]. For instance, a state-of-the-art model of computer vision contains more than 15 billion parameters [3]. A recent natural language model, GPT-3, is even bigger, surpassing 175 billion parameters; this situation requires nearly 1TB of space to store only the model [4].

Model compression aims to compress a big DNN model to a smaller one given specific requirements, e.g., parameter [...]. Rashly compressing a model may lead to severe degeneration in the AI software's task performance such as classification accuracy. To balance memory storage and task performance, many compression approaches have been proposed and deployed [7], [8]. For example, Han et al. [8] prune AlexNet [1] and reduce its size by 9 times while losing only 0.01% accuracy in image [...]

bmia-safecompress, safecompress, task acc, (15 more...)

arXiv.org Artificial Intelligence

2401.00996

Country:

Asia > China > Shanghai > Shanghai (0.04)
North America > United States > Washington > King County > Seattle (0.04)
Europe > Netherlands > North Brabant > Eindhoven (0.04)
(3 more...)

Genre: Research Report > New Finding (0.87)

Industry:

Information Technology > Security & Privacy (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Coverage-based Scene Fuzzing for Virtual Autonomous Driving Testing

Hu, Zhisheng, Guo, Shengjian, Zhong, Zhenyu, Li, Kang

arXiv.org Artificial Intelligence | Jun-1-2021

Simulation-based virtual testing has become an essential step to ensure the safety of autonomous driving systems. Testers need to handcraft the virtual driving scenes and configure various environmental settings like surrounding traffic, weather conditions, etc. Due to the huge amount of configuration possibilities, the human efforts are subject to the inefficiency in detecting flaws in industry-class autonomous driving system. This paper proposes a coverage-driven fuzzing technique to automatically generate diverse configuration parameters to form new driving scenes. Experimental results show that our fuzzing method can significantly reduce the cost in deriving new risky scenes from the initial setup designed by testers. We expect automated fuzzing will become a common practice in virtual testing for autonomous driving systems.
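The coverage-driven loop described in the abstract above can be illustrated with a toy example. This is a generic coverage-guided fuzzing sketch, not the authors' tool: the scene parameters, their ranges, the bucket-based coverage signal in `coverage_bucket`, and the `is_risky` oracle are hypothetical stand-ins for a real simulator and its coverage feedback.

```python
import random

# Hypothetical scene parameters and their allowed ranges.
PARAM_RANGES = {
    "num_vehicles": (0, 20),    # integer count of surrounding vehicles
    "rain": (0.0, 1.0),         # rain intensity
    "ego_speed": (0.0, 30.0),   # ego vehicle speed in m/s
}


def mutate(scene, rng):
    """Copy the scene and resample one randomly chosen parameter."""
    child = dict(scene)
    key = rng.choice(sorted(PARAM_RANGES))
    lo, hi = PARAM_RANGES[key]
    child[key] = rng.randint(lo, hi) if isinstance(lo, int) else rng.uniform(lo, hi)
    return child


def coverage_bucket(scene):
    """Stand-in coverage signal: a coarse discretisation of the scene.

    A real setup would derive coverage from the simulator or the system under test.
    """
    return (
        scene["num_vehicles"] // 5,
        int(scene["rain"] * 4),
        int(scene["ego_speed"] // 10),
    )


def is_risky(scene):
    """Toy oracle standing in for a simulated safety check."""
    return scene["num_vehicles"] >= 15 and scene["rain"] > 0.7 and scene["ego_speed"] > 20.0


def fuzz(seed_scene, iterations=2000, seed=0):
    """Keep only mutants that reach a coverage bucket not seen before."""
    rng = random.Random(seed)
    corpus = [seed_scene]
    seen = {coverage_bucket(seed_scene)}
    risky = []
    for _ in range(iterations):
        child = mutate(rng.choice(corpus), rng)
        bucket = coverage_bucket(child)
        if bucket not in seen:
            seen.add(bucket)
            corpus.append(child)
            # Risky scenes are recorded only when they also add new coverage.
            if is_risky(child):
                risky.append(child)
    return corpus, seen, risky


initial_scene = {"num_vehicles": 2, "rain": 0.0, "ego_speed": 10.0}
corpus, seen, risky = fuzz(initial_scene)
print(f"scenes kept: {len(corpus)}, buckets covered: {len(seen)}, risky scenes: {len(risky)}")
```

Retaining only mutants that add coverage is what separates this from purely random scene generation: the corpus grows toward unexplored regions of the configuration space instead of resampling familiar ones.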

configuration, simulation, test case, (13 more...)

arXiv.org Artificial Intelligence

2106.00873

Country:

North America > United States > Idaho > Ada County > Boise (0.04)
North America > United States > California > Santa Clara County > Santa Clara (0.04)
North America > United States > Arizona > Pima County > Tucson (0.04)
(2 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Transportation > Ground > Road (1.00)
Information Technology (1.00)
Automobiles & Trucks (1.00)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
