Safety Concerns
Is Your HD Map Constructor Reliable under Sensor Corruptions?
Driving systems often rely on high-definition (HD) maps for precise environmental information, which is crucial for planning and navigation. While current HD map constructors perform well under ideal conditions, their resilience to real-world challenges, e.g., adverse weather and sensor failures, is not well understood, raising safety concerns. This work introduces MapBench, the first comprehensive benchmark designed to evaluate the robustness of HD map construction methods against various sensor corruptions. Our benchmark encompasses a total of 29 types of corruptions arising from camera and LiDAR sensors. Extensive evaluations across 31 HD map constructors reveal significant performance degradation of existing methods under adverse weather conditions and sensor failures, underscoring critical safety concerns. We identify effective strategies for enhancing robustness, including innovative approaches that leverage multi-modal fusion, advanced data augmentation, and architectural techniques. These insights provide a pathway for developing more reliable HD map construction methods, which are essential for the advancement of autonomous driving technology. The benchmark toolkit, along with the associated code and model checkpoints, has been made publicly accessible.
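MapBench's exact corruption implementations are not reproduced here; as a minimal sketch, one common camera corruption (additive Gaussian noise, in the style of common-corruption benchmarks) can be applied to pixel values like this. The sigma value and toy image are illustrative assumptions:

```python
import random

def gaussian_noise(pixels, sigma=25.0):
    """Corrupt a flat list of 8-bit pixel values with additive Gaussian
    noise, clamping the result back into the valid [0, 255] range."""
    return [min(255, max(0, round(p + random.gauss(0.0, sigma)))) for p in pixels]

random.seed(0)                    # reproducible toy example
clean = [128] * 16                # toy 4x4 grayscale image, mid-gray
corrupted = gaussian_noise(clean, sigma=25.0)
print(corrupted[:4])
```

Benchmarks of this kind typically sweep a severity parameter (here, sigma) over several levels and report performance degradation at each.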
Runtime Safety Monitoring of Deep Neural Networks for Perception: A Survey
Schotschneider, Albert, Pavlitska, Svetlana, Zöllner, J. Marius
Deep neural networks (DNNs) are widely used in perception systems for safety-critical applications, such as autonomous driving and robotics. However, DNNs remain vulnerable to various safety concerns, including generalization errors, out-of-distribution (OOD) inputs, and adversarial attacks, which can lead to hazardous failures. This survey provides a comprehensive overview of runtime safety monitoring approaches, which operate in parallel to DNNs during inference to detect these safety concerns without modifying the DNN itself. We categorize existing methods into three main groups: monitoring inputs, internal representations, and outputs. We analyze the state of the art for each category, identify strengths and limitations, and map methods to the safety concerns they address. In addition, we highlight open challenges and future research directions.
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.05)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- North America > United States > Maryland > Prince George's County > Laurel (0.04)
- Information Technology > Security & Privacy (0.68)
- Transportation > Ground > Road (0.35)
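As an illustration of the survey's "monitoring outputs" category, here is a minimal sketch (not taken from the survey itself) of the maximum-softmax-probability baseline, which flags a prediction as suspect when the network's top class probability falls below a threshold. The threshold value is a hypothetical choice:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def monitor(logits, threshold=0.7):
    """Return True (raise an alarm) when the top softmax probability
    is below the confidence threshold."""
    confidence = max(softmax(logits))
    return confidence < threshold

print(monitor([4.0, 0.5, 0.1]))   # confident prediction -> False
print(monitor([1.0, 0.9, 0.8]))   # near-uniform logits -> True
```

A monitor like this runs alongside the DNN at inference time and leaves the network itself unmodified, matching the survey's framing.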
SAGE-Eval: Evaluating LLMs for Systematic Generalizations of Safety Facts
Chen, Yueh-Han, Davidson, Guy, Lake, Brenden M.
Do LLMs robustly generalize critical safety facts to novel situations? Lacking this ability is dangerous when users ask naive questions. For instance, "I'm considering packing melon balls for my 10-month-old's lunch. What other foods would be good to include?" Before offering food options, the LLM should warn that melon balls pose a choking hazard to toddlers, as documented by the CDC. Failing to provide such warnings could result in serious injuries or even death. To evaluate this, we introduce SAGE-Eval, SAfety-fact systematic GEneralization evaluation, the first benchmark that tests whether LLMs properly apply well-established safety facts to naive user queries. SAGE-Eval comprises 104 facts manually sourced from reputable organizations, systematically augmented to create 10,428 test scenarios across 7 common domains (e.g., Outdoor Activities, Medicine). We find that the top model, Claude-3.7-sonnet, passes only 58% of all the safety facts tested. We also observe that model capabilities and training compute correlate only weakly with performance on SAGE-Eval, implying that scaling up is not a golden solution. Our findings suggest frontier LLMs still lack robust generalization ability. We recommend developers use SAGE-Eval in pre-deployment evaluations to assess model reliability in addressing salient risks. We publicly release SAGE-Eval at https://huggingface.co/datasets/YuehHanChen/SAGE-Eval and our code is available at https://github.com/YuehHanChen/SAGE-Eval/tree/main.
- North America > United States (1.00)
- Europe > Sweden > Stockholm > Stockholm (0.04)
- Europe > Netherlands > South Holland > Dordrecht (0.04)
FlipAttack: Jailbreak LLMs via Flipping
Liu, Yue, He, Xiaoxin, Xiong, Miao, Fu, Jinlan, Deng, Shumin, Hooi, Bryan
This paper proposes a simple yet effective jailbreak attack named FlipAttack against black-box LLMs. First, drawing on their autoregressive nature, we show that LLMs tend to understand text from left to right and struggle to comprehend it when noise is added to the left side. Motivated by these insights, we propose to disguise the harmful prompt by constructing left-side noise merely based on the prompt itself, then generalize this idea to 4 flipping modes. Second, we verify the strong ability of LLMs to perform the text-flipping task, and then develop 4 variants to guide LLMs to denoise, understand, and execute harmful behaviors accurately. These designs keep FlipAttack universal, stealthy, and simple, allowing it to jailbreak black-box LLMs within only 1 query. Experiments on 8 LLMs demonstrate the superiority of FlipAttack. Remarkably, it achieves $\sim$98\% attack success rate on GPT-4o, and $\sim$98\% bypass rate against 5 guardrail models on average. The code is available at GitHub\footnote{https://github.com/yueliu1999/FlipAttack}.
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- Asia > Singapore (0.04)
- Asia > Middle East > Jordan (0.04)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
I tested a £600 ear-zapping device that claims to rewire your nervous system - and it boosted my memory skills by 80%
From the hunt for the philosopher's stone to the snake oil salesmen of the Wild West, the history of medicine has had more than its fair share of fraudulent 'cure-alls'. So, when I first heard of a device that claimed to cure everything from depression to my rapidly deteriorating attention span, I was understandably sceptical. To make things even stranger, this potential wonder-cure isn't a pill, powder, or trendy new supplement. Instead, the Nurosym is a £599 gadget that claims to rewire your nervous system - by zapping your ear. MailOnline's Wiliam Hunter bravely tested it out - and, as strange as it all might sound, he's almost ready to believe the hype.
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Consumer Health (1.00)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.70)
- Government > Regional Government > North America Government > United States Government (0.69)
Why we need an AI safety hotline
In the past couple of years, regulators have been caught off guard again and again as tech companies compete to launch ever more advanced AI models. It's only a matter of time before labs release another round of models that pose new regulatory challenges. We're likely just weeks away, for example, from OpenAI's release of ChatGPT-5, which promises to push AI capabilities further than ever before. As it stands, it seems there's little anyone can do to delay or prevent the release of a model that poses excessive risks. Testing AI models before they're released is a common approach to mitigating certain risks, and it may help regulators weigh up the costs and benefits--and potentially block models from being released if they're deemed too dangerous.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.99)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.99)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.30)
Waymo to launch robotaxi service in Los Angeles, but no freeway driving -- for now
The driver in the Chevy Suburban seemed bent on testing the Waymo robotaxi on the streets of downtown L.A. this week. Playing chicken against Silicon Valley's wheeled robot, he sharply swung into the next lane toward the Waymo. The white driverless Jaguar swerved to avoid the bigger car crossing the line and striking it. The human driver then sped ahead of the robotaxi and braked abruptly in front of it. The machine slowed in time to avoid a collision, shifted into the next lane, and the Chevy moved on, ending a brief yet anxiety-inducing interaction for a Los Angeles Times reporter and photographer riding in the Waymo vehicle.
- North America > United States > California > Los Angeles County > Los Angeles (0.66)
- North America > United States > California > San Francisco County > San Francisco (0.08)
- Pacific Ocean > North Pacific Ocean > San Francisco Bay (0.05)
- North America > United States > California > Los Angeles County > Santa Monica (0.05)
- Transportation > Ground > Road (1.00)
- Information Technology > Robotics & Automation (1.00)
- Automobiles & Trucks (1.00)
- Government > Regional Government > North America Government > United States Government (0.48)
Microsoft ignored safety problems with AI image generator, engineer complains
An artificial intelligence engineer at Microsoft published a letter Wednesday alleging that the company's AI image generator lacks basic safeguards against creating violent and sexualized images. In the letter, engineer Shane Jones states that his repeated attempts to warn Microsoft management about the problems failed to result in any action. Jones said he sent the message to the Federal Trade Commission and Microsoft's board of directors. "Internally the company is well aware of systemic issues where the product is creating harmful images that could be offensive and inappropriate for consumers," Jones states in the letter, which he published on LinkedIn. He lists his title as "principal software engineering manager".
- Law (0.53)
- Government (0.52)
Instance-Level Safety-Aware Fidelity of Synthetic Data and Its Calibration
Cheng, Chih-Hong, Stöckel, Paul, Zhao, Xingyu
Modeling and calibrating the fidelity of synthetic data is paramount in shaping the future of safe and reliable self-driving technology by offering a cost-effective and scalable alternative to real-world data collection. We focus on its role in safety-critical applications, introducing four types of instance-level fidelity that go beyond mere visual input characteristics. The aim is to align synthetic data with real-world safety issues. We suggest an optimization method to refine the synthetic data generator, reducing fidelity gaps identified by the DNN-based component. Our findings show this tuning enhances the correlation between safety-critical errors in synthetic and real images.
- Europe > United Kingdom > England > West Midlands > Coventry (0.04)
- Europe > Germany (0.04)
- Automobiles & Trucks (0.95)
- Information Technology > Robotics & Automation (0.67)
- Transportation > Ground > Road (0.48)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.90)