A Broader Impact
Our work designs privacy attacks, which have the potential to cause harm. The main limitation of our work is the strong threat model under which our attacks operate. All of our results on CIFAR-10 use fewer than 30000 trained models. We plot the effectiveness of Transfer LiRA in Figure 7. ROC curves for our student attacks are also provided; further qualitative examples can be found in Figure 9. Ablations of score information and of CIFAR-10 with duplicates are found in Figure 11. We consider multiple distillation threat models simultaneously.
Students Parrot Their Teachers: Membership Inference on Model Distillation
Matthew Jagielski
Model distillation is frequently proposed as a technique to reduce the privacy leakage of machine learning. These empirical privacy defenses rely on the intuition that distilled "student" models protect the privacy of training data, as they only interact with this data indirectly through a "teacher" model. In this work, we design membership inference attacks to systematically study the privacy provided by knowledge distillation to both the teacher and student training sets. Our new attacks show that distillation alone provides only limited privacy across a number of domains. We explain the success of our attacks on distillation by showing that membership inference attacks on a private dataset can succeed even if the target model is never queried on any actual training points, but only on inputs whose predictions are highly influenced by training data. Finally, we show that our attacks are strongest when student and teacher sets are similar, or when the attacker can poison the teacher set.
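The membership-inference setup described above can be illustrated with a minimal loss-threshold baseline. This is a simpler attack than the paper's LiRA-style methods, and the function names and toy losses below are illustrative assumptions, not the authors' code:

```python
import numpy as np

def mia_scores(losses):
    """Membership score for each example: members of the training set
    tend to incur lower loss, so score = negative loss."""
    return -np.asarray(losses, dtype=float)

def attack_auc(member_scores, nonmember_scores):
    """Attack AUC: the probability that a randomly chosen member outscores
    a randomly chosen non-member (ties count half), via exhaustive pairs."""
    m = np.asarray(member_scores, dtype=float)[:, None]
    n = np.asarray(nonmember_scores, dtype=float)[None, :]
    return float((m > n).mean() + 0.5 * (m == n).mean())

# Toy example: members (seen in training) show low loss, non-members high loss.
auc = attack_auc(mia_scores([0.1, 0.3, 0.2]), mia_scores([1.5, 0.9, 2.0]))
```

An AUC near 0.5 means the student leaks nothing about membership; the paper's stronger attacks additionally query the model on inputs whose predictions are highly influenced by the private training data rather than on the training points themselves.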
Is a secure AI assistant possible?
AI agents are a risky business. Even when stuck inside the chatbox window, LLMs will make mistakes and behave badly. Once they have tools that they can use to interact with the outside world, such as web browsers and email addresses, the consequences of those mistakes become far more serious. That might explain why the first breakthrough LLM personal assistant came not from one of the major AI labs, which have to worry about reputation and liability, but from an independent software engineer, Peter Steinberger. In November of 2025, Steinberger uploaded his tool, now called OpenClaw, to GitHub, and in late January the project went viral.
Sleeper Agent: Scalable Hidden Trigger Backdoors for Neural Networks Trained from Scratch
As the curation of data for machine learning becomes increasingly automated, dataset tampering is a mounting threat. Backdoor attackers tamper with training data to embed a vulnerability in models that are trained on that data. This vulnerability is then activated at inference time by placing a "trigger" into the model's input. Typical backdoor attacks insert the trigger directly into the training data, although the presence of such an attack may be visible upon inspection. In contrast, the Hidden Trigger Backdoor Attack achieves poisoning without placing a trigger into the training data at all. However, this hidden trigger attack is ineffective at poisoning neural networks trained from scratch. We develop a new hidden trigger attack, Sleeper Agent, which employs gradient matching, data selection, and target model re-training during the crafting process. Sleeper Agent is the first hidden trigger backdoor attack to be effective against neural networks trained from scratch. We demonstrate its effectiveness on ImageNet and in black-box settings.
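The gradient-matching objective can be sketched in isolation: poisons are crafted so that the training gradient they induce aligns with the gradient of the attacker's adversarial objective. The toy vectors below stand in for real network gradients (an illustrative assumption; the actual attack differentiates a training loss through a periodically re-trained surrogate model):

```python
import numpy as np

def cosine_alignment(g_poison, g_adv):
    """Cosine similarity between the training gradient induced by the
    poisoned examples and the attacker's adversarial gradient."""
    g_poison = np.asarray(g_poison, dtype=float)
    g_adv = np.asarray(g_adv, dtype=float)
    return float(g_poison @ g_adv /
                 (np.linalg.norm(g_poison) * np.linalg.norm(g_adv)))

def matching_loss(g_poison, g_adv):
    """Objective minimized while crafting poisons: 1 - cosine alignment.
    Perfectly aligned gradients give loss 0; orthogonal ones give 1."""
    return 1.0 - cosine_alignment(g_poison, g_adv)
```

In the full attack this loss is minimized over small perturbations to the selected training images, with the target model re-trained during crafting so that the alignment survives training from scratch.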
Security News This Week: Jeffrey Epstein Had a 'Personal Hacker,' Informant Claims
Plus: AI agent OpenClaw gives cybersecurity experts the willies, China executes 11 scam compound bosses, a $40 million crypto theft has an unexpected alleged culprit, and more. As the standoff between the United States government and Minnesota continues this week over immigration enforcement operations that have essentially occupied the Twin Cities and other parts of the state, a federal judge delayed a decision this week and ordered a new briefing on whether the Department of Homeland Security is using armed raids to pressure Minnesota into abandoning its sanctuary policies for immigrants. Meanwhile, minutes after a federal immigration officer shot and killed 37-year-old Alex Pretti in Minneapolis last Saturday, Trump administration officials and right-wing influencers had already mounted a smear campaign, calling Pretti a "terrorist" and a "lunatic." As part of its surveillance dragnet, Immigration and Customs Enforcement has been using an AI-powered Palantir system since last spring to summarize tips sent to its tip line, according to a newly released Homeland Security document. DHS immigration agents have also been using the now notorious face recognition app Mobile Fortify to scan the faces of countless people in the US, including many citizens.