target arm
- North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
- North America > United States > California (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- Asia > China > Hong Kong (0.04)
- Research Report (0.34)
- Overview (0.34)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.47)
- North America > Canada > Quebec > Montreal (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.94)
- North America > Canada > Quebec > Montreal (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.91)
- North America > United States (0.04)
- North America > Canada (0.04)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.93)
- Information Technology > Data Science (0.92)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.68)
- North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
- North America > United States > California (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- Asia > China > Hong Kong (0.04)
- Research Report (0.34)
- Overview (0.34)
- Information Technology > Security & Privacy (1.00)
- Government > Military (1.00)
- Education > Educational Setting > Online (0.41)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.94)
Practical Adversarial Attacks on Stochastic Bandits via Fake Data Injection
Zeng, Qirun, He, Eric, Hoffmann, Richard, Wang, Xuchuang, Zuo, Jinhang
Adversarial attacks on stochastic bandits have traditionally relied on some unrealistic assumptions, such as per-round reward manipulation and unbounded perturbations, limiting their relevance to real-world systems. We propose a more practical threat model, Fake Data Injection, which reflects realistic adversarial constraints: the attacker can inject only a limited number of bounded fake feedback samples into the learner's history, simulating legitimate interactions. We design efficient attack strategies under this model, explicitly addressing both magnitude constraints (on reward values) and temporal constraints (on when and how often data can be injected). Our theoretical analysis shows that these attacks can mislead both Upper Confidence Bound (UCB) and Thompson Sampling algorithms into selecting a target arm in nearly all rounds while incurring only sublinear attack cost. Experiments on synthetic and real-world datasets validate the effectiveness of our strategies, revealing significant vulnerabilities in widely used stochastic bandit algorithms under practical adversarial scenarios.
- North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
- North America > United States > California (0.04)
- Asia > China > Hong Kong (0.04)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.66)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.46)
Stealthy Adversarial Attacks on Stochastic Multi-Armed Bandits
Wang, Zhiwei, Wang, Huazheng, Wang, Hongning
Adversarial attacks against stochastic multi-armed bandit (MAB) algorithms have been extensively studied in the literature. In this work, we focus on reward poisoning attacks and find most existing attacks can be easily detected by our proposed detection method based on the test of homogeneity, due to their aggressive nature in reward manipulations. This motivates us to study the notion of stealthy attack against stochastic MABs and investigate the resulting attackability. Our analysis shows that against two popularly employed MAB algorithms, UCB1 and $\epsilon$-greedy, the success of a stealthy attack depends on the environmental conditions and the realized reward of the arm pulled in the first round. We also analyze the situation for general MAB algorithms equipped with our attack detection method and find that it is possible to have a stealthy attack that almost always succeeds. This brings new insights into the security risks of MAB algorithms.
- North America > United States > Oregon (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Security & Privacy (1.00)
- Government > Military (1.00)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.89)
- Information Technology > Data Science > Data Mining > Big Data (0.85)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.67)
Adversarial Attacks on Cooperative Multi-agent Bandits
Zuo, Jinhang, Zhang, Zhiyao, Wang, Xuchuang, Chen, Cheng, Li, Shuai, Lui, John C. S., Hajiesmaili, Mohammad, Wierman, Adam
Cooperative multi-agent multi-armed bandits (CMA2B) consider the collaborative efforts of multiple agents in a shared multi-armed bandit game. We study latent vulnerabilities exposed by this collaboration and consider adversarial attacks on a few agents with the goal of influencing the decisions of the rest. More specifically, we study adversarial attacks on CMA2B in both homogeneous settings, where agents operate with the same arm set, and heterogeneous settings, where agents have distinct arm sets. In the homogeneous setting, we propose attack strategies that, by targeting just one agent, convince all agents to select a particular target arm $T-o(T)$ times while incurring $o(T)$ attack costs in $T$ rounds. In the heterogeneous setting, we prove that a target arm attack requires linear attack costs and propose attack strategies that can force a maximum number of agents to suffer linear regrets while incurring sublinear costs and only manipulating the observations of a few target agents. Numerical experiments validate the effectiveness of our proposed attack strategies.
- North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
- North America > United States > California (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- Asia > China > Hong Kong (0.04)
- Information Technology > Security & Privacy (1.00)
- Government > Military (1.00)