Efficient Action Poisoning Attacks on Linear Contextual Bandits
Multi-armed bandits (MABs), a popular framework for sequential decision making, have been widely investigated and have many applications in a variety of scenarios [1, 2, 3]. The contextual bandit model extends the multi-armed bandit model with contextual information: at each round, the reward depends on both the arm (a.k.a. action) and the context, whereas the reward in stochastic MABs depends only on the arm. Contextual bandit algorithms have a broad range of applications, such as recommender systems [4] and wireless networks [5]. In modern industry-scale applications of bandit algorithms, action decisions, reward signal collection, and policy iterations are normally implemented in a distributed network.
Dec-10-2021
- Country:
  - North America
    - United States
      - New York
        - Richmond County > New York City (0.04)
        - Queens County > New York City (0.04)
        - New York County > New York City (0.04)
        - Kings County > New York City (0.04)
        - Bronx County > New York City (0.04)
      - Georgia > Fulton County
        - Atlanta (0.04)
      - California
        - Yolo County > Davis (0.04)
        - Los Angeles County
          - Los Angeles (0.14)
          - Long Beach (0.04)
    - Canada > Quebec
      - Montreal (0.04)
  - Europe > France
    - Hauts-de-France > Nord > Lille (0.04)
- Genre:
  - Research Report (0.64)
- Industry:
  - Information Technology > Security & Privacy (0.96)
- Technology: