Efficient Action Poisoning Attacks on Linear Contextual Bandits

Dec-10-2021–arXiv.org Machine Learning

Multi-armed bandits (MABs), a popular framework for sequential decision making, have been widely investigated and have many applications in a variety of scenarios [1, 2, 3]. The contextual bandits model extends the multi-armed bandits model with contextual information. At each round, the reward is associated with both the arm (a.k.a. action) and the context, whereas the reward in stochastic MABs is associated only with the arm. Contextual bandit algorithms have a broad range of applications, such as recommender systems [4] and wireless networks [5]. In modern industry-scale applications of bandit algorithms, action decisions, reward signal collection, and policy iterations are normally implemented in a distributed network.
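
The distinction between the two reward models can be illustrated with a minimal Python sketch. It assumes a linear reward structure (suggested by the title, but not spelled out in this excerpt), Gaussian noise, and hypothetical names and values (`ARM_MEANS`, `THETAS`, `best_arm`); it is not the paper's algorithm or attack.

```python
import random

# Hypothetical values, chosen only for illustration.
ARM_MEANS = [0.2, 0.8]                 # stochastic MAB: one mean per arm
THETAS = [[1.0, 0.0], [0.0, 1.0]]      # contextual: one parameter vector per arm


def mab_reward(arm, arm_means, rng, noise_std=0.1):
    """Stochastic MAB: the expected reward depends on the arm only."""
    return arm_means[arm] + rng.gauss(0.0, noise_std)


def linear_contextual_reward(arm, context, thetas, rng, noise_std=0.1):
    """Linear contextual bandit (assumed form): mean reward is <context, theta_arm>."""
    mean = sum(x * t for x, t in zip(context, thetas[arm]))
    return mean + rng.gauss(0.0, noise_std)


def best_arm(context, thetas):
    """Arm with the highest expected reward for the given context."""
    means = [sum(x * t for x, t in zip(context, theta)) for theta in thetas]
    return max(range(len(thetas)), key=lambda a: means[a])


if __name__ == "__main__":
    rng = random.Random(0)
    for context in ([1.0, 0.0], [0.0, 1.0]):
        arm = best_arm(context, THETAS)
        reward = linear_contextual_reward(arm, context, THETAS, rng)
        print(f"context={context} best_arm={arm} reward={reward:.3f}")
```

In this sketch the best arm changes with the context, whereas in the stochastic MAB the arm with the largest mean is best at every round.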

data mining, machine learning, reinforcement learning

Country:
- North America
  - United States
    - New York
      - Richmond County > New York City (0.04)
      - Queens County > New York City (0.04)
      - New York County > New York City (0.04)
      - Kings County > New York City (0.04)
      - Bronx County > New York City (0.04)
    - Georgia > Fulton County
      - Atlanta (0.04)
    - California
      - Yolo County > Davis (0.04)
      - Los Angeles County
        - Los Angeles (0.14)
        - Long Beach (0.04)
  - Canada > Quebec
    - Montreal (0.04)
- Europe > France
  - Hauts-de-France > Nord > Lille (0.04)

Genre:
- Research Report (0.64)

Industry:
- Information Technology > Security & Privacy (0.96)

Technology:
- Information Technology
  - Communications (1.00)
  - Security & Privacy (0.96)
  - Data Science > Data Mining
    - Big Data (0.76)
  - Artificial Intelligence > Machine Learning
    - Reinforcement Learning (1.00)