AITopics | roper

We study nonparametric contextual bandits where Lipschitz mean reward functions may change over time. We first establish the minimax dynamic regret rate in this less-understood setting in terms of the number of changes $L$ and the total variation $V$, both capturing all changes in distribution over context space, and argue that state-of-the-art procedures are suboptimal in this setting. Next, we turn to the question of adaptivity for this setting, i.e. achieving the minimax rate without knowledge of $L$ or $V$. Quite importantly, we posit that the bandit problem, viewed locally at a given context $X_t$, should not be affected by reward changes in other parts of the context space $\mathcal{X}$. We therefore propose a notion of change, which we term experienced significant shifts, that better accounts for locality and thus counts considerably fewer changes than $L$ and $V$. Furthermore, similar to recent work on non-stationary MAB (Suk & Kpotufe, 2022), experienced significant shifts only count the most significant changes in mean rewards, e.g., severe best-arm changes relevant to observed contexts. Our main result is to show that this more tolerant notion of change can in fact be adapted to.

base-alg, data mining, machine learning, (20 more...)

arXiv.org Machine Learning

2307.05341

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.54)

The Strangely Believable Tale of a Mythical Rogue Drone

WIRED, Jun-8-2023, 16:00:00 GMT

Did you hear about the Air Force AI drone that went rogue and attacked its operators inside a simulation? The cautionary tale was told by Colonel Tucker Hamilton, chief of AI test and operations at the US Air Force, during a speech at an aerospace and defense event in London late last month. It apparently involved taking the kind of learning algorithm that has been used to train computers to play video games and board games like Chess and Go and using it to train a drone to hunt and destroy surface-to-air missiles. "At times, the human operator would tell it not to kill that threat, but it got its points by killing that threat," Hamilton was widely reported as telling the audience in London. It sounds like just the sort of thing AI experts have begun warning that increasingly clever and maverick algorithms might do.

artificial intelligence, machine learning, natural language, (18 more...)

WIRED

Industry:

Leisure & Entertainment > Games (1.00)
Government > Military > Air Force (0.83)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.40)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.34)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.33)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.33)

When Can We Track Significant Preference Shifts in Dueling Bandits?

Tracking Most Significant Shifts in Nonparametric Contextual Bandits