ar-ucb
Autoregressive Bandits
Bacchiocchi, Francesco, Genalti, Gianmarco, Maran, Davide, Mussi, Marco, Restelli, Marcello, Gatti, Nicola, Metelli, Alberto Maria
Autoregressive processes naturally arise in a large variety of real-world scenarios, including, e.g., stock markets, sales forecasting, weather prediction, advertising, and pricing. When addressing a sequential decision-making problem in such a context, the temporal dependence between consecutive observations should be properly accounted for to converge to the optimal decision policy. In this work, we propose a novel online learning setting, named Autoregressive Bandits (ARBs), in which the observed reward follows an autoregressive process of order k, whose parameters depend on the action the agent chooses. In this paper, we model the reward of a sequential decision-making problem as an AR process where its parameters depend on the action selected by the agent at every round. This scenario can be regarded as an extension of the multi-armed bandit (MAB, Lattimore & Szepesvári, 2020) problem, in which an AR process governs the temporal structure of the observed rewards through the action-dependent AR parameters, which are unknown to the agent. It is worth mentioning that such a scenario displays notable differences compared to more traditional non-stationary MABs (Gur et al., 2014). Indeed, in the presented scenario, we can exploit the knowledge that the underlying process is AR and, more importantly, that such a dynamic depends on the agent's action.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Italy > Lombardy > Milan (0.04)
- Asia > Middle East > Jordan (0.04)
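The ARB reward model described above can be illustrated with a small simulation. This is a minimal sketch, not the authors' implementation: the arm count, AR order k = 2, the parameter values in `params`, the noise scale, and the uniform placeholder policy are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy instance: 2 actions, AR order k = 2. Each action a has its own
# parameters (gamma_0, gamma_1, gamma_2), and the reward evolves as
#   x_t = gamma_0(a_t) + gamma_1(a_t) * x_{t-1} + gamma_2(a_t) * x_{t-2} + noise.
params = {
    0: np.array([0.5, 0.4, 0.1]),  # action 0: (gamma_0, gamma_1, gamma_2)
    1: np.array([0.3, 0.2, 0.3]),  # action 1
}
k = 2

def step(history, action, noise_scale=0.01):
    """Next reward from the last k rewards and the chosen action."""
    g = params[action]
    # history holds [x_{t-2}, x_{t-1}]; reverse so gamma_1 meets x_{t-1}.
    return g[0] + g[1:] @ history[::-1] + noise_scale * rng.standard_normal()

history = np.zeros(k)  # initial rewards x_{-1} = x_{-2} = 0
rewards = []
for t in range(100):
    a = rng.integers(2)                  # uniform placeholder policy
    x = step(history, a)
    rewards.append(x)
    history = np.append(history[1:], x)  # slide the length-k AR window
```

Because the AR coefficients of each action sum to less than one, the process is stable and rewards stay bounded; an agent that learns the action-dependent parameters can exploit exactly this temporal structure.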
Adaptive Algorithms for Multi-armed Bandit with Composite and Anonymous Feedback
Wang, Siwei, Wang, Haoyun, Huang, Longbo
We study the multi-armed bandit (MAB) problem with composite and anonymous feedback. In this model, the reward of pulling an arm spreads over a period of time (we call this period the reward interval), and the player successively receives partial rewards of the action, convoluted with rewards from pulling other arms. Existing results on this model require prior knowledge about the reward interval size as an input to their algorithms. In this paper, we propose adaptive algorithms for both the stochastic and the adversarial cases, without requiring any prior information about the reward interval. For the stochastic case, we prove that our algorithm guarantees a regret that matches the lower bounds (in order). For the adversarial case, we propose the first algorithm to jointly handle a non-oblivious adversary and an unknown reward interval size. We also conduct simulations based on a real-world dataset. The results show that our algorithms outperform existing benchmarks.
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.84)
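The composite and anonymous feedback model can be sketched as a short simulation. This is an illustrative toy, not the paper's algorithm: the interval size `d`, the Bernoulli arm means, and the uniform split of each reward over the interval are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

d = 3               # reward-interval size (unknown to the learner in the paper)
T = 50              # number of rounds
means = [0.2, 0.7]  # toy Bernoulli mean rewards per arm

# pending[t] accumulates all reward pieces that arrive at round t.
pending = np.zeros(T + d)
observed = []

for t in range(T):
    arm = rng.integers(len(means))
    r = rng.binomial(1, means[arm])  # full reward generated by this pull
    # Composite: the reward spreads over the next d rounds
    # (here split uniformly, one of many possible spread patterns).
    for j in range(d):
        pending[t + j] += r / d
    # Anonymous: the learner observes only the aggregate at round t,
    # with no attribution to which past pulls produced each piece.
    observed.append(pending[t])
```

The learner never sees `r` or `arm`-level attribution, only the convoluted stream `observed`, which is what makes credit assignment hard and why existing algorithms needed `d` as an input.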