Maximum Average Randomly Sampled: A Scale Free and Non-parametric Algorithm for Stochastic Bandits

Neural Information Processing Systems

Upper Confidence Bound (UCB) methods are among the most effective approaches to the exploration-exploitation trade-off in online decision-making problems. The confidence bounds used in UCB methods are typically constructed from concentration inequalities that depend on a scale parameter (e.g., a bound on the payoffs, a variance, or a subgaussian parameter) which must be known in advance. The need to know a scale parameter a priori, together with the fact that the confidence bounds use only tail information, can degrade the performance of UCB methods. Here we propose a data-dependent UCB algorithm called MARS (Maximum Average Randomly Sampled) in a non-parametric setup for multi-armed bandits with symmetric rewards. The algorithm does not depend on any scale parameter, and its data-dependent upper confidence bound is constructed from the maximum average of randomly sampled rewards, inspired by the work of Hartigan in the 1960s and 70s. A regret bound for the multi-armed bandit problem is derived under the same assumptions as for the $\psi$-UCB method, without incorporating any correction factors. The method is illustrated and compared with baseline algorithms in numerical experiments.
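To make the idea concrete, here is a minimal sketch of a UCB loop whose index is a data-dependent bound computed as the maximum average over random subsamples of an arm's observed rewards. This is an illustrative approximation of the MARS idea, not the paper's exact construction; the function names, the subsample scheme, and the parameter `num_subsamples` are assumptions made for exposition.

```python
import random


def mars_style_bound(rewards, num_subsamples=16, rng=random):
    # Illustrative data-dependent index: the maximum of the full-sample
    # average and the averages of several uniformly random subsamples.
    # The exact MARS construction differs; see the paper.
    best = sum(rewards) / len(rewards)
    n = len(rewards)
    for _ in range(num_subsamples):
        k = rng.randint(1, n)                  # random subsample size
        sample = rng.sample(rewards, k)        # subsample without replacement
        best = max(best, sum(sample) / len(sample))
    return best


def ucb_loop(pull, horizon, num_arms, rng=random):
    # Generic UCB skeleton: pull each arm once, then always pull the arm
    # with the largest data-dependent upper confidence bound.
    rewards = [[pull(a)] for a in range(num_arms)]
    for _ in range(horizon - num_arms):
        bounds = [mars_style_bound(r, rng=rng) for r in rewards]
        a = max(range(num_arms), key=lambda i: bounds[i])
        rewards[a].append(pull(a))
    return rewards
```

Note that, unlike classical UCB, nothing here requires a known payoff range or subgaussian constant: the bound is computed entirely from the observed rewards, which is the scale-free property the abstract emphasizes.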


Supplement to " Maximum Average Randomly Sampled: A Scale Free and Non-parametric Algorithm for Stochastic Bandits "

Neural Information Processing Systems

The following lemma, given in [2], is useful for the proof of Theorem 1. Lemma 1 ([2]). Given a stochastic matrix H = 0 0 0 h The following propositions are used to prove this theorem. In this case, there are not enough observations to achieve an upper confidence bound using Proposition 2. The randomized UCB for this case also has an exact confidence level, as illustrated below: Pr{UCB In the second equality, the boundedness of the means of the arms and Proposition 1 were utilized. The steps in this proof closely follow the proof of Theorem 7.1 in [3]. Let us define a 'good' event as G We are going to show 1. The next step is to bound the probability of the second set in (3).
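The good-event argument referenced above follows the standard pattern of Theorem 7.1 in [3]. As a hedged illustration (the symbols $u_i$ and the confidence width are assumed for exposition, not taken from the supplement), the decomposition typically reads:

```latex
% Illustrative good-event decomposition in the style of Theorem 7.1 of [3];
% u_i and the confidence width are expository assumptions.
G_i = \Bigl\{ \mu_1 < \min_{t \le T} \mathrm{UCB}_1(t) \Bigr\}
      \cap \Bigl\{ \hat{\mu}_{i,u_i} + (\text{confidence width at } u_i \text{ samples}) < \mu_1 \Bigr\},
\qquad
\mathbb{E}[T_i(T)] \le u_i + T \, \Pr\{G_i^c\}.
```

On the event $G_i$, suboptimal arm $i$ is played at most $u_i$ times, since after $u_i$ samples its upper confidence bound stays below the optimal arm's; bounding $\Pr\{G_i^c\}$ then yields the regret bound.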

