AITopics

2503.04262

Country:

Asia (0.14)
Europe > United Kingdom (0.14)

Genre:

Research Report > New Finding (0.34)
Research Report > Promising Solution (0.34)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

arXiv.org Artificial IntelligenceNov-6-2024

Fair Exploration and Exploitation

Pasteris, Stephen, Hicks, Chris, Mavroudis, Vasilios

In this paper we consider the contextual bandit problem with a finite (or infinite and clustered) context set. We consider the fully adversarial problem in which, apart from having bounded losses, there are no assumptions whatsoever on the generation of the contexts and losses. In our problem we assume that the context set is partitioned into a set of protected groups. At the start of each trial we are given a probability distribution over the context set and are required (on that trial) to be fair with respect to that distribution, in that if the context (for that trial) was drawn from the distribution then our choice of action would be unbiased towards any protected group. We develop an algorithm FexEx for this problem which has remarkable efficiency, having a space and per-trial time complexity at most linear in the dimensionality of the policy space. FexEx can handle non-stationarity, in that its regret can be bounded with respect to any sequence of policies satisfying the fairness constraints. For such a sequence the regret bound of FexEx is essentially the same as that of running Exp3.S for each context independently (an approach that does not satisfy the fairness constraints).

artificial intelligence, data mining, machine learning, (18 more...)

2411.04295

Country: Europe > United Kingdom (0.28)

Genre: Research Report (0.50)

Industry:

Education > Educational Setting > Online (0.46)
Energy > Oil & Gas > Upstream (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.66)

arXiv.org Machine LearningMay-31-2024

Online Convex Optimisation: The Optimal Switching Regret for all Segmentations Simultaneously

Pasteris, Stephen, Hicks, Chris, Mavroudis, Vasilios, Herbster, Mark

We consider the classic problem of online convex optimisation. Whereas the notion of static regret is relevant for stationary problems, the notion of switching regret is more appropriate for non-stationary problems. A switching regret is defined relative to any segmentation of the trial sequence, and is equal to the sum of the static regrets of each segment. In this paper we show that, perhaps surprisingly, we can achieve the asymptotically optimal switching regret on every possible segmentation simultaneously. Our algorithm for doing so is very efficient: having a space and per-trial time complexity that is logarithmic in the time-horizon. Our algorithm also obtains novel bounds on its dynamic regret: being adaptive to variations in the rate of change of the comparator sequence.

artificial intelligence, inductive hypothesis, machine learning, (13 more...)

2405.20824

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.94)

arXiv.org Artificial IntelligenceMar-4-2024

Fusion Encoder Networks

Pasteris, Stephen, Hicks, Chris, Mavroudis, Vasilios

The resulting neural network has only logarithmic depth (alleviating the degradation of data as it propagates through the network) and can process sequences in linear time (or in logarithmic time with a linear number of processors). The crucial property of FENs is that they learn by training a quasi-linear number of constant-depth feed-forward neural networks in parallel. The fact that these networks have constant depth means that backpropagation works well. We note that currently the performance of FENs is only conjectured as we are yet to implement them.

artificial intelligence, machine learning, neural network, (16 more...)

2402.15883

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

arXiv.org Machine LearningFeb-22-2024

Bandits with Abstention under Expert Advice

Pasteris, Stephen, Rumi, Alberto, Thiessen, Maximilian, Saito, Shota, Miyauchi, Atsushi, Vitale, Fabio, Herbster, Mark

We study the classic problem of prediction with expert advice under bandit feedback. Our model assumes that one action, corresponding to the learner's abstention from play, has no reward or loss on every trial. We propose the CBA algorithm, which exploits this assumption to obtain reward bounds that can significantly improve those of the classical Exp4 algorithm. We can view our problem as the aggregation of confidence-rated predictors when the learner has the option of abstention from play. Importantly, we are the first to achieve bounds on the expected cumulative reward for general confidence-rated predictors. In the special case of specialists we achieve a novel reward bound, significantly improving previous bounds of SpecialistExp (treating abstention as another action). As an example application, we discuss learning unions of balls in a finite metric space. In this contextual setting, we devise an efficient implementation of CBA, reducing the runtime from quadratic to almost linear in the number of contexts. Preliminary experiments show that CBA improves over existing bandit algorithms.

artificial intelligence, data mining, machine learning, (20 more...)

2402.14585

Country: Europe > Italy (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

arXiv.org Artificial IntelligenceDec-29-2023

Adversarial Online Collaborative Filtering

Pasteris, Stephen, Vitale, Fabio, Herbster, Mark, Gentile, Claudio, Panisson, Andre'

We investigate the problem of online collaborative filtering under no-repetition constraints, whereby users need to be served content in an online fashion and a given user cannot be recommended the same content item more than once. We start by designing and analyzing an algorithm that works under biclustering assumptions on the user-item preference matrix, and show that this algorithm exhibits an optimal regret guarantee, while being fully adaptive, in that it is oblivious to any prior knowledge about the sequence of users, the universe of items, as well as the biclustering parameters of the preference matrix. We then propose a more robust version of this algorithm which operates with general matrices. Also this algorithm is parameter free, and we prove regret guarantees that scale with the amount by which the preference matrix deviates from a biclustered structure. To our knowledge, these are the first results on online collaborative filtering that hold at this level of generality and adaptivity under no-repetition constraints. Finally, we complement our theoretical findings with simple experiments on real-world datasets aimed at both validating the theory and empirically comparing to standard baselines. This comparison shows the competitive advantage of our approach over these baselines.

algorithm, artificial intelligence, social media, (16 more...)

2302.05765

Country:

North America > United States (0.14)
Europe > United Kingdom > England (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)

arXiv.org Machine LearningDec-14-2023

A Hierarchical Nearest Neighbour Approach to Contextual Bandits

Pasteris, Stephen, Hicks, Chris, Mavroudis, Vasilios

In this paper we consider the adversarial contextual bandit problem in metric spaces. The paper "Nearest neighbour with bandit feedback" tackled this problem but when there are many contexts near the decision boundary of the comparator policy it suffers from a high regret. In this paper we eradicate this problem, designing an algorithm in which we can hold out any set of contexts when computing our regret term. Our algorithm builds on that of "Nearest neighbour with bandit feedback" and hence inherits its extreme computational efficiency.

artificial intelligence, data mining, machine learning, (14 more...)

2312.09332

Genre: Research Report (0.40)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.35)

arXiv.org Artificial IntelligenceNov-10-2023

Sum-max Submodular Bandits

Pasteris, Stephen, Rumi, Alberto, Vitale, Fabio, Cesa-Bianchi, Nicolò

Many online decision-making problems correspond to maximizing a sequence of submodular functions. In this work, we introduce sum-max functions, a subclass of monotone submodular functions capturing several interesting problems, including best-of-$K$-bandits, combinatorial bandits, and the bandit versions on facility location, $M$-medians, and hitting sets. We show that all functions in this class satisfy a key property that we call pseudo-concavity. This allows us to prove $\big(1 - \frac{1}{e}\big)$-regret bounds for bandit feedback in the nonstochastic setting of the order of $\sqrt{MKT}$ (ignoring log factors), where $T$ is the time horizon and $M$ is a cardinality constraint. This bound, attained by a simple and efficient algorithm, significantly improves on the $\widetilde{O}\big(T^{2/3}\big)$ regret bound for online monotone submodular maximization with bandit feedback.

artificial intelligence, data mining, machine learning, (15 more...)

2311.05975

Country:

Europe > Italy (0.14)
Europe > United Kingdom (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)
Information Technology > Data Science > Data Mining > Big Data (0.46)

arXiv.org Artificial IntelligenceAug-2-2023

Nearest Neighbour with Bandit Feedback

Pasteris, Stephen, Hicks, Chris, Mavroudis, Vasilios

In this paper we adapt the nearest neighbour rule to the contextual bandit problem. Our algorithm handles the fully adversarial setting in which no assumptions at all are made about the data-generation process. When combined with a sufficiently fast data-structure for (perhaps approximate) adaptive nearest neighbour search, such as a navigating net, our algorithm is extremely efficient - having a per trial running time polylogarithmic in both the number of trials and actions, and taking only quasi-linear space.

artificial intelligence, data mining, machine learning, (19 more...)

2306.13773

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.67)

arXiv.org Machine LearningAug-16-2020

Online Multitask Learning with Long-Term Memory

Herbster, Mark, Pasteris, Stephen, Tse, Lisa

We introduce a novel online multitask setting. In this setting each task is partitioned into a sequence of segments that is unknown to the learner. Associated with each segment is a hypothesis from some hypothesis class. We give algorithms that are designed to exploit the scenario where there are many such segments but significantly fewer associated hypotheses. We prove regret bounds that hold for any segmentation of the tasks and any association of hypotheses to the segments. In the single-task setting this is equivalent to switching with long-term memory in the sense of [Bousquet and Warmuth; 2003]. We provide an algorithm that predicts on each trial in time linear in the number of hypotheses when the hypothesis class is finite. We also consider infinite hypothesis classes from reproducing kernel Hilbert spaces for which we give an algorithm whose per trial time complexity is cubic in the number of cumulative trials. In the single-task special case this is the first example of an efficient regret-bounded switching algorithm with long-term memory for a non-parametric hypothesis class.

algorithm, artificial intelligence, neural network, (20 more...)

2008.07055

Country:

North America > Canada > Quebec (0.14)
Europe > United Kingdom > England (0.14)
North America > United States > Oregon (0.14)
North America > United States > California (0.14)

Genre:

Research Report (0.63)
Overview (0.45)
Instructional Material > Online (0.40)

Industry: Government (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)