Herbster, Mark
Online Convex Optimisation: The Optimal Switching Regret for all Segmentations Simultaneously
Pasteris, Stephen, Hicks, Chris, Mavroudis, Vasilios, Herbster, Mark
We consider the classic problem of online convex optimisation. Whereas the notion of static regret is relevant for stationary problems, the notion of switching regret is more appropriate for non-stationary problems. A switching regret is defined relative to any segmentation of the trial sequence, and is equal to the sum of the static regrets of each segment. In this paper we show that, perhaps surprisingly, we can achieve the asymptotically optimal switching regret on every possible segmentation simultaneously. Our algorithm for doing so is very efficient, with space and per-trial time complexity logarithmic in the time horizon. It also obtains novel bounds on its dynamic regret, adapting to variations in the rate of change of the comparator sequence.
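As a concrete illustration of the definition above (not the paper's algorithm), the following sketch computes the switching regret of a loss sequence against a finite set of fixed comparators, for one hypothetical segmentation; all names and data are illustrative.

```python
import numpy as np

def switching_regret(learner_loss, comparator_loss, boundaries):
    """Switching regret of a learner relative to a segmentation.

    learner_loss    : (T,) per-trial losses of the learner
    comparator_loss : (K, T) per-trial losses of K fixed comparators
    boundaries      : sorted trial indices where new segments begin (0 included)
    """
    T = len(learner_loss)
    segs = list(boundaries) + [T]
    regret = 0.0
    for s, e in zip(segs[:-1], segs[1:]):
        # static regret on segment [s, e): learner's loss minus the best fixed comparator's
        regret += learner_loss[s:e].sum() - comparator_loss[:, s:e].sum(axis=1).min()
    return regret

rng = np.random.default_rng(0)
L = rng.random(100)                          # hypothetical learner losses
C = rng.random((5, 100))                     # hypothetical comparator losses
print(switching_regret(L, C, [0, 40, 70]))   # segmentation {0..39, 40..69, 70..99}
```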
Bandits with Abstention under Expert Advice
Pasteris, Stephen, Rumi, Alberto, Thiessen, Maximilian, Saito, Shota, Miyauchi, Atsushi, Vitale, Fabio, Herbster, Mark
We study the classic problem of prediction with expert advice under bandit feedback. Our model assumes that one action, corresponding to the learner's abstention from play, has no reward or loss on every trial. We propose the CBA algorithm, which exploits this assumption to obtain reward bounds that can significantly improve those of the classical Exp4 algorithm. We can view our problem as the aggregation of confidence-rated predictors when the learner has the option of abstention from play. Importantly, we are the first to achieve bounds on the expected cumulative reward for general confidence-rated predictors. In the special case of specialists we achieve a novel reward bound, significantly improving previous bounds of SpecialistExp (treating abstention as another action). As an example application, we discuss learning unions of balls in a finite metric space. In this contextual setting, we devise an efficient implementation of CBA, reducing the runtime from quadratic to almost linear in the number of contexts. Preliminary experiments show that CBA improves over existing bandit algorithms.
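For orientation, here is a minimal sketch of the classical Exp4 baseline that the abstract compares against, with abstention treated as an extra action of reward zero; the CBA algorithm itself is not reproduced here, and all parameters and the reward model below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
K, N, T, eta = 4, 6, 1000, 0.05   # actions (last one = abstain), experts, trials, rate

w = np.ones(N)                                    # expert weights
total = 0.0
for t in range(T):
    advice = rng.dirichlet(np.ones(K), size=N)    # hypothetical expert advice, one row each
    p = (w / w.sum()) @ advice                    # mixture distribution over actions
    a = rng.choice(K, p=p)
    reward = 0.0 if a == K - 1 else rng.random()  # abstention always yields reward 0
    total += reward
    r_hat = np.zeros(K)
    r_hat[a] = reward / p[a]                      # importance-weighted reward estimate
    w *= np.exp(eta * (advice @ r_hat))           # exponential-weights update per expert
    w /= w.sum()                                  # renormalise for numerical stability
print(f"cumulative reward: {total:.1f}")
```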
Adversarial Online Collaborative Filtering
Pasteris, Stephen, Vitale, Fabio, Herbster, Mark, Gentile, Claudio, Panisson, André
We investigate the problem of online collaborative filtering under no-repetition constraints, whereby users need to be served content in an online fashion and a given user cannot be recommended the same content item more than once. We start by designing and analyzing an algorithm that works under biclustering assumptions on the user-item preference matrix, and show that this algorithm exhibits an optimal regret guarantee while being fully adaptive, in that it is oblivious to any prior knowledge about the sequence of users, the universe of items, as well as the biclustering parameters of the preference matrix. We then propose a more robust version of this algorithm which operates with general matrices. This algorithm is also parameter-free, and we prove regret guarantees that scale with the amount by which the preference matrix deviates from a biclustered structure. To our knowledge, these are the first results on online collaborative filtering that hold at this level of generality and adaptivity under no-repetition constraints. Finally, we complement our theoretical findings with simple experiments on real-world datasets aimed at both validating the theory and empirically comparing to standard baselines. This comparison shows the competitive advantage of our approach over these baselines.
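The no-repetition protocol can be made concrete with a small sketch: a trivial baseline that recommends any not-yet-served item to the arriving user, against a hypothetical binary preference matrix. This illustrates only the interaction model, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(2)
n_users, n_items, T = 20, 50, 200
M = rng.random((n_users, n_items)) > 0.5     # hypothetical binary preference matrix

served = {u: set() for u in range(n_users)}  # items already recommended per user
mistakes = 0
for t in range(T):
    u = rng.integers(n_users)                            # adversary picks a user
    candidates = [i for i in range(n_items) if i not in served[u]]
    i = rng.choice(candidates)                           # baseline: any unserved item
    served[u].add(i)                                     # no-repetition constraint
    mistakes += int(not M[u, i])                         # a miss if the user dislikes it
print(f"mistakes: {mistakes}/{T}")
```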
Multi-class Graph Clustering via Approximated Effective $p$-Resistance
Saito, Shota, Herbster, Mark
This paper develops an approximation to the (effective) $p$-resistance and applies it to multi-class clustering. Spectral methods based on the graph Laplacian and its generalization to the graph $p$-Laplacian have been a backbone of non-Euclidean clustering techniques. The advantage of the $p$-Laplacian is that the parameter $p$ induces a controllable bias on cluster structure. The drawback of $p$-Laplacian eigenvector-based methods is that the third and higher eigenvectors are difficult to compute. Thus, instead, we are motivated to use the $p$-resistance induced by the $p$-Laplacian for clustering. For $p$-resistance, small $p$ biases towards clusters with high internal connectivity, while large $p$ biases towards clusters of small "extent," that is, a preference for smaller shortest-path distances between vertices in the cluster. However, the $p$-resistance is expensive to compute. We overcome this by developing an approximation to the $p$-resistance. We prove upper and lower bounds on this approximation and observe that it is exact when the graph is a tree. We also provide theoretical justification for the use of $p$-resistance for clustering. Finally, we provide experiments comparing our approximated $p$-resistance clustering to other $p$-Laplacian based methods.
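For the special case $p = 2$, the effective $p$-resistance reduces to the classical effective resistance, which is cheap to compute from the Laplacian pseudoinverse; a minimal sketch of that base case follows (the paper's approximation for general $p$ is not reproduced here).

```python
import numpy as np

def effective_resistance(A):
    """Pairwise effective resistances of a graph with adjacency matrix A.

    For p = 2 the effective p-resistance coincides with the classical
    effective resistance R(i, j) = (e_i - e_j)^T L^+ (e_i - e_j).
    """
    L = np.diag(A.sum(axis=1)) - A          # graph Laplacian
    Lp = np.linalg.pinv(L)                  # Moore-Penrose pseudoinverse
    d = np.diag(Lp)
    return d[:, None] + d[None, :] - 2 * Lp

# a path graph on 4 vertices: resistance between the endpoints is 3
# (three unit-resistance edges in series)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
print(effective_resistance(A)[0, 3])        # -> 3.0
```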
Online Multitask Learning with Long-Term Memory
Herbster, Mark, Pasteris, Stephen, Tse, Lisa
We introduce a novel online multitask setting. In this setting, each task is partitioned into a sequence of segments that is unknown to the learner. Associated with each segment is a hypothesis from some hypothesis class. We give algorithms that are designed to exploit the scenario where there are many such segments but significantly fewer associated hypotheses. We prove regret bounds that hold for any segmentation of the tasks and any association of hypotheses to the segments. In the single-task setting this is equivalent to switching with long-term memory in the sense of [Bousquet and Warmuth; 2003]. We provide an algorithm that predicts on each trial in time linear in the number of hypotheses when the hypothesis class is finite. We also consider infinite hypothesis classes from reproducing kernel Hilbert spaces, for which we give an algorithm whose per-trial time complexity is cubic in the number of trials so far. In the single-task special case this is the first example of an efficient regret-bounded switching algorithm with long-term memory for a non-parametric hypothesis class.
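As a rough illustration of switching with long-term memory, the sketch below implements a mixing-past-posteriors style update in the spirit of [Bousquet and Warmuth; 2003] for a finite hypothesis class; the learning rate, mixing parameter, and scheme details are illustrative rather than those analysed in the paper.

```python
import numpy as np

def mixing_past_posteriors(losses, eta=0.5, alpha=0.05):
    """Long-term-memory sketch: after each exponential-weights step, mix
    back towards the average of all past posteriors, so that hypotheses
    that were good in earlier segments can be recalled quickly.

    losses : (T, N) per-trial losses of N hypotheses
    """
    T, N = losses.shape
    w = np.ones(N) / N
    past = [w.copy()]
    for t in range(T):
        v = w * np.exp(-eta * losses[t])
        v /= v.sum()                                          # posterior (loss) update
        w = (1 - alpha) * v + alpha * np.mean(past, axis=0)   # mix in past posteriors
        past.append(v.copy())
    return w

rng = np.random.default_rng(3)
print(mixing_past_posteriors(rng.random((50, 4))))            # illustrative run
```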
Online Learning of Facility Locations
Pasteris, Stephen, He, Ting, Vitale, Fabio, Wang, Shiqiang, Herbster, Mark
In this paper we consider an online learning version of the facility location problem, where users need to be served one at a time in a sequence of trials. The goal is to select, at each trial, a subset of a given set of sites, and then pay a loss equal to their total "opening cost" plus the minimum "connection cost" for connecting the user to one of the sites in the subset. More precisely, we are given a set of N sites. At the beginning of each trial, an opening cost and a connection cost for the arriving user are associated with each site and are unknown. At each trial, the learner has to select a subset of sites and incurs a loss given by the minimum connection cost over the selected sites plus the sum of the opening costs of all selected sites. After each subset selection, the opening and connection costs of all sites are revealed. To solve this problem, we design and rigorously analyse an algorithm which belongs to the class of online learning algorithms that make use of the Exponentiated Gradient method [15]. We measure the performance of our method by comparing its cumulative loss with that of any fixed subset of sites.
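The per-trial loss is easy to state in code; the sketch below evaluates it for a hypothetical site selection and cost vectors (the Exponentiated Gradient learner itself is not reproduced).

```python
import numpy as np

def trial_loss(selected, opening, connection):
    """Per-trial loss from the abstract: total opening cost of the
    selected sites plus the minimum connection cost among them."""
    S = np.asarray(selected, dtype=bool)
    return opening[S].sum() + connection[S].min()

# hypothetical costs over N = 5 sites for one trial
opening    = np.array([0.1, 0.3, 0.2, 0.4, 0.1])
connection = np.array([0.9, 0.2, 0.5, 0.1, 0.8])
print(trial_loss([True, True, False, False, False], opening, connection))
# 0.4 opening + 0.2 connection = 0.6
```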
MaxHedge: Maximising a Maximum Online
Pasteris, Stephen, Vitale, Fabio, Chan, Kevin, Wang, Shiqiang, Herbster, Mark
We introduce a new online learning framework where, at each trial, the learner is required to select a subset of actions from a given known action set. Each action is associated with an energy value, a reward and a cost. The sum of the energies of the selected actions cannot exceed a given energy budget. The goal is to maximise the cumulative profit, where the profit obtained on a single trial is defined as the difference between the maximum reward among the selected actions and the sum of their costs. Action energy values and the budget are known and fixed. All rewards and costs associated with each action change over time and are revealed at each trial only after the learner's selection of actions. Our framework encompasses several online learning problems where the environment changes over time, and where the solution trades off minimising the costs against maximising the maximum reward of the selected subset of actions, while being constrained to an action energy budget. The algorithm that we propose is efficient and general in that it may be specialised to multiple natural online combinatorial problems.
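The per-trial profit and the energy-budget constraint can be sketched directly from the definitions above; all values below are illustrative and the proposed algorithm is not reproduced.

```python
import numpy as np

def profit(selected, rewards, costs):
    """Per-trial profit from the abstract: maximum reward among the
    selected actions minus the sum of their costs (0 if nothing selected)."""
    S = np.asarray(selected, dtype=bool)
    return 0.0 if not S.any() else rewards[S].max() - costs[S].sum()

def feasible(selected, energy, budget):
    """Check the fixed energy-budget constraint on a selection."""
    S = np.asarray(selected, dtype=bool)
    return energy[S].sum() <= budget

rewards = np.array([0.5, 0.9, 0.3])
costs   = np.array([0.1, 0.4, 0.05])
energy  = np.array([1.0, 2.0, 1.0])
sel = [True, True, False]
assert feasible(sel, energy, budget=3.0)
print(profit(sel, rewards, costs))   # 0.9 - (0.1 + 0.4) = 0.4
```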
Online Matrix Completion with Side Information
Herbster, Mark, Pasteris, Stephen, Tse, Lisa
We give an online algorithm and prove novel mistake and regret bounds for online binary matrix completion with side information. The bounds we prove are of the form $\tilde{\mathcal{O}}({\mathcal{D}}/{\gamma^2})$. The term ${1}/{\gamma^2}$ is analogous to the usual margin term in SVM (perceptron) bounds. More specifically, if we assume that there is some factorization of the underlying $m\times n$ matrix into $\mathbf{P} \mathbf{Q}^{\top}$ where the rows of $\mathbf{P}$ are interpreted as ``classifiers'' in $\Re^d$ and the rows of $\mathbf{Q}$ as ``instances'' in $\Re^d$, then $\gamma$ is the maximum (normalized) margin over all factorizations $\mathbf{P} \mathbf{Q}^{\top}$ consistent with the observed matrix. The quasi-dimension term $\mathcal{D}$ measures the quality of side information. In the presence of no side information, $\mathcal{D} = m+n$. However, if the side information is predictive of the underlying factorization of the matrix, then in the best case, $\mathcal{D} \in \mathcal{O}(k + \ell)$ where $k$ is the number of distinct row factors and $\ell$ is the number of distinct column factors. We additionally provide a generalization of our algorithm to the inductive setting. In this setting, the side information is not specified in advance. The results are similar to the transductive setting but in the best case, the quasi-dimension $\mathcal{D}$ is now bounded by $\mathcal{O}(k^2 + \ell^2)$.
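To make the margin concrete, the sketch below evaluates the (normalized) margin of a single candidate factorization $\mathbf{P} \mathbf{Q}^{\top}$ over a set of observed entries; $\gamma$ is the maximum of this quantity over all consistent factorizations, which the sketch does not attempt to compute.

```python
import numpy as np

def factorization_margin(P, Q, Y, observed):
    """Margin of one factorization P Q^T of a sign matrix Y: with the rows
    of P ("classifiers") and Q ("instances") normalized, it is the smallest
    value of Y_ij * <p_i, q_j> over the observed entries."""
    P = P / np.linalg.norm(P, axis=1, keepdims=True)
    Q = Q / np.linalg.norm(Q, axis=1, keepdims=True)
    G = P @ Q.T
    return min(Y[i, j] * G[i, j] for i, j in observed)

# a small consistent example, observed everywhere
P = np.array([[1.0, 0.0], [-1.0, 0.0]])
Q = np.array([[1.0, 0.0], [1.0, 1.0]])
Y = np.sign(P @ Q.T)
obs = [(i, j) for i in range(2) for j in range(2)]
print(factorization_margin(P, Q, Y, obs))   # ~0.707
```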
Predicting Switching Graph Labelings with Cluster Specialists
Herbster, Mark, Robinson, James
We address the problem of predicting the labeling of a graph in an online setting when the labeling is changing over time. We provide three mistake-bounded algorithms based on three paradigmatic methods for online algorithm design. The algorithm with the strongest guarantee is a quasi-Bayesian classifier which requires $\mathcal{O}(t \log n)$ time to predict at trial $t$ on an $n$-vertex graph. The fastest algorithm (with the weakest guarantee) is based on a specialist [10] approach and surprisingly only requires $\mathcal{O}(\log n)$ time on any trial $t$. We also give an algorithm based on a kernelized Perceptron with an intermediate per-trial time complexity of $\mathcal{O}(n)$ and a mistake bound which is not strictly comparable. Finally, we provide experiments on simulated data comparing these methods.
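For context, a kernelized Perceptron for graph label prediction can take the Laplacian pseudoinverse as its kernel, a standard choice in this line of work; the sketch below is that baseline, not the cluster-specialists algorithms of the paper, and the example graph is illustrative.

```python
import numpy as np

def graph_perceptron(A, trials):
    """Kernel Perceptron over a graph, using the Laplacian pseudoinverse
    as the kernel.

    A      : adjacency matrix of the n-vertex graph
    trials : sequence of (vertex, label) pairs with labels in {-1, +1}
    """
    n = A.shape[0]
    K = np.linalg.pinv(np.diag(A.sum(axis=1)) - A)   # Laplacian pseudoinverse kernel
    alpha = np.zeros(n)
    mistakes = 0
    for v, y in trials:
        y_hat = 1.0 if K[v] @ alpha >= 0 else -1.0   # predict via kernel expansion
        if y_hat != y:
            alpha[v] += y                            # Perceptron update on a mistake
            mistakes += 1
    return mistakes

# two triangles joined by one edge; labels follow the clusters
A = np.zeros((6, 6))
for i in range(3):
    for j in range(3):
        if i != j:
            A[i, j] = A[i + 3, j + 3] = 1.0
A[2, 3] = A[3, 2] = 1.0
trials = [(v, 1 if v < 3 else -1) for v in [0, 4, 1, 5, 2, 3]]
print(graph_perceptron(A, trials))
```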
Quantum machine learning: a classical perspective
Ciliberto, Carlo, Herbster, Mark, Ialongo, Alessandro Davide, Pontil, Massimiliano, Rocchetto, Andrea, Severini, Simone, Wossnig, Leonard
Recently, increased computational power and data availability, as well as algorithmic advances, have led machine learning techniques to impressive results in regression, classification, data generation and reinforcement learning tasks. Despite these successes, the proximity to the physical limits of chip fabrication alongside the increasing size of datasets is motivating a growing number of researchers to explore the possibility of harnessing the power of quantum computation to speed up classical machine learning algorithms. Here we review the literature in quantum machine learning and discuss perspectives for a mixed readership of classical machine learning and quantum computation experts. Particular emphasis will be placed on clarifying the limitations of quantum algorithms, how they compare with their best classical counterparts and why quantum resources are expected to provide advantages for learning problems. Learning in the presence of noise and certain computationally hard problems in machine learning are identified as promising directions for the field. Practical questions, like how to upload classical data into quantum form, will also be addressed.