AITopics | Learning Management

Online Learning under Delayed Feedback

Online learning with delayed feedback has received increasing attention recently due to its many applications in distributed, web-based learning problems. In this paper we provide a systematic study of the topic, and analyze the effect of delay on the regret of online learning algorithms. Somewhat surprisingly, it turns out that delay increases the regret in a multiplicative way in adversarial problems, and in an additive way in stochastic problems. We give meta-algorithms that transform, in a black-box fashion, algorithms developed for the non-delayed case into ones that can handle the presence of delays in the feedback loop. Modifications of the well-known UCB algorithm are also developed for the bandit problem with delayed feedback, with the advantage over the meta-algorithms that they can be implemented with lower complexity.
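To make the delayed-feedback bandit setting concrete, the following is a minimal sketch of a UCB-style rule whose statistics are updated only when a reward actually arrives. It is an illustration under stated assumptions, not the modified UCB algorithm from the paper: the function name `delayed_ucb`, the constant `delay`, the `reward_fn` callback, and the UCB1-style exploration bonus are all hypothetical choices made here.

```python
import math
from collections import deque


def delayed_ucb(reward_fn, n_arms, horizon, delay):
    """UCB1-style play where each reward is observed `delay` rounds late.

    Assumption: the delay is a fixed constant, so pending feedback can be
    kept in a FIFO queue ordered by arrival round.
    """
    counts = [0] * n_arms    # rewards *observed* so far, per arm
    sums = [0.0] * n_arms    # sum of observed rewards, per arm
    pulls = [0] * n_arms     # times each arm was played (observed or not)
    pending = deque()        # entries: (arrival_round, arm, reward)

    for t in range(1, horizon + 1):
        # Deliver every piece of feedback that is due by round t.
        while pending and pending[0][0] <= t:
            _, arm, reward = pending.popleft()
            counts[arm] += 1
            sums[arm] += reward

        unobserved = [a for a in range(n_arms) if counts[a] == 0]
        if unobserved:
            # No feedback yet for some arm: cycle through such arms.
            arm = min(unobserved, key=lambda a: pulls[a])
        else:
            # Index uses only observed counts, not the number of plays.
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )

        pulls[arm] += 1
        pending.append((t + delay, arm, reward_fn(arm, t)))

    return pulls
```

A call such as `delayed_ucb(lambda arm, t: [0.2, 0.8][arm], n_arms=2, horizon=2000, delay=5)` returns the number of plays per arm; with these fixed rewards the second arm receives the large majority of plays.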

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

1306.0686

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Oregon > Benton County > Corvallis (0.04)
North America > United States > New Jersey > Middlesex County > New Brunswick (0.04)
(6 more...)

Genre: Research Report (0.40)

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.83)

Online Learning with Switching Costs and Other Adaptive Adversaries

Cesa-Bianchi, Nicolo, Dekel, Ofer, Shamir, Ohad

arXiv.org Machine Learning, Jun-1-2013

We study the power of different types of adaptive (nonoblivious) adversaries in the setting of prediction with expert advice, under both full-information and bandit feedback. We measure the player's performance using a new notion of regret, also known as policy regret, which better captures the adversary's adaptiveness to the player's behavior. In a setting where losses are allowed to drift, we characterize, in a nearly complete manner, the power of adaptive adversaries with bounded memories and switching costs. In particular, we show that with switching costs, the attainable rate with bandit feedback is $\widetilde{\Theta}(T^{2/3})$. Interestingly, this rate is significantly worse than the $\Theta(\sqrt{T})$ rate attainable with switching costs in the full-information case. Via a novel reduction from experts to bandits, we also show that a bounded-memory adversary can force $\widetilde{\Theta}(T^{2/3})$ regret even in the full-information case, proving that switching costs are easier to control than bounded-memory adversaries. Our lower bounds rely on a new stochastic adversary strategy that generates loss processes with strong dependencies.

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Machine Learning

1302.4387

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Washington > King County > Redmond (0.04)
Europe > Italy > Lombardy > Milan (0.04)

Genre: Research Report (1.00)

Industry: Education > Educational Setting > Online (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.47)
Information Technology > Data Science > Data Mining > Big Data (0.46)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.40)

Normalized Online Learning

Ross, Stephane, Mineiro, Paul, Langford, John

arXiv.org Machine Learning, May-28-2013

We introduce online learning algorithms that are independent of feature scales, proving regret bounds that depend on the ratio of scales present in the data rather than on the absolute scale. This has several useful effects: there is no need to pre-normalize data, the test-time and test-space complexity are reduced, and the algorithms are more robust.
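As an illustration of what independence from feature scales can mean in practice, here is a simplified sketch of a gradient update that tracks the largest magnitude seen for each feature and rescales weights and steps accordingly. It is written in the spirit of the abstract only: the function name `normalized_sgd`, the squared loss, the learning rate `eta`, and the global factor `t / n_acc` are assumptions made for this example and may differ from the authors' algorithms.

```python
def normalized_sgd(examples, eta=0.1):
    """Online squared-loss regression with per-feature scale tracking.

    `examples` is a list of (features, label) pairs. Returns the prediction
    made for each example *before* the model is updated on it.
    """
    dim = len(examples[0][0])
    w = [0.0] * dim    # weights
    s = [0.0] * dim    # largest |x_i| seen so far, per feature
    n_acc = 0.0        # running sum of squared normalized features
    preds = []

    for t, (x, y) in enumerate(examples, start=1):
        # When a feature exceeds its known scale, shrink its weight so the
        # model behaves as if the new scale had been known from the start.
        for i, xi in enumerate(x):
            if abs(xi) > s[i]:
                if s[i] > 0.0:
                    w[i] *= (s[i] / abs(xi)) ** 2
                s[i] = abs(xi)

        y_hat = sum(wi * xi for wi, xi in zip(w, x))
        preds.append(y_hat)

        n_acc += sum((xi / si) ** 2 for xi, si in zip(x, s) if si > 0.0)
        if n_acc == 0.0:
            continue  # all-zero input so far: nothing to update

        grad = y_hat - y  # derivative of 0.5 * (y_hat - y)^2 w.r.t. y_hat
        for i, xi in enumerate(x):
            if s[i] > 0.0:
                w[i] -= eta * (t / n_acc) * grad * xi / (s[i] ** 2)

    return preds
```

Under these assumptions, multiplying one feature by a constant in every example leaves the sequence of predictions unchanged (up to floating-point error), so no pre-normalization pass over the data is required.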

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

1305.6646

Country:

South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Portugal (0.04)

Genre: Research Report (0.40)

Industry: Education > Educational Setting > Online (0.61)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.61)

Online Learning in a Contract Selection Problem

Tekin, Cem, Liu, Mingyan

arXiv.org Machine Learning, May-14-2013

In an online contract selection problem, a seller offers a set of contracts to sequentially arriving buyers whose types are drawn from an unknown distribution. If there exists a profitable contract for the buyer in the offered set, i.e., a contract with payoff higher than the payoff of not accepting any contract, the buyer chooses the contract that maximizes its payoff. In this paper we consider the online contract selection problem with the goal of maximizing the seller's profit. Assuming that a structural property called ordered preferences holds for the buyer's payoff function, we propose online learning algorithms that have sub-linear regret with respect to the best set of contracts given the distribution over the buyer's type. This problem has many applications, including spectrum contracts, wireless service provider data plans, and recommendation systems.

contract, data mining, machine learning, (18 more...)

arXiv.org Machine Learning

1305.3334

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre: Research Report (0.50)

Industry: Education > Educational Setting > Online (0.61)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.66)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.61)
Information Technology > Data Science > Data Mining > Big Data (0.47)

On the Generalization Ability of Online Learning Algorithms for Pairwise Loss Functions

Kar, Purushottam, Sriperumbudur, Bharath K, Jain, Prateek, Karnick, Harish C

arXiv.org Machine Learning, May-11-2013

In this paper, we study the generalization properties of online-learning-based stochastic methods for supervised learning problems where the loss function depends on more than one training sample (e.g., metric learning, ranking). We present a generic decoupling technique that enables us to provide Rademacher complexity-based generalization error bounds. Our bounds are in general tighter than those obtained by Wang et al. (COLT 2012) for the same problem. Using our decoupling technique, we are further able to obtain fast convergence rates for strongly convex pairwise loss functions. We are also able to analyze a class of memory-efficient online learning algorithms for pairwise learning problems that use only a bounded subset of past training samples to update the hypothesis at each step. Finally, to complement our generalization bounds, we propose a novel memory-efficient online learning algorithm for higher-order learning problems with bounded regret guarantees.
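The idea of updating against only a bounded subset of past samples can be sketched as follows. This is an illustrative example, not the algorithm proposed in the paper: the function name `online_pairwise_hinge`, the pairwise hinge loss, the reservoir-sampling buffer policy, and the parameters `capacity`, `eta`, and `seed` are all assumptions chosen for the sketch.

```python
import random


def online_pairwise_hinge(stream, dim, capacity=8, eta=0.1, seed=0):
    """Online linear ranking with a pairwise hinge loss and a bounded buffer.

    `stream` yields (features, label) pairs with labels 1 (positive) or
    0 (negative). Each new example is paired only with buffered examples of
    the opposite label, so memory use stays at `capacity` examples.
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    buffer = []

    for t, (x, y) in enumerate(stream, start=1):
        for xb, yb in buffer:
            if yb == y:
                continue  # same label: no ranking constraint for this pair
            pos, neg = (x, xb) if y > yb else (xb, x)
            diff = [p - n for p, n in zip(pos, neg)]
            # Pairwise hinge: step only if the positive example does not
            # outscore the negative one by a margin of at least 1.
            if sum(wi * di for wi, di in zip(w, diff)) < 1.0:
                w = [wi + eta * di for wi, di in zip(w, diff)]

        # Reservoir sampling keeps a uniform sample of the stream seen so far.
        if len(buffer) < capacity:
            buffer.append((x, y))
        else:
            j = rng.randrange(t)
            if j < capacity:
                buffer[j] = (x, y)

    return w, buffer
```

The per-step cost is proportional to `capacity` rather than to the number of examples seen, which is the trade-off the bounded-buffer setting is meant to capture.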

algorithm, artificial intelligence, machine learning, (11 more...)

arXiv.org Machine Learning

1305.2505

Country:

Asia > India > Uttar Pradesh > Kanpur (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
Europe > United Kingdom > England (0.04)
(2 more...)

Genre:

Workflow (0.88)
Research Report (0.82)

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Towards Pareto Descent Directions in Sampling Experts for Multiple Tasks in an On-Line Learning Paradigm

Ghosh, Shaona (University of Southampton, UK) | Lovell, Chris (University of Southampton) | Gunn, Steve R. (University of Southampton)

AAAI Conferences, Mar-21-2013

In many real-life design problems, there is a requirement to simultaneously balance multiple conflicting tasks or objectives in the system, where minimizing one objective causes another to increase in value, thereby resulting in trade-offs between the objectives. For example, in embedded multi-core mobile devices and very large scale data centers, there is a continuous problem of simultaneously balancing the interfering goals of maximal power savings and minimal performance delay, with varying trade-off values for the different application workloads executing on them. Typically, the optimal trade-offs for the executing workloads lie on an optimal Pareto front that is difficult to determine. The nature of the problem requires learning over the lifetime of the mobile device or server, with continuous evaluation and prediction of the trade-off settings that balance the interfering objectives optimally. Towards this, we propose an on-line learning method in which the weights of experts for addressing the objectives are updated based on a convex combination of their relative performance in addressing all objectives simultaneously. An additional importance vector, which assigns relative importance to each objective at every round, is sampled from a convex cone pointed at the origin. Our preliminary results show that the convex combination of the importance vector and the gradient of the potential functions of the learner's regret with respect to each objective ensures that in the next round the drift (instantaneous regret vector) is the Pareto descent direction that enables better convergence to the optimal Pareto front.

objective, on-line learning paradigm, pareto descent direction, (7 more...)

AAAI Conferences

2013 AAAI Spring Symposium Series

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.73)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.64)

Online Learning in Markov Decision Processes with Adversarially Chosen Transition Probability Distributions

Abbasi-Yadkori, Yasin, Bartlett, Peter L., Szepesvari, Csaba

arXiv.org Machine Learning, Mar-12-2013

We study the problem of learning Markov decision processes with finite state and action spaces when the transition probability distributions and loss functions are chosen adversarially and are allowed to change with time. We introduce an algorithm whose regret with respect to any policy in a comparison class grows as the square root of the number of rounds of the game, provided the transition probabilities satisfy a uniform mixing condition. Our approach is efficient as long as the comparison class is polynomial and we can compute expectations over sample paths for each policy. Designing an efficient algorithm with small regret for the general case remains an open problem.

algorithm, markov decision process, transition model, (9 more...)

arXiv.org Machine Learning

1303.3055

Country:

North America > Canada > Alberta (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Industry: Education > Educational Setting > Online (0.42)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.62)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.42)

Online Learning with Pairwise Loss Functions

Wang, Yuyang, Khardon, Roni, Pechyony, Dmitry, Jones, Rosie

arXiv.org Machine Learning, Jan-22-2013

Efficient online learning with pairwise loss functions is a crucial component in building a large-scale learning system that maximizes the area under the Receiver Operating Characteristic (ROC) curve. In this paper we investigate the generalization performance of online learning algorithms with pairwise loss functions. We show that the existing proof techniques for generalization bounds of online algorithms with a univariate loss cannot be directly applied to pairwise losses. We derive the first result providing data-dependent bounds for the average risk of the sequence of hypotheses generated by an arbitrary online learner in terms of an easily computable statistic, and show how to extract a low-risk hypothesis from the sequence. We demonstrate the generality of our results by applying them to two important problems in machine learning. First, we analyze two online algorithms for bipartite ranking: one is a natural extension of the perceptron algorithm and the other uses online convex optimization. Second, we provide an analysis of the risk bound for an online algorithm for supervised metric learning.

algorithm, artificial intelligence, machine learning, (13 more...)

arXiv.org Machine Learning

1301.5332

Country:

North America > United States > Massachusetts > Middlesex County > Medford (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.55)

Confusion-Based Online Learning and a Passive-Aggressive Scheme

Ralaivola, Liva

Neural Information Processing Systems, Dec-31-2012

This paper provides the first analysis, to the best of our knowledge, of online learning algorithms for multiclass problems when the confusion matrix is taken as a performance measure. The work builds upon recent and elegant results on noncommutative concentration inequalities, i.e., concentration inequalities that apply to matrices, and more precisely to matrix martingales. We establish generalization bounds for online learning algorithms and show how the theoretical study motivates a new confusion-friendly learning procedure. This learning algorithm, called COPA (for COnfusion Passive-Aggressive), is a passive-aggressive learning algorithm; it is shown that the update equations for COPA can be computed analytically, thus sparing the user from having to resort to any optimization package to implement it.
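For readers unfamiliar with the passive-aggressive family, the sketch below shows the classical binary passive-aggressive update with hinge loss, whose step size has a closed form. It is not COPA, whose multiclass, confusion-based update is not given on this page; the function name `pa_update` and the unit margin are choices made for the illustration.

```python
def pa_update(w, x, y):
    """One step of the classical binary passive-aggressive update.

    `w` and `x` are lists of floats and `y` is +1 or -1. The new weight
    vector is the closest one to `w` that attains zero hinge loss on (x, y).
    """
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    loss = max(0.0, 1.0 - margin)
    norm_sq = sum(xi * xi for xi in x)

    if loss == 0.0 or norm_sq == 0.0:
        return list(w)  # passive: the example is already handled

    tau = loss / norm_sq  # closed-form step size, no solver needed
    return [wi + tau * y * xi for wi, xi in zip(w, x)]  # aggressive step
```

Because the step size `tau` is computed directly from the loss and the squared norm of the input, each update costs a single pass over the features, which is the kind of analytic update the abstract refers to.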

artificial intelligence, machine learning, matrix, (17 more...)

Neural Information Processing Systems

Country: Europe > France (0.14)

Industry: Education > Educational Setting > Online (0.83)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.53)
