AITopics | Learning Management

We propose a general framework for studying adaptive regret bounds in the online learning setting, subsuming model selection and data-dependent bounds. Given a data- or model-dependent bound we ask, “Does there exist some algorithm achieving this bound?” We show that modifications to recently introduced sequential complexity measures can be used to answer this question by providing sufficient conditions under which adaptive rates can be achieved. In particular each adaptive rate induces a set of so-called offset complexity measures, and obtaining small upper bounds on these quantities is sufficient to demonstrate achievability. A cornerstone of our analysis technique is the use of one-sided tail inequalities to bound suprema of offset random processes. Our framework recovers and improves a wide variety of adaptive bounds including quantile bounds, second-order data-dependent bounds, and small loss bounds. In addition we derive a new type of adaptive bound for online linear optimization based on the spectral norm, as well as a new online PAC-Bayes theorem.

artificial intelligence, inequality, machine learning, (14 more...)

Neural Information Processing Systems

Industry: Education > Educational Setting > Online (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.63)


Online Learning with Gaussian Payoffs and Side Observations

Wu, Yifan, György, András, Szepesvári, Csaba

Neural Information Processing Systems, Dec-31-2015

We consider a sequential learning problem with Gaussian payoffs and side information: after selecting an action $i$, the learner receives information about the payoff of every action $j$ in the form of Gaussian observations whose mean is the same as the mean payoff, but the variance depends on the pair $(i,j)$ (and may be infinite). The setup allows a more refined information transfer from one action to another than previous partial monitoring setups, including the recently introduced graph-structured feedback case. For the first time in the literature, we provide non-asymptotic problem-dependent lower bounds on the regret of any algorithm, which recover existing asymptotic problem-dependent lower bounds and finite-time minimax lower bounds available in the literature. We also provide algorithms that achieve the problem-dependent lower bound (up to some universal constant factor) or the minimax lower bounds (up to logarithmic factors).

algorithm, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Country: North America > Canada > Alberta (0.14)

Industry: Education > Educational Setting > Online (0.41)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.57)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.41)


Online Learning with Adversarial Delays

Quanrud, Kent, Khashabi, Daniel

Neural Information Processing Systems, Dec-31-2015

We study the performance of standard online learning algorithms when the feedback is delayed by an adversary. We show that online-gradient-descent [1] and follow-the-perturbed-leader [2] achieve regret $O(\sqrt{D})$ in the delayed setting, where $D$ is the sum of delays of each round's feedback. This bound collapses to an optimal $O(\sqrt{T})$ bound in the usual setting of no delays (where $D = T$). Our main contribution is to show that standard algorithms for online learning already have simple regret bounds in the most general setting of delayed feedback, making adjustments to the analysis and not to the algorithms themselves. Our results help affirm and clarify the success of recent algorithms in optimization and machine learning that operate in a delayed feedback model.
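The delayed-feedback setting in this abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `delayed_ogd`, the fixed step size `eta`, the interval `[lo, hi]`, and the quadratic loss are all hypothetical choices made for illustration. The sketch also assumes that a delay of 1 means the gradient arrives at the end of the same round, which is consistent with the sum of delays $D$ equalling $T$ when there are no delays.

```python
from collections import defaultdict


def delayed_ogd(grad_fns, delays, eta=0.5, x0=0.0, lo=-1.0, hi=1.0):
    """Projected online gradient descent on [lo, hi] with delayed gradients.

    Assumed convention: delays[t] >= 1, and the gradient from round t is
    delivered at the end of round t + delays[t] - 1. Gradients that would
    arrive after the last round are simply never applied.
    """
    x = x0
    plays = []
    pending = defaultdict(list)  # arrival round -> gradients delivered then
    for t, (grad, d) in enumerate(zip(grad_fns, delays)):
        plays.append(x)                      # commit to a point for round t
        pending[t + d - 1].append(grad(x))   # feedback is scheduled, not seen yet
        for g in pending.pop(t, []):         # apply whatever arrives this round
            x = min(hi, max(lo, x - eta * g))
    return plays


T = 4
grads = [lambda x: x - 0.5] * T          # gradient of the loss (x - 0.5)^2 / 2
no_delay = delayed_ogd(grads, [1] * T)   # sum of delays D = T
one_late = delayed_ogd(grads, [2] * T)   # every gradient arrives one round late
```

The update rule is the same in both calls; only the arrival schedule of the gradients changes, which mirrors the abstract's point that the analysis, not the algorithm, is adjusted for delays.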

algorithm, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Country: North America > United States > Illinois (0.14)

Industry: Education > Educational Setting > Online (0.83)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.37)


Online Learning for Adversaries with Memory: Price of Past Mistakes

Anava, Oren, Hazan, Elad, Mannor, Shie

Neural Information Processing Systems, Dec-31-2015

The framework of online learning with memory naturally captures learning problems with temporal effects, and was previously studied for the experts setting. In this work we extend the notion of learning with memory to the general Online Convex Optimization (OCO) framework, and present two algorithms that attain low regret. The first algorithm applies to Lipschitz continuous loss functions, obtaining optimal regret bounds for both convex and strongly convex losses. The second algorithm attains the optimal regret bounds and applies more broadly to convex losses without requiring Lipschitz continuity, yet is more complicated to implement. We complement the theoretic results with two applications: statistical arbitrage in finance, and multi-step ahead prediction in statistics.

artificial intelligence, machine learning, optimization problem, (17 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Israel (0.14)

Industry:

Education > Educational Setting > Online (0.62)
Banking & Finance > Trading (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.62)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)


Online learning in repeated auctions

Weed, Jonathan, Perchet, Vianney, Rigollet, Philippe

arXiv.org Machine Learning, Nov-18-2015

Motivated by online advertising auctions, we consider repeated Vickrey auctions where goods of unknown value are sold sequentially and bidders only learn (potentially noisy) information about a good's value once it is purchased. We adopt an online learning approach with bandit feedback to model this problem and derive bidding strategies for two models: stochastic and adversarial. In the stochastic model, the observed values of the goods are random variables centered around the true value of the good. In this case, logarithmic regret is achievable when competing against well-behaved adversaries. In the adversarial model, the goods need not be identical and we simply compare our performance against that of the best fixed bid in hindsight. We show that sublinear regret is also achievable in this case and prove matching minimax lower bounds. To our knowledge, this is the first complete set of strategies for bidders participating in auctions of this type.
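The feedback structure described in this abstract can be made concrete with a small sketch of a single Vickrey (second-price) round from one bidder's point of view. The function name `vickrey_round`, its arguments, and the tie-breaking rule (ties are lost) are hypothetical choices for illustration, not taken from the paper.

```python
def vickrey_round(bid, competing_bid, value):
    """One second-price auction round for a single bidder.

    Returns (utility, observed_value). The bidder wins only with a strictly
    higher bid, pays the highest competing bid, and observes the good's value
    only after purchasing it (bandit feedback).
    """
    if bid > competing_bid:
        return value - competing_bid, value
    return 0.0, None  # lost the auction: no payment and no information


won = vickrey_round(bid=0.75, competing_bid=0.5, value=1.0)
lost = vickrey_round(bid=0.25, competing_bid=0.5, value=1.0)
```

A losing round returns no observation at all, which is why bidding low costs the learner information as well as utility.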

artificial intelligence, data mining, machine learning, (21 more...)

arXiv.org Machine Learning

1511.0572

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > New York > New York County > New York City (0.05)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
(4 more...)

Genre: Research Report (0.50)

Industry:

Marketing (0.66)
Education > Educational Setting > Online (0.61)
Information Technology > Services (0.48)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.61)
Information Technology > Data Science > Data Mining > Big Data (0.47)


Toward Adversarial Online Learning and the Science of Deceptive Machines

Abramson, Myriam (US Naval Research Laboratory)

AAAI Conferences, Nov-1-2015

Intelligent systems rely on pattern recognition and signature-based approaches for a wide range of sensors enhancing situational awareness. For example, autonomous systems depend on environmental sensors to perform their tasks and secure systems depend on anomaly detection methods. The availability of large amounts of data requires the processing of data in a “streaming” fashion with online algorithms. Yet, just as online learning can enhance adaptability to a non-stationary environment, it introduces vulnerabilities that can be manipulated by adversaries to achieve their goals while evading detection. Although human intelligence might have evolved from social interactions, machine intelligence has evolved as a human intelligence artifact and been kept isolated to avoid ethical dilemmas. As our adversaries become sophisticated, it might be time to revisit this question and examine how we can combine online learning and reasoning leading to the science of deceptive and counter-deceptive machines.

adversary, algorithm, learner, (14 more...)

AAAI Conferences

2015 AAAI Fall Symposium Series

Country:

North America > United States > District of Columbia > Washington (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry:

Information Technology > Security & Privacy (1.00)
Leisure & Entertainment > Games (0.97)
Government > Military (0.88)
Education > Educational Setting > Online (0.83)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.89)
(2 more...)


Online Learning with Gaussian Payoffs and Side Observations

Wu, Yifan, György, András, Szepesvári, Csaba

arXiv.org Machine Learning, Oct-27-2015

We consider a sequential learning problem with Gaussian payoffs and side information: after selecting an action $i$, the learner receives information about the payoff of every action $j$ in the form of Gaussian observations whose mean is the same as the mean payoff, but the variance depends on the pair $(i,j)$ (and may be infinite). The setup allows a more refined information transfer from one action to another than previous partial monitoring setups, including the recently introduced graph-structured feedback case. For the first time in the literature, we provide non-asymptotic problem-dependent lower bounds on the regret of any algorithm, which recover existing asymptotic problem-dependent lower bounds and finite-time minimax lower bounds available in the literature. We also provide algorithms that achieve the problem-dependent lower bound (up to some universal constant factor) or the minimax lower bounds (up to logarithmic factors).

algorithm, artificial intelligence, machine learning, (15 more...)

arXiv.org Machine Learning

1510.08108

Country:

North America > Canada > Alberta (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Industry: Education > Educational Setting > Online (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.55)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.40)


Adaptive Online Learning

Foster, Dylan J., Rakhlin, Alexander, Sridharan, Karthik

arXiv.org Machine Learning, Aug-20-2015

We propose a general framework for studying adaptive regret bounds in the online learning framework, including model selection bounds and data-dependent bounds. Given a data- or model-dependent bound we ask, "Does there exist some algorithm achieving this bound?" We show that modifications to recently introduced sequential complexity measures can be used to answer this question by providing sufficient conditions under which adaptive rates can be achieved. In particular each adaptive rate induces a set of so-called offset complexity measures, and obtaining small upper bounds on these quantities is sufficient to demonstrate achievability. A cornerstone of our analysis technique is the use of one-sided tail inequalities to bound suprema of offset random processes. Our framework recovers and improves a wide variety of adaptive bounds including quantile bounds, second-order data-dependent bounds, and small loss bounds. In addition we derive a new type of adaptive bound for online linear optimization based on the spectral norm, as well as a new online PAC-Bayes theorem that holds for countably infinite sets.

artificial intelligence, exp, machine learning, (17 more...)

arXiv.org Machine Learning

1508.0517

Genre:

Research Report (0.50)
Workflow (0.46)

Industry: Education > Educational Setting > Online (0.61)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.61)


Online Learning of k-CNF Boolean Functions

Veness, Joel (Google DeepMind) | Hutter, Marcus (Australian National University) | Orseau, Laurent (Google DeepMind) | Bellemare, Marc (Google DeepMind)

AAAI Conferences, Jul-15-2015

This paper revisits the problem of learning a k-CNF Boolean function from examples, for fixed k, in the context of online learning under the logarithmic loss. We give a Bayesian interpretation to one of Valiant’s classic PAC learning algorithms, which we then build upon to derive three efficient, online, probabilistic, supervised learning algorithms for predicting the output of an unknown k-CNF Boolean function. We analyze the loss of our methods, and show that the cumulative log-loss can be upper bounded by a polynomial function of the size of each example.

algorithm, monotone conjunction, positive example, (14 more...)

AAAI Conferences

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country: North America > United States > Texas > Travis County > Austin (0.04)

Industry: Education > Educational Setting > Online (0.61)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.81)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)
(2 more...)


Online Learning to Rank for Content-Based Image Retrieval

Wan, Ji (Institute of Computing Technology of the Chinese Academy of Sciences) | Wu, Pengcheng (Singapore Management University) | Hoi, Steven C. H. (Singapore Management University) | Zhao, Peilin (Institute for Infocomm Research) | Gao, Xingyu (Institute of Computing Technology of the Chinese Academy of Sciences) | Wang, Dayong (Michigan State University) | Zhang, Yongdong (Institute of Computing Technology of the Chinese Academy of Sciences) | Li, Jintao (Institute of Computing Technology of the Chinese Academy of Sciences)

AAAI Conferences, Jul-15-2015

A major challenge in Content-Based Image Retrieval (CBIR) is to bridge the semantic gap between low-level image contents and high-level semantic concepts. Although researchers have investigated a variety of retrieval techniques using different types of features and distance functions, no single best retrieval solution can fully tackle this challenge. In a real-world CBIR task, it is often highly desirable to combine multiple types of feature representations and diverse distance measures in order to close the semantic gap. In this paper, we investigate a new framework of learning to rank for CBIR, which aims to seek the optimal combination of different retrieval schemes by learning from large-scale training data in CBIR. We first formulate the problem formally as a learning to rank task, which can be solved in general by applying the existing batch learning to rank algorithms from text information retrieval (IR). To further address the scalability towards large-scale online CBIR applications, we present a family of online learning to rank algorithms, which are significantly more efficient and scalable than classical batch algorithms for large-scale online CBIR. Finally, we conduct an extensive set of experiments, in which encouraging results show that our technique is effective, scalable and promising for large-scale CBIR.

algorithm, online, rank algorithm, (15 more...)

AAAI Conferences

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country: