AITopics | Szepesvari, Csaba

Collaborating Authors

Szepesvari, Csaba

Bandits with Delayed, Aggregated Anonymous Feedback

Pike-Burke, Ciara, Agrawal, Shipra, Szepesvari, Csaba, Grunewalder, Steffen

arXiv.org Machine Learning, Feb-15-2018

We study a variant of the stochastic $K$-armed bandit problem, which we call "bandits with delayed, aggregated anonymous feedback". In this problem, when the player pulls an arm, a reward is generated; however, it is not immediately observed. Instead, at the end of each round the player observes only the sum of a number of previously generated rewards which happen to arrive in the given round. The rewards are stochastically delayed and, due to the aggregated nature of the observations, the information of which arm led to a particular reward is lost. The question is: what is the cost of the information loss due to this delayed, aggregated anonymous feedback? Previous works have studied bandits with stochastic, non-anonymous delays and found that the regret increases only by an additive factor relating to the expected delay. In this paper, we show that this additive regret increase can be maintained in the harder delayed, aggregated anonymous feedback setting when the expected delay (or a bound on it) is known. We provide an algorithm that matches the worst-case regret of the non-anonymous problem exactly when the delays are bounded, and up to logarithmic factors or an additive variance term for unbounded delays.
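
The feedback model described in this abstract can be sketched as a small simulator. This is only an illustration of the observation process, not the paper's algorithm; the class name, the Bernoulli rewards, and the uniform delay distribution are assumptions made for the example.

```python
import random
from collections import defaultdict


class DelayedAggregatedBandit:
    """Toy environment (assumed): Bernoulli rewards, delays uniform on {0, ..., max_delay}."""

    def __init__(self, means, max_delay, seed=0):
        self.means = means
        self.max_delay = max_delay
        self.rng = random.Random(seed)
        self.t = 0
        self.total_generated = 0.0
        self.arrivals = defaultdict(float)  # round -> sum of rewards arriving in that round

    def pull(self, arm):
        # The reward is generated now but scheduled to arrive after a random delay.
        reward = 1.0 if self.rng.random() < self.means[arm] else 0.0
        self.total_generated += reward
        delay = self.rng.randint(0, self.max_delay)
        self.arrivals[self.t + delay] += reward
        # The player sees only the anonymous sum of rewards arriving this round,
        # with no record of which arm (or which earlier round) produced them.
        observed = self.arrivals.pop(self.t, 0.0)
        self.t += 1
        return observed


env = DelayedAggregatedBandit(means=[0.2, 0.8], max_delay=3)
observations = [env.pull(arm=t % 2) for t in range(1000)]
pending = sum(env.arrivals.values())  # rewards generated but not yet delivered
```

Every generated reward is either observed in some round or still pending at the end, which is a convenient sanity check on a simulator of this kind.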

algorithm, artificial intelligence, big data, (19 more...)

arXiv: 1709.06853

Country: North America > Canada > Alberta (0.14)

Genre: Research Report (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.66)

Multi-view Matrix Factorization for Linear Dynamical System Estimation

Karami, Mahdi, White, Martha, Schuurmans, Dale, Szepesvari, Csaba

Neural Information Processing Systems, Dec-31-2017

We consider maximum likelihood estimation of linear dynamical systems with generalized-linear observation models. Maximum likelihood is typically considered to be hard in this setting since latent states and transition parameters must be inferred jointly. Given that expectation-maximization does not scale and is prone to local minima, moment-matching approaches from the subspace identification literature have become standard, despite known statistical efficiency issues. In this paper, we instead reconsider likelihood maximization and develop an optimization-based strategy for recovering the latent states and transition parameters. Key to the approach is a two-view reformulation of maximum likelihood estimation for linear dynamical systems that enables the use of global optimization algorithms for matrix factorization. We show that the proposed estimation strategy outperforms widely used identification algorithms such as subspace identification methods, both in terms of accuracy and runtime.

algorithm, artificial intelligence, optimization problem, (17 more...)

Country:

North America > Canada > Alberta (0.28)
North America > United States (0.28)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Stochastic Low-Rank Bandits

Kveton, Branislav, Szepesvari, Csaba, Rao, Anup, Wen, Zheng, Abbasi-Yadkori, Yasin, Muthukrishnan, S.

arXiv.org Machine Learning, Dec-13-2017

Many problems in computer vision and recommender systems involve low-rank matrices. In this work, we study the problem of finding the maximum entry of a stochastic low-rank matrix from sequential observations. At each step, a learning agent chooses pairs of row and column arms, and receives the noisy product of their latent values as a reward. The main challenge is that the latent values are unobserved. We identify a class of non-negative matrices whose maximum entry can be found statistically efficiently and propose an algorithm for finding them, which we call LowRankElim. We derive an $O((K + L) \operatorname{poly}(d) \Delta^{-1} \log n)$ upper bound on its $n$-step regret, where $K$ is the number of rows, $L$ is the number of columns, $d$ is the rank of the matrix, and $\Delta$ is the minimum gap. The bound depends on other problem-specific constants that clearly do not depend on $K L$. To the best of our knowledge, this is the first such result in the literature.

artificial intelligence, big data, det 2, (20 more...)

arXiv: 1712.04644

Country: North America > Canada > Alberta (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.34)

Online Learning to Rank in Stochastic Click Models

Zoghi, Masrour, Tunys, Tomas, Ghavamzadeh, Mohammad, Kveton, Branislav, Szepesvari, Csaba, Wen, Zheng

arXiv.org Machine Learning, Jun-20-2017

Online learning to rank is a core problem in information retrieval and machine learning. Many provably efficient algorithms have been recently proposed for this problem in specific click models. The click model is a model of how the user interacts with a list of documents. Though these results are significant, their impact on practice is limited, because all proposed algorithms are designed for specific click models and lack convergence guarantees in other models. In this work, we propose BatchRank, the first online learning to rank algorithm for a broad class of click models. The class encompasses the two most fundamental click models, the cascade and position-based models. We derive a gap-dependent upper bound on the $T$-step regret of BatchRank and evaluate it on a range of web search queries. We observe that BatchRank outperforms ranked bandits and is more robust than CascadeKL-UCB, an existing algorithm for the cascade model.
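
To make the two click models named in this abstract concrete, here is a minimal simulation sketch. It shows only how clicks are generated under the usual textbook definitions of the cascade and position-based models; it is not BatchRank, and the function names, item labels, and probabilities are hypothetical.

```python
import random


def cascade_clicks(ranked, attraction, rng):
    """Cascade model: scan top-down, click the first attractive item, then stop."""
    clicks = [0] * len(ranked)
    for pos, item in enumerate(ranked):
        if rng.random() < attraction[item]:
            clicks[pos] = 1
            break
    return clicks


def position_based_clicks(ranked, attraction, examination, rng):
    """Position-based model: click iff the position is examined and the item attracts."""
    return [
        1 if rng.random() < examination[pos] * attraction[item] else 0
        for pos, item in enumerate(ranked)
    ]


rng = random.Random(0)
attraction = {"a": 1.0, "b": 0.5, "c": 0.0}   # assumed per-item attraction probabilities
examination = [1.0, 0.5, 0.0]                 # assumed per-position examination probabilities
cascade = cascade_clicks(["c", "a", "b"], attraction, rng)
pbm = position_based_clicks(["a", "b", "c"], attraction, examination, rng)
```

With these extreme probabilities the cascade user always skips "c" and clicks "a" in second position, so `cascade` is `[0, 1, 0]`. The two models differ in that the cascade produces at most one click per list, while the position-based model can produce several.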

computer based training, educational technology, probability, (18 more...)

arXiv: 1703.02527

Country:

North America > Canada > Alberta (0.28)
North America > United States > California > Santa Clara County (0.14)

Genre: Research Report (0.64)

Industry: Education > Educational Setting > Online (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.82)

An a Priori Exponential Tail Bound for k-Folds Cross-Validation

Abou-Moustafa, Karim, Szepesvari, Csaba

arXiv.org Machine Learning, Jun-19-2017

We consider a priori generalization bounds developed in terms of cross-validation estimates and the stability of learners. In particular, we first derive an exponential Efron-Stein type tail inequality for the concentration of a general function of n independent random variables. Next, under some reasonable notion of stability, we use this exponential tail bound to analyze the concentration of the k-fold cross-validation (KFCV) estimate around the true risk of a hypothesis generated by a general learning rule. While the accumulated literature has often attributed this concentration to the bias and variance of the estimator, our bound attributes this concentration to the stability of the learning rule and the number of folds k. This insight raises valid concerns related to the practical use of KFCV and suggests research directions to obtain reliable empirical estimates of the actual risk.
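
For readers unfamiliar with the quantity being analyzed, the k-fold cross-validation estimate can be sketched as follows. This is a generic illustration, not code from the paper; the interleaved fold split, the mean-predictor learning rule, and the squared loss are assumptions chosen to keep the example short.

```python
import random


def kfold_cv_estimate(data, k, learn, loss):
    """Average held-out loss over k folds (assumes len(data) >= k)."""
    folds = [data[i::k] for i in range(k)]  # interleaved split into k folds
    fold_risks = []
    for i, held_out in enumerate(folds):
        # Train on every fold except the i-th, then evaluate on the i-th.
        train = [z for j, fold in enumerate(folds) if j != i for z in fold]
        hypothesis = learn(train)
        fold_risks.append(sum(loss(hypothesis, z) for z in held_out) / len(held_out))
    return sum(fold_risks) / k


def learn_mean(train):
    # Hypothetical learning rule: predict the training mean.
    return sum(train) / len(train)


def squared_loss(h, z):
    return (h - z) ** 2


rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(200)]
estimate = kfold_cv_estimate(sample, k=5, learn=learn_mean, loss=squared_loss)
```

For this learner the true risk is close to the noise variance of 1, so `estimate` should land near that value; how tightly it concentrates is the kind of question the bound addresses.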

artificial intelligence, inequality, machine learning, (17 more...)

arXiv: 1706.05801

Country:

North America > Canada > Alberta (0.28)
North America > United States (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (0.81)

Stochastic Rank-1 Bandits

Katariya, Sumeet, Kveton, Branislav, Szepesvari, Csaba, Vernade, Claire, Wen, Zheng

arXiv.org Machine Learning, Mar-8-2017

We propose stochastic rank-$1$ bandits, a class of online learning problems where at each step a learning agent chooses a pair of row and column arms, and receives the product of their values as a reward. The main challenge of the problem is that the individual values of the row and column are unobserved. We assume that these values are stochastic and drawn independently. We propose a computationally efficient algorithm for solving our problem, which we call Rank1Elim. We derive an $O((K + L) (1 / \Delta) \log n)$ upper bound on its $n$-step regret, where $K$ is the number of rows, $L$ is the number of columns, and $\Delta$ is the minimum of the row and column gaps, under the assumption that the mean row and column rewards are bounded away from zero. To the best of our knowledge, we present the first bandit algorithm that finds the maximum entry of a rank-$1$ matrix whose regret is linear in $K + L$, $1 / \Delta$, and $\log n$. We also derive a nearly matching lower bound. Finally, we evaluate Rank1Elim empirically on multiple problems. We observe that it leverages the structure of our problems and can learn near-optimal solutions even if our modeling assumptions are mildly violated.
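
The reward model in this abstract can be sketched in a few lines. The sketch covers only the environment, not Rank1Elim; Bernoulli row and column values and the specific means are assumptions made for illustration.

```python
import random


def make_rank1_env(row_means, col_means, seed=0):
    """Return a pull(i, j) function whose reward is the product of two hidden draws."""
    rng = random.Random(seed)

    def pull(i, j):
        # Row and column values are drawn independently and never shown to the agent;
        # only their product is returned.
        u = 1 if rng.random() < row_means[i] else 0
        v = 1 if rng.random() < col_means[j] else 0
        return u * v

    return pull


row_means = [0.9, 0.5, 0.3]  # assumed values, K = 3 rows
col_means = [0.8, 0.4]       # assumed values, L = 2 columns
pull = make_rank1_env(row_means, col_means)

# Because the draws are independent, the mean reward matrix is the outer product
# of the two mean vectors, which has rank 1.
mean_matrix = [[u * v for v in col_means] for u in row_means]
best = max((mean_matrix[i][j], i, j) for i in range(3) for j in range(2))
average = sum(pull(0, 0) for _ in range(20000)) / 20000
```

The best entry pairs the best row with the best column, which is why the problem has only $K + L$ unknown means rather than $K L$.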

big data, optimization problem, rank1elim, (18 more...)

arXiv: 1608.03023

Country:

North America > Canada > Alberta (0.14)
North America > United States > Wisconsin (0.14)

Genre: Research Report (0.64)

Industry: Education (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.46)

SDP Relaxation with Randomized Rounding for Energy Disaggregation

Shaloudegi, Kiarash, György, András, Szepesvari, Csaba, Xu, Wilsun

Neural Information Processing Systems, Dec-31-2016

We develop a scalable, computationally efficient method for the task of energy disaggregation for home appliance monitoring. In this problem the goal is to estimate the energy consumption of each appliance over time based on the total energy-consumption signal of a household. The current state of the art is to model the problem as inference in factorial HMMs, and use quadratic programming to find an approximate solution to the resulting quadratic integer program. Here we take a more principled approach, better suited to integer programming problems, and find an approximate optimum by combining convex semidefinite relaxations with randomized rounding, as well as a scalable ADMM method that exploits the special structure of the resulting semidefinite program. Simulation results on both synthetic and real-world datasets demonstrate the superiority of our method.

appliance, artificial intelligence, optimization problem, (16 more...)

Country:

North America > Canada > Alberta (0.14)
Europe > Spain (0.14)

Industry: Energy (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Following the Leader and Fast Rates in Linear Prediction: Curved Constraint Sets and Other Regularities

Huang, Ruitong, Lattimore, Tor, György, András, Szepesvari, Csaba

Neural Information Processing Systems, Dec-31-2016

The follow-the-leader (FTL) algorithm, perhaps the simplest of all online learning algorithms, is known to perform well when the loss functions it is used on are positively curved. In this paper we ask whether there are other "lucky" settings when FTL achieves sublinear, "small" regret. In particular, we study the fundamental problem of linear prediction over a non-empty convex, compact domain. Amongst other results, we prove that the curvature of the boundary of the domain can act as if the losses were curved: in this case, we prove that as long as the mean of the loss vectors has positive length bounded away from zero, FTL enjoys a logarithmic growth rate of regret, while, e.g., for polyhedral domains and stochastic data it enjoys finite expected regret. Building on a previously known meta-algorithm, we also get an algorithm that simultaneously enjoys the worst-case guarantees and the bound available for FTL.
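
A small experiment illustrates the curved-domain setting described above. The sketch runs FTL with linear losses over the Euclidean unit ball, where the leader has a closed form; the loss distribution (mean $(1, 0)$ plus bounded noise) and the choice to play the origin in the first round are assumptions for the example, not details from the paper.

```python
import math
import random


def ftl_unit_ball(losses):
    """Run FTL for linear losses <w, f_t> over the Euclidean unit ball; return the regret."""
    d = len(losses[0])
    cum = [0.0] * d      # running sum of past loss vectors
    total_loss = 0.0
    for f in losses:
        norm = math.sqrt(sum(c * c for c in cum))
        # The leader minimizes <w, cum> over the ball: w = -cum / ||cum||.
        # Before any data arrives every point ties, so we play the origin.
        w = [-c / norm for c in cum] if norm > 0 else [0.0] * d
        total_loss += sum(wi * fi for wi, fi in zip(w, f))
        cum = [c + fi for c, fi in zip(cum, f)]
    # The best fixed point in hindsight is -cum / ||cum||, with loss -||cum||.
    best_fixed = -math.sqrt(sum(c * c for c in cum))
    return total_loss - best_fixed


rng = random.Random(0)
n = 2000
# Assumed loss distribution: the mean has length 1, so it is bounded away from zero.
losses = [[1.0 + rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5)] for _ in range(n)]
regret = ftl_unit_ball(losses)
```

Rerunning with larger `n` and comparing the resulting regret is a quick way to see the slow growth the abstract refers to.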

algorithm, artificial intelligence, educational setting, (16 more...)

Country:

North America > United States (0.28)
North America > Canada > Alberta (0.15)

Industry: Education > Educational Setting (0.36)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

The End of Optimism? An Asymptotic Analysis of Finite-Armed Linear Bandits

Lattimore, Tor, Szepesvari, Csaba

arXiv.org Machine Learning, Oct-14-2016

Stochastic linear bandits are a natural and simple generalisation of finite-armed bandits with numerous practical applications. Current approaches focus on generalising existing techniques for finite-armed bandits, notably the optimism principle and Thompson sampling. While prior work has mostly been in the worst-case setting, we analyse the asymptotic instance-dependent regret and show matching upper and lower bounds on what is achievable. Surprisingly, our results show that no algorithm based on optimism or Thompson sampling will ever achieve the optimal rate, and indeed, can be arbitrarily far from optimal, even in very simple cases. This is a disturbing result because these techniques are standard tools that are widely used for sequential optimisation, for example in generalised linear bandits and reinforcement learning.

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv: 1610.04491

Country:

North America > United States (0.28)
North America > Canada > Alberta (0.14)

Genre: Research Report > New Finding (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Delay-Tolerant Online Convex Optimization: Unified Analysis and Adaptive-Gradient Algorithms

Joulani, Pooria (University of Alberta) | Gyorgy, Andras (Imperial College London) | Szepesvari, Csaba (University of Alberta)

AAAI Conferences, Apr-19-2016

We present a unified, black-box-style method for developing and analyzing online convex optimization (OCO) algorithms for full-information online learning in delayed-feedback environments. Our new, simplified analysis enables us to substantially improve upon previous work and to solve a number of open problems from the literature. Specifically, we develop and analyze asynchronous AdaGrad-style algorithms from the Follow-the-Regularized-Leader (FTRL) and Mirror-Descent family that, unlike previous works, can handle projections and adapt both to the gradients and the delays, without relying on either strong convexity or smoothness of the objective function, or data sparsity. Our unified framework builds on a natural reduction from delayed-feedback to standard (non-delayed) online learning. This reduction, together with recent unification results for OCO algorithms, allows us to analyze the regret of generic FTRL and Mirror-Descent algorithms in the delayed-feedback setting in a unified manner using standard proof techniques. In addition, the reduction is exact and can be used to obtain both upper and lower bounds on the regret in the delayed-feedback setting.

algorithm, computer based training, educational technology, (21 more...)

Thirtieth AAAI Conference on Artificial Intelligence

Country: North America > Canada > Alberta (0.28)

Industry: Education > Educational Setting (0.57)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
