linear dependence
Multi-class SVMs: From Tighter Data-Dependent Generalization Bounds to Novel Algorithms
Yunwen Lei, Urun Dogan, Alexander Binder, Marius Kloft
This paper studies the generalization performance of multi-class classification algorithms, for which we obtain--for the first time--a data-dependent generalization error bound with a logarithmic dependence on the class size, substantially improving the state-of-the-art linear dependence in the existing data-dependent generalization analysis.
Stochastic Gradient Descent, Weighted Sampling, and the Randomized Kaczmarz algorithm
We improve a recent guarantee of Bach and Moulines on the linear convergence of SGD for smooth and strongly convex objectives, reducing a quadratic dependence on the strong convexity to a linear dependence. Furthermore, we show how reweighting the sampling distribution (i.e. …). Our results are based on a connection we make between SGD and the randomized Kaczmarz algorithm, which allows us to transfer ideas between the separate bodies of literature studying each of the two methods.
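To make the SGD/Kaczmarz connection concrete, here is a minimal sketch of the standard randomized Kaczmarz iteration for a consistent linear system A x = b. Each step projects the iterate onto one equation, which can be read as a stochastic gradient step on a least-squares objective. Rows are sampled with probability proportional to their squared norm. The function name, the toy system, and the iteration count are illustrative assumptions; this is not the analysis or the exact sampling scheme from the paper.

```python
import random

def randomized_kaczmarz(A, b, iters=2000, seed=0):
    """Approximately solve a consistent system A x = b.

    Row i is sampled with probability proportional to ||a_i||^2
    (an assumed weighting; other sampling distributions are possible).
    """
    rng = random.Random(seed)
    n_cols = len(A[0])
    # Squared Euclidean norm of each row, used both as sampling
    # weights and as the normalizer of the projection step.
    weights = [sum(v * v for v in row) for row in A]
    x = [0.0] * n_cols
    for _ in range(iters):
        i = rng.choices(range(len(A)), weights=weights, k=1)[0]
        row = A[i]
        # Residual of the sampled equation at the current iterate.
        residual = b[i] - sum(r * xj for r, xj in zip(row, x))
        # Project x onto the hyperplane {x : a_i . x = b_i}.
        step = residual / weights[i]
        x = [xj + step * r for xj, r in zip(x, row)]
    return x

# Toy consistent system (hypothetical data for illustration).
A = [[2.0, 1.0], [1.0, 3.0], [1.0, -1.0]]
x_true = [1.0, -2.0]
b = [sum(r * t for r, t in zip(row, x_true)) for row in A]
x_hat = randomized_kaczmarz(A, b)
```

Rows with larger norm are visited more often, which is one simple instance of a reweighted sampling distribution.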
Reviews: A Sample Complexity Measure with Applications to Learning Optimal Auctions
Summary: Rademacher complexity is a powerful tool for producing generalization guarantees. The Rademacher complexity of a class H on a sample S is roughly the expected maximum gap between training and test performance if S were randomly partitioned into training and test sets, and we took the maximum over all h in H (eqn 11). This paper makes the observation that the core steps in the standard generalization bound proof will still go through if, instead of taking the max over all h in H, you only look at the hypotheses h that your procedure can possibly output when given half of a double sample (Lemma 1). While unfortunately the usual final (or initial) high-probability step in this argument does not seem to go through directly, the paper shows (Theorem 2) that one can nonetheless get a useful generalization bound from this using other means. The paper then shows how this generalization bound yields good sample complexity guarantees for a number of natural auction classes.
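The quantity described in the summary can be approximated numerically for a finite class. The sketch below is a Monte Carlo estimate of the textbook empirical Rademacher complexity, computed once over a full class and once over a smaller subset standing in for "the hypotheses the procedure can output". The loss matrix, the choice of subset, and all names are hypothetical; the paper's actual measure (Lemma 1, Theorem 2) is defined on a double sample and is not reproduced here.

```python
import random

def empirical_rademacher(loss_matrix, trials=2000, seed=0):
    """Monte Carlo estimate of E_sigma[ max_h (1/m) sum_i sigma_i * loss_h(z_i) ].

    loss_matrix[h][i] is the loss of hypothesis h on sample point i.
    """
    rng = random.Random(seed)
    m = len(loss_matrix[0])
    total = 0.0
    for _ in range(trials):
        # Random signs play the role of a random train/test split.
        sigma = [rng.choice((-1, 1)) for _ in range(m)]
        total += max(
            sum(s * l for s, l in zip(sigma, row)) / m
            for row in loss_matrix
        )
    return total / trials

# Hypothetical losses in [0, 1]: 50 hypotheses evaluated on 20 points.
data_rng = random.Random(1)
losses = [[data_rng.random() for _ in range(20)] for _ in range(50)]

full = empirical_rademacher(losses)            # max over the whole class
restricted = empirical_rademacher(losses[:5])  # max over an assumed subset
```

Because both calls use the same seed, they see the same sign vectors, so the restricted estimate can never exceed the full one. That monotonicity is the intuition behind restricting the maximum to fewer hypotheses.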
Identifying Nonstationary Causal Structures with High-Order Markov Switching Models
Balsells-Rodas, Carles, Wang, Yixin, Mediano, Pedro A. M., Li, Yingzhen
Causal discovery in time series is a rapidly evolving field with a wide variety of applications in other areas such as climate science and neuroscience. Traditional approaches assume a stationary causal graph, which can be adapted to nonstationary time series with time-dependent effects or heterogeneous noise. In this work we address nonstationarity via regime-dependent causal structures. We first establish identifiability for high-order Markov Switching Models, which provide the foundations for identifiable regime-dependent causal discovery. Our empirical studies demonstrate the scalability of our proposed approach for high-order regime-dependent structure estimation, and we illustrate its applicability on brain activity data.
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)