AITopics | Lin, Junhong

Collaborating Authors

Lin, Junhong

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Generalization Properties of Doubly Stochastic Learning Algorithms

Lin, Junhong, Rosasco, Lorenzo

arXiv.org Machine LearningMar-8-2018

Doubly stochastic learning algorithms are scalable kernel methods that perform very well in practice. However, their generalization properties are not well understood and their analysis is challenging since the corresponding learning sequence may not be in the hypothesis space induced by the kernel. In this paper, we provide an in-depth theoretical analysis for different variants of doubly stochastic learning algorithms within the setting of nonparametric regression in a reproducing kernel Hilbert space and considering the square loss. Particularly, we derive convergence results on the generalization error for the studied algorithms either with or without an explicit penalty term. To the best of our knowledge, the derived results for the unregularized variants are the first of this kind, while the results for the regularized variants improve those in the literature. The novelties in our proof are a sample error bound that requires controlling the trace norm of a cumulative operator, and a refined analysis of bounding initial error.

algorithm, artificial intelligence, reinforcement learning, (15 more...)

arXiv.org Machine Learning

doi: 10.1016/j.jco.2018.02.004

1707.00577

Country:

Europe (0.67)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Genre: Research Report (0.50)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.82)

Add feedback

Optimal Convergence for Distributed Learning with Stochastic Gradient Methods and Spectral-Regularization Algorithms

Lin, Junhong, Cevher, Volkan

arXiv.org Machine LearningJan-22-2018

We study generalization properties of distributed algorithms in the setting of nonparametric regression over a reproducing kernel Hilbert space (RKHS). We first investigate distributed stochastic gradient methods (SGM), with mini-batches and multi-passes over the data. We show that optimal generalization error bounds can be retained for distributed SGM provided that the partition level is not too large. We then extend our results to spectral-regularization algorithms (SRA), including kernel ridge regression (KRR), kernel principal component analysis, and gradient methods. Our results are superior to the state-of-the-art theory. Particularly, our results show that distributed SGM has a smaller theoretical computational complexity, compared with distributed KRR and classic SGM. Moreover, even for non-distributed SRA, they provide the first optimal, capacity-dependent convergence rates, considering the case that the regression function may not be in the RKHS.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

1801.07226

Country: Europe > Switzerland (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.70)

Add feedback

Optimal Rates for Spectral-regularized Algorithms with Least-Squares Regression over Hilbert Spaces

Lin, Junhong, Cevher, Volkan

arXiv.org Machine LearningJan-20-2018

In this paper, we study regression problems over a separable Hilbert space with the square loss, covering non-parametric regression over a reproducing kernel Hilbert space. We investigate a class of spectral-regularized algorithms, including ridge regression, principal component analysis, and gradient methods. We prove optimal, high-probability convergence results in terms of variants of norms for the studied algorithms, considering a capacity assumption on the hypothesis space and a general source condition on the target function. Consequently, we obtain almost sure convergence results with optimal rates. Our results improve and generalize previous results, filling a theoretical gap for the non-attainable cases.

artificial intelligence, machine learning, spectral-regularization algorithm, (16 more...)

arXiv.org Machine Learning

1801.0672

Country: Europe > Switzerland (0.14)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Optimal Rates for Learning with Nystr\"om Stochastic Gradient Methods

Lin, Junhong, Rosasco, Lorenzo

arXiv.org Machine LearningOct-21-2017

In the setting of nonparametric regression, we propose and study a combination of stochastic gradient methods with Nystr\"om subsampling, allowing multiple passes over the data and mini-batches. Generalization error bounds for the studied algorithm are provided. Particularly, optimal learning rates are derived considering different possible choices of the step-size, the mini-batch size, the number of iterations/passes, and the subsampling level. In comparison with state-of-the-art algorithms such as the classic stochastic gradient methods and kernel ridge regression with Nystr\"om, the studied algorithm has advantages on the computational complexity, while achieving the same optimal learning rates. Moreover, our results indicate that using mini-batches can reduce the total computational cost while achieving the same optimal statistical results.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

1710.07797

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.90)

Add feedback

Optimal Rates for Multi-pass Stochastic Gradient Methods

Lin, Junhong, Rosasco, Lorenzo

arXiv.org Machine LearningOct-21-2017

We analyze the learning properties of the stochastic gradient method when multiple passes over the data and mini-batches are allowed. We study how regularization properties are controlled by the step-size, the number of passes and the mini-batch size. In particular, we consider the square loss and show that for a universal step-size choice, the number of passes acts as a regularization parameter, and optimal finite sample bounds can be achieved by early-stopping. Moreover, we show that larger step-sizes are allowed when considering mini-batches. Our analysis is based on a unifying approach, encompassing both batch and stochastic gradient methods as special cases. As a byproduct, we derive optimal convergence results for batch gradient methods (even in the non-attainable cases).

artificial intelligence, machine learning, probability, (18 more...)

arXiv.org Machine Learning

1605.08882

Country:

Europe (0.67)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Genre: Research Report (0.64)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.90)

Add feedback

Optimal Learning for Multi-pass Stochastic Gradient Methods

Lin, Junhong, Rosasco, Lorenzo

Neural Information Processing SystemsDec-31-2016

We analyze the learning properties of the stochastic gradient method when multiple passes over the data and mini-batches are allowed. In particular, we consider the square loss and show that for a universal step-size choice, the number of passes acts as a regularization parameter, and optimal finite sample bounds can be achieved by early-stopping. Moreover, we show that larger step-sizes are allowed when considering mini-batches. Our analysis is based on a unifying approach, encompassing both batch and stochastic gradient methods as special cases.

artificial intelligence, machine learning, sgm, (14 more...)

Neural Information Processing Systems

Country:

North America > United States (0.14)
Europe > Spain (0.14)
Europe > Italy (0.14)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.92)

Add feedback

Generalization Properties and Implicit Regularization for Multiple Passes SGM

Lin, Junhong, Camoriano, Raffaello, Rosasco, Lorenzo

arXiv.org Machine LearningMay-26-2016

We study the generalization properties of stochastic gradient methods for learning with convex loss functions and linearly parameterized functions. We show that, in the absence of penalizations or constraints, the stability and approximation properties of the algorithm can be controlled by tuning either the step-size or the number of passes over the data. In this view, these parameters can be seen to control a form of implicit regularization. Numerical results complement the theoretical findings.

artificial intelligence, assumption 1, machine learning, (17 more...)

arXiv.org Machine Learning

1605.08375

Country:

Europe (0.28)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Add feedback

Iterative Regularization for Learning with Convex Loss Functions

Lin, Junhong, Rosasco, Lorenzo, Zhou, Ding-Xuan

arXiv.org Machine LearningApr-1-2015

We consider the problem of supervised learning with convex loss functions and propose a new form of iterative regularization based on the subgradient method. Unlike other regularization approaches, in iterative regularization no constraint or penalization is considered, and generalization is achieved by (early) stopping an empirical iteration. We consider a nonparametric setting, in the framework of reproducing kernel Hilbert spaces, and prove finite sample bounds on the excess risk under general regularity conditions. Our study provides a new class of efficient regularized learning algorithms and gives insights on the interplay between statistics and optimization in machine learning.

artificial intelligence, loss function, machine learning, (15 more...)

arXiv.org Machine Learning

1503.08985

Country:

Asia (0.67)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.34)

Industry: Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback