How to Scale Up Kernel Methods to Be As Good As Deep Neural Nets

Zhiyun Lu, Avner May, Kuan Liu, Alireza Bagheri Garakani, Dong Guo, Aurélien Bellet, Linxi Fan, Michael Collins, Brian Kingsbury, Michael Picheny, Fei Sha

Jun-17-2015–arXiv.org Machine Learning

Deep neural networks (DNNs) and other deep learning architectures have made significant advances [3, 4]. In both well-benchmarked tasks and real-world applications, such as automatic speech recognition [21, 34, 44] and image recognition [29, 48], deep learning architectures have achieved an unprecedented level of success and have had a major impact. Arguably, the most instrumental factors contributing to their success are: (1) learning from huge amounts of training data with highly complex models that have millions to billions of parameters; (2) adopting simple but effective optimization methods such as stochastic gradient descent; (3) combating overfitting with new schemes such as dropout [23]; and (4) computing with massive parallelism on GPUs. New techniques, as well as "tricks of the trade", are frequently invented and added to the toolboxes of machine learning researchers and practitioners. In stark contrast, there have been far fewer publicly known successful applications of kernel methods (such as support vector machines) to problems at a scale comparable to the speech and image recognition problems tackled by DNNs. This is a surprising chasm, given that kernel methods have been extensively studied, both theoretically and empirically, for their power in modeling highly nonlinear data [43]. Moreover, the connection between kernel methods and (infinite) neural networks has long been noted [35, 51, 11]. Nonetheless, a common misconception is that it may be difficult, if not impossible, for kernel methods to catch up with deep learning methods in addressing large-scale learning problems.



