Realization of spatial sparseness by deep ReLU nets with massive data
Chui, Charles K., Lin, Shao-Bo, Zhang, Bo, Zhou, Ding-Xuan
The great success of deep learning poses urgent challenges for understanding its working mechanism and rationale. Depth, structure, and massive data are recognized as three key ingredients of deep learning. In this paper, we aim at a rigorous verification of the importance of massive data in embodying the outperformance of deep learning. To approximate and learn spatially sparse and smooth functions, we establish a novel sampling theorem in learning theory that shows the necessity of massive data. We then prove that implementing classical empirical risk minimization on certain deep nets realizes the optimal learning rates derived in the sampling theorem. This perhaps explains why deep learning performs so well in the era of big data.

With the rapid development of data mining and knowledge discovery, data of massive size are collected in various disciplines [50], including medical diagnosis, financial market analysis, computer vision, natural language processing, time series forecasting, and search engines. Such massive data bring additional opportunities to discover subtle data features that cannot be reflected by data of small size, while posing a crucial challenge to machine learning: developing learning schemes that realize the benefits of massive data. Although numerous learning schemes, such as distributed learning [26], localized learning [32], and sub-sampling [14], have been proposed to handle massive data, all of these schemes focus on tractability rather than on the benefit of massiveness. It therefore remains open to explore the benefits brought by massive data and to develop feasible learning strategies for realizing them.
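The empirical risk minimization scheme discussed above can be illustrated with a minimal numerical sketch: a small two-hidden-layer ReLU net trained by gradient descent on the least-squares empirical risk, fitting a toy spatially sparse target. The bump function `f_star`, the layer widths, and the training hyperparameters below are illustrative assumptions, not the paper's construction or function class.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical spatially sparse target: a smooth bump supported on
# [0.4, 0.6], zero elsewhere. An illustrative stand-in for spatially
# sparse and smooth functions, not the paper's exact definition.
def f_star(x):
    u = (x - 0.5) / 0.1
    inside = np.abs(u) < 1
    bump = np.zeros_like(x)
    bump[inside] = np.exp(-1.0 / (1.0 - u[inside] ** 2))
    return bump

# Draw a moderately large sample uniformly on [0, 1].
n = 2000
X = rng.uniform(0.0, 1.0, size=(n, 1))
y = f_star(X[:, 0])

# Small two-hidden-layer ReLU net; the width h is an arbitrary choice.
h = 32
W1 = rng.normal(0.0, 1.0, (1, h)); b1 = np.zeros(h)
W2 = rng.normal(0.0, 1.0 / np.sqrt(h), (h, h)); b2 = np.zeros(h)
w3 = rng.normal(0.0, 1.0 / np.sqrt(h), h); b3 = 0.0

def forward(X):
    z1 = X @ W1 + b1; a1 = np.maximum(z1, 0.0)
    z2 = a1 @ W2 + b2; a2 = np.maximum(z2, 0.0)
    return z1, a1, z2, a2, a2 @ w3 + b3

risk0 = np.mean((forward(X)[4] - y) ** 2)   # empirical risk before training

# Empirical risk minimization by plain gradient descent on the
# mean-squared loss over the sample.
lr = 0.05
for step in range(3000):
    z1, a1, z2, a2, pred = forward(X)
    g = (pred - y) / n                      # d(risk)/d(pred), up to a constant
    gw3 = a2.T @ g; gb3 = g.sum()
    g2 = np.outer(g, w3) * (z2 > 0)         # backprop through second ReLU layer
    gW2 = a1.T @ g2; gb2 = g2.sum(axis=0)
    g1 = (g2 @ W2.T) * (z1 > 0)             # backprop through first ReLU layer
    gW1 = X.T @ g1; gb1 = g1.sum(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2
    w3 -= lr * gw3; b3 -= lr * gb3

mse = np.mean((forward(X)[4] - y) ** 2)     # empirical risk after training
```

The sketch only demonstrates the ERM procedure itself; the paper's results concern the learning rates such estimators attain as the sample size grows, which this toy run does not verify.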
Deep learning [18], characterized by training deep neural networks (deep nets for short) to extract data features using rich computational resources such as modern graphics processing units (GPUs) and custom processors, has achieved remarkable success in computer vision [23], speech recognition [24], and game playing [40], demonstrating its power in tackling massive data.

C.K. Chui is also associated with the Department of Statistics, Stanford University, CA 94305, USA. Shao-Bo Lin is with the Center of Intelligent Decision-making and Machine Learning, School of Management, Xi'an Jiaotong University, Xi'an, China.
Dec-16-2019