Random Bits Regression: a Strong General Predictor for Big Data
Wang, Yi; Li, Yi; Xiong, Momiao; Jin, Li
We are interested in a general data-based prediction task: given a training data matrix (TrX), a training outcome vector (TrY) and a test data matrix (TeX), predict the test outcome vector. In the era of big data, two practically conflicting challenges are prominent: (1) prior knowledge on the subject (also known as domain-specific knowledge) is largely insufficient; (2) the computation and storage cost of big data is unaffordable. To meet these challenges, this paper is devoted to modeling a large number of observations without domain-specific knowledge, using regression and classification. Widely used methods for regression and classification include linear regression, k-nearest neighbor (KNN) [1], support vector machine (SVM) [2], neural network (NN) [3, 4], extreme learning machine (ELM) [5], deep learning (DL) [6], random forest (RF) [7] and boosting (GBM) [8], among others. Each method performs well on some types of datasets but has its own limitations on others [9-12]. A method with reasonable performance on a broader, if not universal, range of datasets is highly desired.
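To make the task setup concrete, the following is a minimal sketch of the (TrX, TrY, TeX) interface using ordinary least squares, the first baseline family listed above. It is not the method proposed in the paper. The function name `predict_linear`, the synthetic data, and the variable `TeY_hat` are assumptions made purely for illustration.

```python
import numpy as np

def predict_linear(TrX, TrY, TeX):
    """Fit ordinary least squares on (TrX, TrY) and predict outcomes for TeX.

    Hypothetical helper for illustration only; a baseline, not the paper's method.
    """
    # Append a column of ones so the model includes an intercept term.
    A_train = np.hstack([TrX, np.ones((TrX.shape[0], 1))])
    A_test = np.hstack([TeX, np.ones((TeX.shape[0], 1))])
    # Solve the least-squares problem A_train @ coef ~= TrY.
    coef, *_ = np.linalg.lstsq(A_train, TrY, rcond=None)
    return A_test @ coef

# Synthetic, noise-free data (an assumption for this sketch).
rng = np.random.default_rng(0)
TrX = rng.normal(size=(200, 3))          # training data matrix
true_w = np.array([1.5, -2.0, 0.5])      # made-up coefficients
TrY = TrX @ true_w + 4.0                 # training outcome vector
TeX = rng.normal(size=(50, 3))           # test data matrix

TeY_hat = predict_linear(TrX, TrY, TeX)  # predicted test outcome vector
```

Any of the other listed methods could be swapped in behind the same three-argument signature, which is what makes the task "general": the predictor sees only the two training arrays and the test matrix, with no domain-specific input.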
Jan-13-2015