Open Set Recognition for Random Forest

Feng, Guanchao; Desai, Dhruv; Pasquali, Stefano; Mehta, Dhagash

arXiv.org Machine Learning 

In many real-world classification or recognition tasks, it is often difficult to collect training examples that exhaust all possible classes due to, for example, incomplete knowledge during training or ever-changing regimes. Therefore, samples from unknown/novel classes may be encountered in testing/deployment. In such scenarios, the classifiers should be able to i) perform classification on known classes, and at the same time, ii) identify samples from unknown classes. This is known as open-set recognition. Although random forest has been an extremely successful framework as a general-purpose classification (and regression) method, in practice, it usually operates under the closed-set assumption and is not able to identify samples from new classes when run out of the box. In this work, we ...

In the open-set settings, classifiers are required to not only accurately classify new instances of known classes (whose samples are observed during training) but also effectively recognize the samples from unknown classes. In a nutshell, open-set classifiers are capable of making the "none of the above" decision with respect to known classes. This is known as open-set recognition (OSR) [38] and has received significant attention in recent years [11, 47]. Since many learning tasks in finance are naturally classification tasks, for instance, company classifications using Global Industry Classification Standard (GICS), fund categorization, risk profiling, economic scenario classifications, etc., where often a new company, fund or economic scenario may not belong to any of the existing categories, casting these recognition tasks as OSR instead of traditional closed-set classification tasks is more appropriate.
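To make the closed-set versus open-set distinction concrete, the following minimal sketch contrasts the two decision rules on class-probability vectors of the kind a random forest might output. It is not the method proposed in this work: it shows a naive maximum-probability threshold, a simple baseline that can fail when a classifier is confidently wrong about a sample from an unknown class. The function names, the `UNKNOWN` label, the threshold of 0.6, the three sector names, and the probability vectors are all hypothetical and chosen only for illustration.

```python
from typing import Sequence

UNKNOWN = "unknown"  # hypothetical label for the "none of the above" decision


def closed_set_predict(probs: Sequence[float], classes: Sequence[str]) -> str:
    """Closed-set rule: always return the most probable known class."""
    best = max(range(len(classes)), key=lambda i: probs[i])
    return classes[best]


def open_set_predict(
    probs: Sequence[float], classes: Sequence[str], threshold: float = 0.6
) -> str:
    """Naive open-set rule: reject when the top probability is below threshold."""
    best = max(range(len(classes)), key=lambda i: probs[i])
    return classes[best] if probs[best] >= threshold else UNKNOWN


# Illustrative known classes and made-up probability vectors (assumptions).
classes = ["Energy", "Financials", "Health Care"]
confident = [0.05, 0.90, 0.05]  # clearly resembles one known class
ambiguous = [0.35, 0.33, 0.32]  # resembles no known class in particular

print(closed_set_predict(confident, classes))  # Financials
print(closed_set_predict(ambiguous, classes))  # Energy (forced to pick a known class)
print(open_set_predict(confident, classes))    # Financials
print(open_set_predict(ambiguous, classes))    # unknown
```

The closed-set rule assigns the ambiguous sample to a known class even though no class is well supported, whereas the open-set rule can abstain. The threshold trades off wrongly rejecting known-class samples against wrongly accepting unknown ones.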