In this video, I'll show you how SelectKBest uses the chi-squared (χ²) test to select features when both the features and the target column are categorical. We calculate the χ² statistic between each feature and the target, then select the desired number of features with the highest χ² scores or the lowest p-values. In statistics, the χ² test is used to test the independence of two events. In feature selection specifically, we use it to test whether the occurrence of a given feature and the target are independent. For each feature and target pair, a high χ² score or a low p-value suggests that the target column depends on the feature column.
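As a rough illustration of the idea described above, here is a minimal sketch using scikit-learn's `SelectKBest` with the `chi2` score function. The toy data, the variable names, and the choice of `k=1` are assumptions made up for this example, not taken from the video. Note that `chi2` expects non-negative feature values such as booleans or counts, so categorical features with several levels would typically be one-hot encoded first; the sketch sidesteps that by using binary features.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

# Hypothetical toy data: three binary (0/1) features and a binary target.
rng = np.random.default_rng(0)
n_samples = 200
y = rng.integers(0, 2, size=n_samples)

informative = y.copy()                        # fully dependent on the target
noise_a = rng.integers(0, 2, size=n_samples)  # generated independently of y
noise_b = rng.integers(0, 2, size=n_samples)  # generated independently of y
X = np.column_stack([noise_a, informative, noise_b])

# Score every feature against the target with chi2 and keep the best k.
selector = SelectKBest(score_func=chi2, k=1)
X_selected = selector.fit_transform(X, y)

scores = selector.scores_      # one chi-squared score per feature
p_values = selector.pvalues_   # one p-value per feature
selected_idx = selector.get_support(indices=True)  # column indices kept

for i, (score, p) in enumerate(zip(scores, p_values)):
    print(f"feature {i}: chi2={score:.3f}, p-value={p:.4g}")
print("selected feature indices:", selected_idx.tolist())
```

In this setup the middle column should get by far the highest score and the smallest p-value, since it is the only feature constructed from the target. On real data, `k` is a choice you would tune for your problem.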
Nov-22-2019, 11:25:55 GMT