Statistical Effect Size and Python Implementation - Analytics Vidhya
Then, we calculate the ratio of the weighted sum of squared differences between each category's average and the overall average (weighted by the number of observations in each category) to the sum of squared differences between each value and the overall average. Strictly, this ratio is eta squared, and eta is its square root. Eta ranges from 0 to 1. A value close to 0 indicates that all categories have similar average values of y, so no single category has more influence on y than the others. A value close to 1 indicates that one or more categories have values that differ from those of the other categories and therefore have more influence on y. Eta can be used in EDA and data processing to judge which categorical features are likely to be more important when building a machine learning model.
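The calculation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the article's own implementation: the function name `correlation_ratio` and its arguments `categories` and `values` are hypothetical names chosen here, and the sketch assumes eta is taken as the square root of the between-category sum of squares divided by the total sum of squares.

```python
from collections import defaultdict
from math import sqrt


def correlation_ratio(categories, values):
    """Return eta for a categorical feature and a numeric variable y.

    `categories` and `values` are equal-length sequences (hypothetical
    names used for this sketch only).
    """
    values = [float(v) for v in values]
    if len(values) == 0 or len(values) != len(categories):
        raise ValueError("inputs must be non-empty and of equal length")

    # Collect the y values that belong to each category.
    groups = defaultdict(list)
    for cat, val in zip(categories, values):
        groups[cat].append(val)

    overall_mean = sum(values) / len(values)

    # Sum of squares between each value and the overall average.
    ss_total = sum((v - overall_mean) ** 2 for v in values)
    if ss_total == 0:
        # y is constant, so no category can differ from the others.
        return 0.0

    # Squared difference between each category's average and the overall
    # average, weighted by the number of observations in that category.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - overall_mean) ** 2
        for g in groups.values()
    )

    return sqrt(ss_between / ss_total)


# Categories that fully separate y give eta = 1;
# categories with identical averages give eta = 0.
print(correlation_ratio(["a", "a", "b", "b"], [1, 1, 5, 5]))
print(correlation_ratio(["a", "a", "b", "b"], [1, 5, 1, 5]))
```

Returning 0 when y is constant is a design choice made for this sketch; it avoids a division by zero and matches the reading that no category stands out.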
Aug-7-2022, 18:25:51 GMT