Efficient High-Order Interaction-Aware Feature Selection Based on Conditional Mutual Information
Alexander Shishkin, Anastasia Bezzubtseva, Alexey Drutsa, Ilia Shishkov, Ekaterina Gladkikh, Gleb Gusev, Pavel Serdyukov
This study introduces CMICOT, a novel feature selection approach that further develops filter methods with sequential forward selection (SFS) whose scoring functions are based on conditional mutual information (MI). We state and study a novel saddle point (max-min) optimization problem to build a scoring function that is able to identify joint interactions between several features. This method fills the gap left by MI-based SFS techniques in handling high-order dependencies. In this high-dimensional case, the estimation of MI has prohibitively high sample complexity. We mitigate this cost using a greedy approximation and binary representatives, which makes our technique usable in practice. The superiority of our approach is demonstrated by comparison with recently proposed interaction-aware filters and several interaction-agnostic state-of-the-art ones on ten publicly available benchmark datasets.
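To make the idea of MI-based sequential forward selection concrete, the following is a minimal Python sketch. It is not the CMICOT algorithm: it uses a simpler first-order, CMIM-style max-min score (pick the candidate whose worst-case conditional MI with the target, given any single already selected feature, is largest) and a plain plug-in estimator for discrete data. The function names `conditional_mi` and `greedy_select` and the toy XOR data are assumptions made for illustration only.

```python
from collections import Counter
from math import log2


def conditional_mi(x, y, z):
    """Plug-in estimate of I(x; y | z) in bits for discrete sequences."""
    n = len(x)
    n_xyz = Counter(zip(x, y, z))
    n_xz = Counter(zip(x, z))
    n_yz = Counter(zip(y, z))
    n_z = Counter(z)
    total = 0.0
    for (xi, yi, zi), c in n_xyz.items():
        # p(x,y,z) * log[ p(x,y|z) / (p(x|z) p(y|z)) ], written with counts
        total += (c / n) * log2(c * n_z[zi] / (n_xz[(xi, zi)] * n_yz[(yi, zi)]))
    return total


def greedy_select(features, target, k):
    """Hypothetical CMIM-style SFS: features is a dict name -> discrete values."""
    const = [0] * len(target)  # conditioning on a constant gives plain MI
    selected = []
    remaining = set(features)

    def score(name):
        if not selected:
            return conditional_mi(features[name], target, const)
        # Inner "min": the least favourable single conditioning feature
        return min(conditional_mi(features[name], target, features[s])
                   for s in selected)

    while remaining and len(selected) < k:
        # Outer "max": best candidate; sorted() makes tie-breaking deterministic
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: the target is the XOR of a and b; c merely duplicates a.
a = [0, 0, 1, 1]
b = [0, 1, 0, 1]
c = list(a)
y = [ai ^ bi for ai, bi in zip(a, b)]

marginal_b = conditional_mi(b, y, [0] * 4)   # b alone tells nothing about y
joint_b = conditional_mi(b, y, a)            # b given a determines y
chosen = greedy_select({"a": a, "b": b, "c": c}, y, 2)
```

On the XOR data, each input has zero marginal MI with the target, so the interaction only becomes visible once the other input is conditioned on. Even this sketch conditions on one feature at a time; conditioning on several features jointly multiplies the number of cells whose probabilities must be estimated, which is the sample-complexity problem the abstract refers to.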
Reviews: Efficient High-Order Interaction-Aware Feature Selection Based on Conditional Mutual Information
Designing a feature selection algorithm that deals efficiently with high-order interactions among features is an interesting open problem in the feature selection area, which is what makes this paper appealing. However, several issues regarding the computational cost and the experimental setup need clarification before it can be considered for acceptance. It is said (lines 219-222) that "The described technique has been inspired by the intuition that probably two binary representatives of two different features interact on average better than two binary representatives of one feature"; however, no references or examples are provided to support this idea. In addition, when the computational cost of the algorithm with and without binary representations is compared (lines 215-219), the same values of t and s are used in both cases. This is not a fair comparison, since the two cases do not use the same amount of information.