Finding significant combinations of features in the presence of categorical covariates
Papaxanthos, Laetitia, Llinares-Lopez, Felipe, Bodenham, Dean, Borgwardt, Karsten
Neural Information Processing Systems
In high-dimensional settings, where the number of features p is typically much larger than the number of samples n, methods that can systematically examine arbitrary combinations of features, a huge search space of 2^p candidates, have recently begun to be explored. However, none of the current methods can assess the association between feature combinations and a target variable while conditioning on a categorical covariate in order to correct for potential confounding effects. We propose the Fast Automatic Conditional Search (FACS) algorithm, a significant discriminative itemset mining method which conditions on categorical covariates and scales only as O(k log k), where k is the number of states of the categorical covariate. Based on the Cochran-Mantel-Haenszel test, FACS demonstrates superior speed and statistical power on simulated and real-world datasets compared to the state of the art, opening the door to numerous applications in biomedicine.
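To illustrate the kind of conditioning the abstract refers to, here is a minimal sketch of the textbook Cochran-Mantel-Haenszel statistic for a single feature combination, with one 2x2 table per state of the categorical covariate. It is not the FACS algorithm and contains none of its search or pruning; the function name `cmh_test`, the `(a, b, c, d)` table layout, and the toy counts are assumptions made for this example, and no continuity correction is applied.

```python
import math


def cmh_test(tables):
    """Cochran-Mantel-Haenszel test over a list of 2x2 tables (one per stratum).

    Each table is a tuple (a, b, c, d) in this assumed, illustrative layout:
        a: pattern present, class 1    b: pattern present, class 0
        c: pattern absent,  class 1    d: pattern absent,  class 0
    Returns (statistic, p_value); the p-value uses a chi-square with 1 degree
    of freedom and no continuity correction.
    """
    deviation = 0.0  # sum over strata of (observed a - expected a)
    variance = 0.0   # sum over strata of the hypergeometric variance of a
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum with fewer than 2 samples carries no information
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        deviation += a - row1 * col1 / n
        variance += row1 * row2 * col1 * col2 / (n * n * (n - 1))
    if variance == 0.0:
        return 0.0, 1.0  # degenerate margins: the test cannot reject
    stat = deviation * deviation / variance
    # Survival function of chi-square with 1 d.o.f.: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Toy counts (made up for illustration): within each covariate state the
# pattern is independent of the class label, but pooling the states creates
# an apparent association driven only by the covariate.
strata = [(8, 2, 4, 1), (1, 4, 2, 8)]
pooled = [tuple(sum(col) for col in zip(*strata))]

stat_cond, p_cond = cmh_test(strata)
stat_pool, p_pool = cmh_test(pooled)
print(f"conditioned on covariate: stat={stat_cond:.3f}, p={p_cond:.3f}")
print(f"covariate ignored:        stat={stat_pool:.3f}, p={p_pool:.3f}")
```

In the toy data the stratified statistic is zero while the pooled one is positive, which is the confounding effect that conditioning on the covariate is meant to remove.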
Dec-31-2016