mfi
Opening the random forest black box by the analysis of the mutual impact of features
Voges, Lucas F., Jarren, Lukas C., Seifert, Stephan
Random forest is a popular machine learning approach for the analysis of high-dimensional data because it is flexible and provides variable importance measures for the selection of relevant features. However, the complex relationships between the features are usually not considered for the selection and are thus also neglected in the characterization of the analysed samples. Here we propose two novel approaches that focus on the mutual impact of features in random forests. Mutual forest impact (MFI) is a relation parameter that evaluates the mutual association of the features with the outcome and, hence, goes beyond the analysis of correlation coefficients. Mutual impurity reduction (MIR) is an importance measure that combines this relation parameter with the importance of the individual features. MIR and MFI are implemented together with testing procedures that generate p-values for the selection of related and important features. Applications to various simulated data sets and the comparison to other methods for feature selection and relation analysis show that MFI and MIR are very promising for shedding light on the complex relationships between features and outcome. In addition, they are not affected by common biases, e.g. that features with many possible splits or high minor allele frequencies are preferred.
Hiding task-oriented programming complexity: an industrial case study
Villagrossi, Enrico, Delledonne, Michele, Faroni, Marco, Beschi, Manuel, Pedrocchi, Nicola
The limited ease of use of robot programming interfaces is a barrier to robot adoption in several manufacturing sectors because it demands more expertise from the end users. Current robot programming methods are mostly a legacy of the past, with robot programmers reluctant to adopt new programming paradigms. This work aims to evaluate the impact on non-expert users of introducing a new task-oriented programming interface that hides the complexity of a programming framework based on ROS. The paper compares the programming performance of such an interface with a classic robot-oriented programming method based on a state-of-the-art robot teach pendant. An experimental campaign involved 22 non-expert users working on the programming of two industrial tasks. Task-oriented and robot-oriented programming showed comparable learning times, programming times, and numbers of questions raised during the programming phases, highlighting the possibility of a smooth introduction of task-oriented programming even to non-expert users.
Feature Importance Measure for Non-linear Learning Algorithms
Vidovic, Marina M. -C., Görnitz, Nico, Müller, Klaus-Robert, Kloft, Marius
Complex problems may require sophisticated, non-linear learning methods such as kernel machines or deep neural networks to achieve state-of-the-art prediction accuracies. However, high prediction accuracy is not the only objective to consider when solving problems using machine learning. Instead, particular scientific applications require some explanation of the learned prediction function. Unfortunately, most methods do not come with a straightforward, out-of-the-box interpretation. Even linear prediction functions are not straightforward to explain if features exhibit a complex correlation structure. In this paper, we propose the Measure of Feature Importance (MFI). MFI is general and can be applied to any arbitrary learning machine (including kernel machines and deep learning). MFI is intrinsically non-linear and can detect features that by themselves are inconspicuous and only impact the prediction function through their interaction with other features. Lastly, MFI can be used for both model-based feature importance and instance-based feature importance (i.e., measuring the importance of a feature for a particular data point).
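The idea of a feature that is inconspicuous on its own but matters through interaction can be illustrated with a small sketch. This is not the MFI method from the paper; it swaps in a generic permutation-style sensitivity check on a toy XOR model. The function names (`predict`, `pearson`, `permutation_importance`) and the data set are hypothetical and chosen only for illustration.

```python
import itertools
import random


def predict(x):
    """Toy non-linear model (an assumption): XOR of features 0 and 1; feature 2 is ignored."""
    return x[0] ^ x[1]


def pearson(a, b):
    """Plain Pearson correlation; returns 0.0 if either input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5 if var_a and var_b else 0.0


def permutation_importance(model, X, feature, rng, repeats=20):
    """Average fraction of predictions that change when one feature column is shuffled."""
    base = [model(x) for x in X]
    total = 0.0
    for _ in range(repeats):
        column = [x[feature] for x in X]
        rng.shuffle(column)
        shuffled = [x[:feature] + (v,) + x[feature + 1:] for x, v in zip(X, column)]
        total += sum(model(s) != b for s, b in zip(shuffled, base)) / len(X)
    return total / repeats


# Balanced design: every combination of three binary features, repeated 25 times.
X = list(itertools.product((0, 1), repeat=3)) * 25
y = [predict(x) for x in X]
rng = random.Random(0)

# Marginal view: correlation of each single feature with the model output.
correlations = [pearson([x[j] for x in X], y) for j in range(3)]
# Interaction-aware view: how often predictions change when a feature is shuffled.
importances = [permutation_importance(predict, X, j, rng) for j in range(3)]

print("correlations:", correlations)
print("importances: ", importances)
```

In this toy setting, features 0 and 1 are uncorrelated with the output when viewed individually, yet shuffling either one changes many predictions, while the ignored feature 2 changes none. A correlation-based ranking would therefore not separate the two interacting features from the irrelevant one.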