ALT-MAS: A Data-Efficient Framework for Active Testing of Machine Learning Algorithms

Huong Ha, Sunil Gupta, Santu Rana, Svetha Venkatesh

arXiv.org Artificial Intelligence 

This is clearly demonstrated by the performance of BALD. Specifically, the BNNs trained with BALD have accuracies ranging from 70% to 90%, yet for the models-under-test M-FashionMNIST and M-MNIST-ES (the average and bad models), the metric estimation accuracies range from 90% to 100%, which is much higher than the BNNs' own accuracies. For our proposed method ALT-MAS, with the models-under-test M-FashionMNIST and M-MNIST-ES, the behaviour is similar to that of BALD: the metric estimation accuracies are always higher than the BNN accuracies, especially for the per-class metrics. It is worth noting that, for the per-class metrics, even though the BNN accuracies achieved by ALT-MAS are much lower than those achieved by BALD, the metric estimates produced by ALT-MAS are much more accurate than those produced by BALD. This supports the motivation behind our sampling approach: the BNN only needs to accurately predict the data points that contribute to the metric estimation. On the other hand, with the good model-under-test M-MNIST, thanks to our data augmentation training strategy, the BNN accuracies achieved by ALT-MAS are much higher than those of BALD, and thus the metric estimates by ALT-MAS are also more accurate than those by BALD.

Figure 2: The accuracy of the BNN, for each combination of model-under-test (M-MNIST, M-FashionMNIST, and M-MNIST-ES) and metric set. Plotted are the mean and standard error over 3 repetitions (best seen in color).
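The key observation above, that a surrogate model's predictions can yield a metric estimate more accurate than the surrogate itself, can be illustrated with a minimal sketch. This is not the paper's implementation: the pool, the accuracy levels, and the use of plain agreement between the model-under-test and the surrogate as the metric estimator are all simplifying assumptions made here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical unlabelled test pool: the tester never sees true_labels;
# the surrogate's predictions serve as proxy ground truth.
n_pool, n_classes = 1000, 10
true_labels = rng.integers(0, n_classes, size=n_pool)

# Simulated model-under-test predictions (correct ~80% of the time).
mut_preds = np.where(rng.random(n_pool) < 0.8,
                     true_labels,
                     rng.integers(0, n_classes, size=n_pool))

# Simulated surrogate (BNN) predictions (correct ~90% of the time).
bnn_preds = np.where(rng.random(n_pool) < 0.9,
                     true_labels,
                     rng.integers(0, n_classes, size=n_pool))

# Estimated accuracy: agreement rate between model-under-test and surrogate.
est_acc = np.mean(mut_preds == bnn_preds)

# True accuracy (only available here because the data is synthetic).
true_acc = np.mean(mut_preds == true_labels)
```

Even with an imperfect surrogate, the agreement-based estimate lands close to the true accuracy, because errors on points where the model-under-test is unambiguous contribute little; this is the intuition the quoted sentence appeals to.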
