

Explaining Deep Learning Models -- A Bayesian Non-parametric Approach

Wenbo Guo, Sui Huang, Yunzhe Tao, Xinyu Xing, Lin Lin

Neural Information Processing Systems

While recent research has proposed various technical approaches that provide some clues as to how an ML model makes individual predictions, these approaches cannot give users the ability to inspect a model as a complete entity.
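By way of contrast, a minimal sketch of the per-prediction vs. whole-model distinction, using a global surrogate tree as a stand-in for whole-model inspection. This is an illustrative scikit-learn example on assumed toy data, not the paper's Bayesian non-parametric method:

import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Toy data and a black-box model (placeholders, not the paper's setup).
X = np.random.rand(1000, 10)
y = (X[:, 0] + X[:, 1] > 1).astype(int)
black_box = MLPClassifier(max_iter=500).fit(X, y)

# Whole-model inspection via a global surrogate: a small, readable tree
# trained to mimic the black box everywhere, rather than explaining one
# prediction at a time.
surrogate = DecisionTreeClassifier(max_depth=3).fit(X, black_box.predict(X))
print("surrogate fidelity:", surrogate.score(X, black_box.predict(X)))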



c97e7a5153badb6576d8939469f58336-Supplemental.pdf

Neural Information Processing Systems

Our initial experiments (implementation, debugging, hyperparameter tuning, etc.) required about 5,000 CPU hours of compute. Due to these rules, agents are encouraged to group together in order to attack simultaneously. In Warehouse[4], QTRAN makes slightly faster progress than VAST (η = 12). The results for Warehouse[16], Battle[80], and GaussianSqueeze[800] are shown in Figure 1. Figure 10: Visualizations of the generated sub-teams of XMetaGrad with η = 14 and XSpatial with k-means clustering using 10 centroids at different stages (early, middle, late) in Battle[80] after training. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments.
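A minimal sketch of the spatial sub-team assignment described above (k-means over agent positions with 10 centroids); the 2-D position array is an assumed stand-in for the Battle[80] observation, not VAST's actual interface:

import numpy as np
from sklearn.cluster import KMeans

positions = np.random.rand(80, 2)  # assumed 2-D positions of 80 agents
kmeans = KMeans(n_clusters=10, n_init=10).fit(positions)

# Each cluster of nearby agents becomes one sub-team.
sub_teams = {c: np.flatnonzero(kmeans.labels_ == c) for c in range(10)}
for team_id, members in sub_teams.items():
    print(f"sub-team {team_id}: agents {members.tolist()}")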



Appendix

Neural Information Processing Systems

We have shown experimentally that our method is effective in a variety of domains; however, other problem domains may require additional hyperparameter tuning, which can be expensive.


41a6fd31aa2e75c3c6d427db3d17ea80-Supplemental.pdf

Neural Information Processing Systems

In order to accelerate the NES search phase, we generated the pool using the weight-sharing schemes proposed by Random Search with Weight Sharing [37] and DARTS [39]. Specifically, we trained one-shot weight-sharing models using each of these two algorithms, and then sampled architectures from the weight-shared models uniformly at random to build the pool.
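A minimal sketch of this pool-building step, assuming a discrete op-per-edge architecture encoding (the op set and cell size are illustrative assumptions; the shared weights come from the already-trained one-shot models and are not shown):

import random

OPS = ["conv3x3", "conv5x5", "maxpool", "skip"]  # hypothetical op set
NUM_EDGES = 14                                   # hypothetical cell size

def sample_architecture():
    # Uniform sampling: pick one op per edge at random. No per-candidate
    # training is needed, since each candidate reuses the shared weights.
    return tuple(random.choice(OPS) for _ in range(NUM_EDGES))

pool = {sample_architecture() for _ in range(200)}
print(f"pool of {len(pool)} unique architectures")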


For $k = 0, 1, 2, \ldots, K$, summing both sides of this inequality yields
$$R^\star \;\le\; \mathbb{E}[R(\theta_{K+1})] \;\le\; \mathbb{E}[R(\theta_0)] - \frac{1}{2}\sum_{k=0}^{K} \cdots$$
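This reads like the usual descent-lemma telescoping; a hedged reconstruction, assuming a fixed step size $\eta$ and a per-step inequality of the standard form (the summand is truncated in the source, so both are assumptions):

\begin{align*}
\mathbb{E}[R(\theta_{k+1})] &\le \mathbb{E}[R(\theta_k)]
  - \frac{\eta}{2}\,\mathbb{E}\!\left[\|\nabla R(\theta_k)\|^2\right]
  && \text{(assumed per-step inequality)} \\
\intertext{Summing over $k = 0, 1, \ldots, K$ telescopes the left-hand side, and $R^\star \le \mathbb{E}[R(\theta_{K+1})]$ then gives}
R^\star &\le \mathbb{E}[R(\theta_0)]
  - \frac{\eta}{2}\sum_{k=0}^{K}\mathbb{E}\!\left[\|\nabla R(\theta_k)\|^2\right].
\end{align*}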

Neural Information Processing Systems

Since there is no unified standard for storing/saving exemplars in incremental few-shot learning, we choose the setting that we consider most reasonable and practical. In our experiments, we observe that after training on base classes with balanced data, the norms of the class prototypes of base classes tend to be similar. However, after fine-tuning with very few data on unseen new classes, the norms of the new class prototypes are noticeably smaller than those of the base classes. The few-shot novel classes consist of household furniture, vehicles 2, flowers, and food containers (20 classes in total). The few-shot novel classes consist of people, vehicles 2, flowers, and food containers (20 classes in total).
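A minimal sketch of the prototype-norm observation, using synthetic features whose class means are deliberately weaker for the few-shot classes (an assumption made to mimic the reported effect, not the paper's feature extractor or data):

import numpy as np

rng = np.random.default_rng(0)
dim = 64

# Base classes: many samples clustered around strong class means.
base_means = 2.0 * rng.normal(size=(60, dim))
base = [base_means[c] + rng.normal(size=(500, dim)) for c in range(60)]

# Novel classes: 5-shot, weaker class means after fine-tuning (assumed).
novel_means = 0.5 * rng.normal(size=(20, dim))
novel = [novel_means[c] + rng.normal(size=(5, dim)) for c in range(20)]

# Class prototype = mean feature vector; compare the norm distributions.
norm = lambda feats: np.linalg.norm(feats.mean(axis=0))
print("mean base prototype norm: ", np.mean([norm(f) for f in base]))
print("mean novel prototype norm:", np.mean([norm(f) for f in novel]))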