discrimination power
- Asia > China > Shanghai > Shanghai (0.05)
- North America > United States > California (0.04)
- Europe > Italy > Marche > Ancona Province > Ancona (0.04)
- (3 more...)
Visualizing the Emergence of Intermediate Visual Patterns in DNNs: Supplementary Material
This work was done under the supervision of Dr. Quanshi Zhang. Please see Section G for details of the dataset, and the selection of sample features and regional features. ... Eq. (3) of the paper, we assume that all features ... This section provides detailed derivations on the learning of the mixture model in Section 3.2 of the ... Therefore, the optimization can be derived as follows. ... This section provides more discussions on the quantification of knowledge points. According to Section 3.4 of the paper, a regional feature is a knowledge point if it is discriminative enough for classification, i.e. ...
- Asia > China > Shanghai > Shanghai (0.05)
- North America > United States > District of Columbia > Washington (0.04)
- North America > United States > California (0.04)
- (2 more...)
PICNN: A Pathway towards Interpretable Convolutional Neural Networks
Guo, Wengang, Yang, Jiayi, Yin, Huilin, Chen, Qijun, Ye, Wei
Convolutional Neural Networks (CNNs) have exhibited great performance in discriminative feature learning for complex visual tasks. Besides discrimination power, interpretability is another important yet under-explored property of CNNs. One difficulty in CNN interpretability is that filters and image classes are entangled. In this paper, we introduce a novel pathway to alleviate the entanglement between filters and image classes. The proposed pathway groups the filters in a late conv-layer of a CNN into class-specific clusters. Clusters and classes are in a one-to-one relationship. Specifically, we use Bernoulli sampling to generate the filter-cluster assignment matrix from a learnable filter-class correspondence matrix. To enable end-to-end optimization, we develop a novel reparameterization trick for handling the non-differentiable Bernoulli sampling. We evaluate the effectiveness of our method on ten widely used network architectures (including nine CNNs and a ViT) and five benchmark datasets. Experimental results demonstrate that our method PICNN (the combination of standard CNNs with our proposed pathway) exhibits greater interpretability than standard CNNs while achieving higher or comparable discrimination power.
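The sampling step described in the PICNN abstract can be pictured with a small sketch. It assumes the filter-class correspondence matrix holds independent probabilities in [0, 1]. The names `sample_assignment`, `straight_through_grad`, and `probs` are hypothetical. The backward pass shown is the generic straight-through estimator, used here only as a stand-in: the abstract does not specify the paper's own reparameterization trick, and this sketch does not reproduce it.

```python
import random


def sample_assignment(prob_matrix, rng):
    """Draw a binary filter-cluster assignment matrix.

    prob_matrix[f][c] is the assumed probability that filter f belongs to
    the cluster of class c. Each entry is sampled independently.
    """
    return [[1 if rng.random() < p else 0 for p in row] for row in prob_matrix]


def straight_through_grad(upstream_grad):
    """Stand-in backward pass (straight-through estimator).

    Sampling is not differentiable, so this estimator simply passes the
    gradient w.r.t. the sampled matrix on to the probabilities unchanged.
    This is NOT the paper's trick, only a common substitute.
    """
    return [row[:] for row in upstream_grad]


# Hypothetical correspondence matrix: 3 filters, 2 classes.
probs = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
rng = random.Random(0)
mask = sample_assignment(probs, rng)           # entries are 0 or 1
grad_probs = straight_through_grad([[0.3, -0.1], [0.0, 0.2], [0.5, 0.5]])
```

Averaging many sampled matrices recovers `probs` approximately, which is why the probabilities can be read as soft filter-to-class memberships.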
Does a Neural Network Really Encode Symbolic Concepts?
Recently, a series of studies have tried to extract interactions between input variables modeled by a DNN and define such interactions as concepts encoded by the DNN. However, strictly speaking, there is still no solid guarantee that such interactions indeed represent meaningful concepts. Therefore, in this paper, we examine the trustworthiness of interaction concepts from four perspectives. Extensive empirical studies have verified that a well-trained DNN usually encodes sparse, transferable, and discriminative concepts, which is partially aligned with human intuition.
- Asia > China > Shanghai > Shanghai (0.04)
- Europe > Italy > Marche > Ancona Province > Ancona (0.04)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- (3 more...)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Sensing and Signal Processing > Image Processing (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
A Semi-Supervised Adaptive Discriminative Discretization Method Improving Discrimination Power of Regularized Naive Bayes
Wang, Shihe, Ren, Jianfeng, Bai, Ruibin
Recently, many improved naive Bayes methods have been developed with enhanced discrimination capabilities. Among them, regularized naive Bayes (RNB) produces excellent performance by balancing the discrimination power and generalization capability. Data discretization is important in naive Bayes. By grouping similar values into one interval, the data distribution could be better estimated. However, existing methods including RNB often discretize the data into too few intervals, which may result in a significant information loss. To address this problem, we propose a semi-supervised adaptive discriminative discretization framework for naive Bayes, which could better estimate the data distribution by utilizing both labeled data and unlabeled data through pseudo-labeling techniques. The proposed method also significantly reduces the information loss during discretization by utilizing an adaptive discriminative discretization scheme, and hence greatly improves the discrimination power of classifiers. The proposed RNB+, i.e., regularized naive Bayes utilizing the proposed discretization framework, is systematically evaluated on a wide range of machine-learning datasets. It significantly and consistently outperforms state-of-the-art NB classifiers.
- Asia > China > Zhejiang Province > Ningbo (0.05)
- Europe > United Kingdom > England > Nottinghamshire > Nottingham (0.04)
- Europe > Germany > Baden-Württemberg > Freiburg (0.04)
- Research Report > Experimental Study (0.46)
- Research Report > New Finding (0.46)
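The claim in the RNB+ abstract that too few intervals can lose information is easy to see on toy data. The sketch below uses plain equal-width binning and measures the mutual information between bin index and class label. It is not the authors' adaptive discriminative discretization scheme. The function names and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def equal_width_bins(values, n_bins):
    """Map each value to a bin index in [0, n_bins - 1] using equal widths."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:            # constant feature: everything falls in bin 0
        return [0 for _ in values]
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(bins, labels):
    """Mutual information (in bits) between bin indices and class labels."""
    n = len(bins)
    joint = Counter(zip(bins, labels))
    p_bin = Counter(bins)
    p_lab = Counter(labels)
    return sum(
        (c / n) * math.log2(c * n / (p_bin[b] * p_lab[l]))
        for (b, l), c in joint.items()
    )


# Toy feature whose class alternates every two values.
values = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
labels = [0, 0, 1, 1, 0, 0, 1, 1]

mi_coarse = mutual_information(equal_width_bins(values, 2), labels)
mi_fine = mutual_information(equal_width_bins(values, 4), labels)
```

On this toy data two intervals mix both classes in every bin, so the discretized feature carries no class information, while four intervals separate the classes completely.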
Visualizing the Emergence of Intermediate Visual Patterns in DNNs
Li, Mingjie, Wang, Shaobo, Zhang, Quanshi
This paper proposes a method to visualize the discrimination power of intermediate-layer visual patterns encoded by a DNN. Specifically, we visualize (1) how the DNN gradually learns regional visual patterns in each intermediate layer during the training process, and (2) the effects of the DNN using non-discriminative patterns in low layers to construct discriminative patterns in middle/high layers through the forward propagation. Based on our visualization method, we can quantify knowledge points (i.e., the number of discriminative visual patterns) learned by the DNN to evaluate the representation capacity of the DNN. Furthermore, this method also provides new insights into signal-processing behaviors of existing deep-learning techniques, such as adversarial attacks and knowledge distillation.
- Asia > China > Shanghai > Shanghai (0.04)
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > District of Columbia > Washington (0.04)
- (7 more...)
Unsupervised Learning of Neural Networks to Explain Neural Networks (extended abstract)
Zhang, Quanshi, Yang, Yu, Wu, Ying Nian
This paper presents an unsupervised method to learn a neural network, namely an explainer, to interpret a pre-trained convolutional neural network (CNN), i.e., the explainer uses interpretable visual concepts to explain features in middle conv-layers of a CNN. Given feature maps of a conv-layer of the CNN, the explainer performs like an auto-encoder, which decomposes the feature maps into object-part features. The object-part features are learned to reconstruct CNN features without much loss of information. We can consider the disentangled representations of object parts a paraphrase of CNN features, which help people understand the knowledge encoded by the CNN. More crucially, we learn the explainer via knowledge distillation without using any annotations of object parts or textures for supervision. In experiments, our method was widely used to interpret features of different benchmark CNNs, and explainers significantly boosted the feature interpretability without hurting the discrimination power of the CNNs.
- North America > United States > California > Los Angeles County > Los Angeles (0.15)
- Asia > China > Shanghai > Shanghai (0.05)
Large Scale Local Online Similarity/Distance Learning Framework based on Passive/Aggressive
Hamdan, Baida, Zabihzadeh, Davood, Reza, Monsefi
Similarity/distance measures play a key role in many machine learning, pattern recognition, and data mining algorithms, which has led to the emergence of the field of metric learning. Many metric learning algorithms learn a global distance function from data that satisfies the constraints of the problem. However, in many real-world datasets where the discrimination power of features varies across regions of the input space, a global metric is often unable to capture the complexity of the task. To address this challenge, local metric learning methods have been proposed that learn multiple metrics across the different regions of the input space. These methods offer high flexibility and the ability to learn a nonlinear mapping, but typically at the expense of higher time requirements and a risk of overfitting. To overcome these challenges, this research presents an online multiple metric learning framework. Each metric in the proposed framework is composed of a global and a local component learned simultaneously. Adding a global component to a local metric efficiently reduces the problem of overfitting. The proposed framework is also scalable with both sample size and the dimension of input data. To the best of our knowledge, this is the first local online similarity/distance learning framework based on PA (Passive/Aggressive). In addition, for scalability with the dimension of input data, DRP (Dual Random Projection) is extended for local online learning in the present work. It enables our methods to run efficiently on high-dimensional datasets while maintaining their predictive performance. The proposed framework provides a straightforward local extension to any global online similarity/distance learning algorithm based on PA.
- North America > United States (0.46)
- North America > Canada > Quebec (0.14)
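For readers unfamiliar with Passive/Aggressive learning, the sketch below shows one PA step for a global bilinear similarity S(x, y) = xᵀWy trained on triplets, in the style of the well-known OASIS algorithm. It only illustrates the kind of global PA learner that the abstract says can be extended locally. The paper's local components and its DRP extension are not reproduced. The function names and the aggressiveness parameter `C` are assumptions.

```python
def bilinear(W, x, y):
    """Bilinear similarity S(x, y) = x^T W y."""
    return sum(x[i] * W[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))


def pa_step(W, p, pos, neg, C=1.0):
    """One Passive/Aggressive update on the triplet (p, pos, neg).

    Passive: if pos already beats neg by a margin of 1, W is unchanged.
    Aggressive: otherwise move W just far enough (capped by C) along
    V = p (pos - neg)^T to reduce the hinge loss.
    """
    loss = max(0.0, 1.0 - bilinear(W, p, pos) + bilinear(W, p, neg))
    if loss == 0.0:
        return W, loss
    diff = [a - b for a, b in zip(pos, neg)]
    V = [[pi * dj for dj in diff] for pi in p]
    norm_sq = sum(v * v for row in V for v in row)
    if norm_sq == 0.0:        # degenerate triplet: no direction to move in
        return W, loss
    tau = min(C, loss / norm_sq)
    W_new = [[w + tau * v for w, v in zip(w_row, v_row)]
             for w_row, v_row in zip(W, V)]
    return W_new, loss


W0 = [[1.0, 0.0], [0.0, 1.0]]
query, similar, dissimilar = [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]
W1, loss_before = pa_step(W0, query, similar, dissimilar)
_, loss_after = pa_step(W1, query, similar, dissimilar)
```

In this toy triplet the step size is not clipped by `C`, so a single update removes the violation and the second call is passive.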
Jet-Images -- Deep Learning Edition
de Oliveira, Luke, Kagan, Michael, Mackey, Lester, Nachman, Benjamin, Schwartzman, Ariel
Building on the notion of a particle physics detector as a camera and the collimated streams of high energy particles, or jets, it measures as an image, we investigate the potential of machine learning techniques based on deep learning architectures to identify highly boosted W bosons. Modern deep learning algorithms trained on jet images can out-perform standard physically-motivated feature driven approaches to jet tagging. We develop techniques for visualizing how these features are learned by the network and what additional information is used to improve performance. This interplay between physically-motivated feature driven tools and supervised learning algorithms is general and can be used to significantly increase the sensitivity to discover new particles and new forces, and gain a deeper understanding of the physics within jets.
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > California > Santa Clara County > Stanford (0.04)
- North America > United States > California > San Mateo County > Menlo Park (0.04)
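The "detector as a camera" idea in the Jet-Images abstract amounts to binning a jet's constituents into a 2D grid. The sketch below assumes each constituent is given as an (eta, phi, pT) tuple relative to the jet axis and that pixel intensity is the summed transverse momentum. The grid size, window width, and the function name `jet_image` are illustrative choices, not the paper's settings, and no preprocessing (rotation, normalization) is attempted.

```python
def jet_image(particles, n_pix=5, half_width=1.0):
    """Bin (eta, phi, pt) constituents into an n_pix x n_pix image.

    Rows index eta and columns index phi, both measured relative to the
    jet axis. Constituents outside the window are dropped.
    """
    image = [[0.0] * n_pix for _ in range(n_pix)]
    pixel = 2.0 * half_width / n_pix
    for eta, phi, pt in particles:
        inside = (-half_width <= eta < half_width
                  and -half_width <= phi < half_width)
        if not inside:
            continue
        # Clamp to guard against floating-point rounding at the upper edge.
        row = min(int((eta + half_width) / pixel), n_pix - 1)
        col = min(int((phi + half_width) / pixel), n_pix - 1)
        image[row][col] += pt
    return image


# Hypothetical constituents: two near the axis, one off-centre, one outside.
constituents = [(0.0, 0.0, 10.0), (0.0, 0.0, 2.0), (0.9, -0.9, 5.0), (2.0, 0.0, 3.0)]
img = jet_image(constituents)
```

An image built this way can be fed to a standard image classifier, which is the setting the abstract describes.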