grda
Graph-Relational Domain Adaptation
Xu, Zihao, He, Hao, Lee, Guang-He, Wang, Yuyang, Wang, Hao
Existing domain adaptation methods tend to treat every domain equally and align them all perfectly. Such uniform alignment ignores topological structures among different domains; therefore it may be beneficial for nearby domains, but not necessarily for distant domains. In this work, we relax such uniform alignment by using a domain graph to encode domain adjacency, e.g., a graph of states in the US with each state as a domain and each edge indicating adjacency, thereby allowing domains to align flexibly based on the graph structure. We generalize the existing adversarial learning framework with a novel graph discriminator using encoding-conditioned graph embeddings. Theoretical analysis shows that at equilibrium, our method recovers classic domain adaptation when the graph is a clique, and achieves non-trivial alignment for other types of graphs.
Generalization of machine learning methods hinges on the assumption that training and test data follow the same distribution. Such an assumption no longer holds when one trains a model in some domains (source domains), and tests it in other domains (target domains) where data follow different distributions. Domain adaptation (DA) aims at improving performance in this setting by aligning data from the source and target domains so that a model trained in source domains can generalize better in target domains (Ben-David et al., 2010; Ganin et al., 2016; Tzeng et al., 2017; Zhang et al., 2019). [...] other (Zhao et al., 2019; Wang et al., 2020). Such heterogeneity can often be captured by a graph, where the domains realize the nodes, and the adjacency between two domains can be captured by an edge (see Figure 1). For example, to capture the similarity of weather in the US, we can construct a graph where each state is treated as a node and the physical proximity between [...]
Figure 1. Left: Traditional DA treats each domain equally and enforces uniform alignment for all domains, which is equivalent to enforcing a fully connected domain graph. Right: Our method generalizes traditional DA to align domains according to any specific domain graph, e.g., a domain graph describing adjacency among these 15 states.
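The domain graph described above can be illustrated with a small sketch. This is not the authors' code: the helper names `build_adjacency` and `is_clique`, the four state abbreviations, and the edge list are all assumptions chosen for illustration. It only shows how a set of domains and their adjacency might be stored as a symmetric 0/1 matrix, and how the fully connected (clique) case, which the abstract associates with classic uniform alignment, differs from a sparser graph.

```python
from itertools import combinations

def build_adjacency(domains, edges):
    """Return a symmetric 0/1 adjacency matrix for an undirected domain graph.

    `domains` is a list of domain names (nodes); `edges` is a list of
    (name, name) pairs marking two domains as adjacent.
    """
    index = {name: i for i, name in enumerate(domains)}
    n = len(domains)
    adj = [[0] * n for _ in range(n)]
    for u, v in edges:
        i, j = index[u], index[v]
        adj[i][j] = adj[j][i] = 1
    return adj

def is_clique(adj):
    """True if every pair of distinct domains is connected by an edge."""
    n = len(adj)
    return all(adj[i][j] == 1 for i, j in combinations(range(n), 2))

# Hypothetical example: four states as domains, edges for shared borders.
domains = ["NY", "NJ", "PA", "CA"]
border_edges = [("NY", "NJ"), ("NY", "PA"), ("NJ", "PA")]
adj = build_adjacency(domains, border_edges)

# Fully connected graph over the same domains (the uniform-alignment case).
clique_adj = build_adjacency(domains, list(combinations(domains, 2)))
```

In this toy graph, CA has no edge to the other three states, so `is_clique(adj)` is false, while `is_clique(clique_adj)` is true.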
Directional Pruning of Deep Neural Networks
Chao, Shih-Kang, Wang, Zhanyu, Xing, Yue, Cheng, Guang
Because stochastic gradient descent (SGD) often finds a flat minimum valley in the training loss, we propose a novel directional pruning method that searches for a sparse minimizer in or close to that flat region. The proposed method requires neither retraining nor expert knowledge of the sparsity level. To overcome the computational difficulty of estimating the flat directions, we propose a carefully tuned $\ell_1$ proximal gradient algorithm that provably achieves directional pruning with a small learning rate after sufficient training. Empirically, our solution performs promisingly in the highly sparse regime (92% sparsity) among many existing pruning methods on ResNet50 with ImageNet, while using only slightly more wall time and memory than SGD. Using VGG16 and the wide ResNet 28x10 on CIFAR-10 and CIFAR-100, we demonstrate that our solution reaches the same minima valley as SGD, and that the minima found by our solution and by SGD do not deviate in directions that impact the training loss. The code that reproduces the results of this paper is available at https://github.com/donlan2710/gRDA-Optimizer/tree/master/directional_pruning.
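To make the phrase "$\ell_1$ proximal gradient" concrete, here is a minimal sketch of a generic proximal gradient step with soft-thresholding. It is a textbook ISTA-style update, not the authors' carefully tuned algorithm; the function names, the penalty weight `lam`, and the toy numbers are assumptions for illustration only.

```python
def soft_threshold(w, tau):
    """Proximal operator of tau * |w|: shrink w toward zero by tau."""
    if w > tau:
        return w - tau
    if w < -tau:
        return w + tau
    return 0.0

def prox_grad_step(weights, grads, lr, lam):
    """One generic l1 proximal gradient step.

    Takes a plain gradient step with learning rate `lr`, then applies
    soft-thresholding with threshold lr * lam to each coordinate.
    """
    return [soft_threshold(w - lr * g, lr * lam) for w, g in zip(weights, grads)]

# Toy example with hypothetical values: small weights are set exactly to zero.
weights = [0.5, -0.02, 0.01]
grads = [0.1, 0.0, -0.1]
new_weights = prox_grad_step(weights, grads, lr=0.1, lam=0.5)
num_zero = sum(1 for w in new_weights if w == 0.0)
```

The thresholding step is what produces exact zeros, which is why this family of updates yields sparse weights without a separate pruning pass.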