Lin, Xue
Defensive Dropout for Hardening Deep Neural Networks under Adversarial Attacks
Wang, Siyue, Wang, Xiao, Zhao, Pu, Wen, Wujie, Kaeli, David, Chin, Peter, Lin, Xue
Deep neural networks (DNNs) are known to be vulnerable to adversarial attacks. That is, adversarial examples, obtained by adding delicately crafted distortions onto original legal inputs, can mislead a DNN into classifying them as any target label. This work provides a solution for hardening DNNs against adversarial attacks through defensive dropout. Besides using dropout during training for the best test accuracy, we propose to also use dropout at test time to achieve strong defense effects. We consider the problem of building robust DNNs as an attacker-defender two-player game, where the attacker and the defender know each other's strategies and try to optimize their own strategies towards an equilibrium. Based on observations of the effect of the test dropout rate on test accuracy and attack success rate, we propose a defensive dropout algorithm that determines an optimal test dropout rate given the neural network model and the attacker's strategy for generating adversarial examples. We also investigate the mechanism behind the outstanding defense effects achieved by the proposed defensive dropout. Compared with stochastic activation pruning (SAP), another defense method that introduces randomness into the DNN model, our defensive dropout achieves much larger variances of the gradients, which is the key to the improved defense effects (a much lower attack success rate). For example, our defensive dropout can reduce the attack success rate from 100% to 13.89% under the currently strongest attack, i.e., the C&W attack, on the MNIST dataset.
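To make test-time dropout concrete, here is a minimal PyTorch sketch under assumed layer sizes and dropout rates (the paper's actual architecture and its algorithmically chosen test dropout rate are not reproduced here): the forward pass keeps dropout active in eval mode and only switches the rate between training and testing, so predictions stay stochastic at inference.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Minimal sketch of test-time (defensive) dropout; layer sizes and the
# test dropout rate below are illustrative assumptions, not the paper's.
class DropoutClassifier(nn.Module):
    def __init__(self, in_dim=784, hidden=512, classes=10,
                 train_drop=0.5, test_drop=0.3):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.fc2 = nn.Linear(hidden, classes)
        self.train_drop = train_drop
        self.test_drop = test_drop

    def forward(self, x):
        x = F.relu(self.fc1(x))
        # Keep dropout active even in eval mode (training=True);
        # only the dropout rate changes between the two phases.
        p = self.train_drop if self.training else self.test_drop
        x = F.dropout(x, p=p, training=True)
        return self.fc2(x)

model = DropoutClassifier().eval()
x = torch.randn(1, 784)
print(model(x))  # repeated calls give different logits: the defense's randomness
```

The randomness at inference is what increases the variance of the gradients the attacker must rely on, which the abstract identifies as the key to the defense.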
Structured Adversarial Attack: Towards General Implementation and Better Interpretability
Xu, Kaidi, Liu, Sijia, Zhao, Pu, Chen, Pin-Yu, Zhang, Huan, Erdogmus, Deniz, Wang, Yanzhi, Lin, Xue
When generating adversarial examples to attack deep neural networks (DNNs), the $\ell_p$ norm of the added perturbation is usually used to measure the similarity between the original image and the adversarial example. However, such adversarial attacks may fail to capture key information hidden in the input. This work develops a more general attack model, i.e., the structured attack, which explores group sparsity in adversarial perturbations by sliding a mask through images to extract key structures. An ADMM (alternating direction method of multipliers)-based framework is proposed that splits the original problem into a sequence of analytically solvable subproblems and can be generalized to implement other state-of-the-art attacks. Strong group sparsity is achieved in the adversarial perturbations even at the same level of distortion in terms of the $\ell_p$ norm as the state-of-the-art attacks. Extensive experimental results on MNIST, CIFAR-10, and ImageNet show that our attack can be much stronger (in terms of smaller $\ell_0$ distortion) than existing ones, and its better interpretability from group-sparse structures aids in uncovering the origins of adversarial examples.
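The group-sparsity subproblem in such an ADMM splitting typically admits a closed-form solution via group soft-thresholding. The numpy sketch below uses assumed group shapes and penalty weight; the paper's sliding-mask grouping and exact splitting are not reproduced.

```python
import numpy as np

# Group soft-thresholding: the closed-form solution of the group-lasso
# subproblem. delta_groups holds the perturbation reshaped into groups
# (e.g., the pixels covered by a sliding mask); tau is the penalty weight.
def group_soft_threshold(delta_groups, tau):
    """delta_groups: array of shape (num_groups, group_size)."""
    norms = np.linalg.norm(delta_groups, axis=1, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return scale * delta_groups  # groups with small norm are zeroed entirely

groups = np.random.randn(16, 64) * 0.05   # 16 groups of 64 perturbation entries
sparse = group_soft_threshold(groups, tau=0.3)
print((np.linalg.norm(sparse, axis=1) == 0).sum(), "of 16 groups zeroed")
```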
An ADMM-Based Universal Framework for Adversarial Attacks on Deep Neural Networks
Zhao, Pu, Liu, Sijia, Wang, Yanzhi, Lin, Xue
Deep neural networks (DNNs) are known to be vulnerable to adversarial attacks. That is, adversarial examples, obtained by adding delicately crafted distortions onto original legal inputs, can mislead a DNN into classifying them as any target label. In a successful adversarial attack, the targeted misclassification should be achieved with the minimal added distortion. In the literature, the added distortions are usually measured by the $L_0$, $L_1$, $L_2$, and $L_\infty$ norms, giving rise to $L_0$, $L_1$, $L_2$, and $L_\infty$ attacks, respectively. However, the literature lacks a versatile framework that covers all types of adversarial attacks. This work for the first time unifies the methods of generating adversarial examples by leveraging ADMM (Alternating Direction Method of Multipliers), an operator-splitting optimization approach, such that $L_0$, $L_1$, $L_2$, and $L_\infty$ attacks can be effectively implemented within this general framework with only minor modifications. Compared with the state-of-the-art attacks in each category, our ADMM-based attacks are so far the strongest, achieving both a 100% attack success rate and the minimal distortion.
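As an illustration of why ADMM makes this unification natural, the norm term typically appears in its own subproblem whose solution is a standard proximal operator; swapping the norm only swaps the operator. The sketch below shows the well-known closed forms (not the paper's exact splitting), with the $L_\infty$ case, which reduces to a projection, omitted for brevity.

```python
import numpy as np

# Proximal operators solving the norm subproblem of an ADMM splitting:
#   prox_{t*||.||}(v) = argmin_z  ||z|| + (1/(2t)) * ||z - v||_2^2
def prox_l1(v, t):
    """Soft-thresholding: closed-form prox of the l1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_l2(v, t):
    """Shrinkage: closed-form prox of the l2 norm (not squared)."""
    norm = np.linalg.norm(v)
    return np.zeros_like(v) if norm <= t else (1.0 - t / norm) * v

def prox_l0(v, t):
    """Hard-thresholding: prox of the l0 'norm' (count of nonzeros)."""
    return np.where(v**2 > 2.0 * t, v, 0.0)

v = np.array([0.5, -1.2, 0.05, 2.0])
print(prox_l1(v, 0.1), prox_l2(v, 0.1), prox_l0(v, 0.1), sep="\n")
```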
On the Universal Approximation Property and Equivalence of Stochastic Computing-based Neural Networks and Binary Neural Networks
Wang, Yanzhi, Zhan, Zheng, Li, Jiayu, Tang, Jian, Yuan, Bo, Zhao, Liang, Wen, Wujie, Wang, Siyue, Lin, Xue
Large-scale deep neural networks are both memory-intensive and computation-intensive, thereby posing stringent requirements on the computing platforms. Hardware accelerations of deep neural networks have been extensively investigated in both industry and academia. Specific forms of binary neural networks (BNNs) and stochastic computing-based neural networks (SCNNs) are particularly appealing for hardware implementations since they can be implemented almost entirely with binary operations. Despite the obvious advantages in hardware implementation, these approximate computing techniques are questioned by researchers in terms of accuracy and universal applicability. It is also important to understand the relative pros and cons of SCNNs and BNNs in theory and in actual hardware implementations. To address these concerns, in this paper we prove that the "ideal" SCNNs and BNNs satisfy the universal approximation property with probability 1 (due to the stochastic behavior). The proof proceeds by first establishing the property for SCNNs via the strong law of large numbers, and then using SCNNs as a "bridge" to prove it for BNNs. Based on the universal approximation property, we further prove that SCNNs and BNNs exhibit the same energy complexity; in other words, they have the same asymptotic energy consumption as the network size grows. We also provide a detailed analysis of the pros and cons of SCNNs and BNNs for hardware implementations and conclude that SCNNs are more suitable for hardware.
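The probabilistic intuition behind the SCNN part of the proof can be seen in the basic stochastic-computing primitive, sketched below with an arbitrary stream length: a value in [0, 1] is encoded as a Bernoulli bitstream, a product is computed by a bitwise AND, and the strong law of large numbers drives the estimate to the exact product as the stream grows.

```python
import numpy as np

# Stochastic-computing multiplication: values in [0, 1] become Bernoulli
# bitstreams; AND-ing two independent streams gives a stream whose mean
# converges to the product. Stream length and operands are illustrative.
rng = np.random.default_rng(0)

def encode(p, length):
    return rng.random(length) < p        # Bernoulli(p) bitstream

def sc_multiply(p1, p2, length=100_000):
    return np.mean(encode(p1, length) & encode(p2, length))

print(sc_multiply(0.6, 0.5))   # approx 0.30, more accurate with longer streams
```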
Towards Ultra-High Performance and Energy Efficiency of Deep Learning Systems: An Algorithm-Hardware Co-Optimization Framework
Wang, Yanzhi, Ding, Caiwen, Li, Zhe, Yuan, Geng, Liao, Siyu, Ma, Xiaolong, Yuan, Bo, Qian, Xuehai, Tang, Jian, Qiu, Qinru, Lin, Xue
Hardware accelerations of deep learning systems have been extensively investigated in industry and academia. The aim of this paper is to achieve ultra-high energy efficiency and performance for hardware implementations of deep neural networks (DNNs). An algorithm-hardware co-optimization framework is developed, which is applicable to different DNN types, sizes, and application scenarios. The algorithm part adopts general block-circulant matrices to achieve a fine-grained tradeoff between accuracy and compression ratio. It applies to both fully-connected and convolutional layers and contains a mathematically rigorous proof of the effectiveness of the method. The proposed algorithm reduces the computational complexity per layer from O($n^2$) to O($n\log n$) and the storage complexity from O($n^2$) to O($n$), for both training and inference. The hardware part consists of highly efficient Field Programmable Gate Array (FPGA)-based implementations using effective reconfiguration, batch processing, deep pipelining, resource reuse, and hierarchical control. Experimental results demonstrate that the proposed framework achieves at least 152X speedup and 71X energy efficiency gain compared with the IBM TrueNorth processor under the same test accuracy. It achieves at least 31X energy efficiency gain compared with the reference FPGA-based work.
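The complexity reduction rests on the fact that a circulant block is defined by a single vector and acts on its input via FFT-based circular convolution. A small numpy sketch with an illustrative block size (not the paper's FPGA implementation) follows.

```python
import numpy as np

# A circulant block is fully determined by its defining vector c (O(n) storage),
# and its product with x is a circular convolution computed in O(n log n) via FFT.
def circulant_matvec(c, x):
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

n = 8                                              # illustrative block size
c = np.random.randn(n)                             # defining vector
x = np.random.randn(n)
# Dense reference: the full n x n circulant matrix, O(n^2) storage and compute.
C = np.stack([np.roll(c, j) for j in range(n)], axis=1)
print(np.allclose(C @ x, circulant_matvec(c, x)))  # True
```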
CirCNN: Accelerating and Compressing Deep Neural Networks Using Block-Circulant Weight Matrices
Ding, Caiwen, Liao, Siyu, Wang, Yanzhi, Li, Zhe, Liu, Ning, Zhuo, Youwei, Wang, Chao, Qian, Xuehai, Bai, Yu, Yuan, Geng, Ma, Xiaolong, Zhang, Yipeng, Tang, Jian, Qiu, Qinru, Lin, Xue, Yuan, Bo
Large-scale deep neural networks (DNNs) are both compute-intensive and memory-intensive. As the size of DNNs continues to grow, it is critical to improve their energy efficiency and performance while maintaining accuracy. For DNNs, the model size is an important factor affecting performance, scalability, and energy efficiency. Weight pruning achieves good compression ratios but suffers from three drawbacks: 1) the irregular network structure after pruning; 2) the increased training complexity; and 3) the lack of a rigorous guarantee on compression ratio and inference accuracy. To overcome these limitations, this paper proposes CirCNN, a principled approach to representing weights and processing neural networks using block-circulant matrices. CirCNN utilizes Fast Fourier Transform (FFT)-based fast multiplication, simultaneously reducing the computational complexity (in both inference and training) from O($n^2$) to O($n\log n$) and the storage complexity from O($n^2$) to O($n$), with negligible accuracy loss. Compared to other approaches, CirCNN is distinct due to its mathematical rigor: it can converge to the same effectiveness as DNNs without compression. The CirCNN architecture is a universal DNN inference engine that can be implemented on various hardware/software platforms with a configurable network architecture. To demonstrate the performance and energy efficiency, we test CirCNN on FPGA, ASIC, and embedded processors. Our results show that the CirCNN architecture achieves very high energy efficiency and performance with a small hardware footprint. Based on the FPGA implementation and ASIC synthesis results, CirCNN achieves 6-102X energy efficiency improvements compared with the best state-of-the-art results.
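To illustrate the storage and compute savings at the layer level, here is a hypothetical block-circulant fully connected forward pass in numpy (block size, layer dimensions, and the function name are assumptions for the example; CirCNN's actual hardware pipeline is not modeled): a (p*k) x (q*k) weight matrix is stored as p*q defining vectors of length k, and each block acts on its input slice through FFT-based circular convolution.

```python
import numpy as np

# Hypothetical block-circulant fully connected layer: weights stored as
# p*q defining vectors (O(n) storage), applied via FFT (O(n log n) compute).
def block_circulant_fc(blocks, x, k):
    """blocks: array (p, q, k) of defining vectors; x: input of length q*k."""
    p, q, _ = blocks.shape
    x = x.reshape(q, k)
    xf = np.fft.fft(x, axis=1)                   # FFT of each input slice once
    out = np.zeros((p, k), dtype=complex)
    for i in range(p):
        # Sum the circular convolutions of row i's blocks with the input slices.
        out[i] = np.sum(np.fft.fft(blocks[i], axis=1) * xf, axis=0)
    return np.real(np.fft.ifft(out, axis=1)).reshape(p * k)

blocks = np.random.randn(4, 8, 16)               # 64x128 weights as 32 vectors
y = block_circulant_fc(blocks, np.random.randn(128), k=16)
print(y.shape)                                   # (64,)
```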