rnp
5ad742cd15633b26fdce1b80f7b39f7c-AuthorFeedback.pdf
We thank all the reviewers. Comments 1 & 4: RNP is the baseline proposed by Lei et al. in "Rationalizing Neural Predictions". In fact, even the original RNP suffers from the degeneration problem, which primarily results from the collaborative nature of the RNP framework. This is another major advantage of CAR, which we did not have enough space to discuss in the paper.
Rényi Neural Processes
Wang, Xuesong, Zhao, He, Bonilla, Edwin V.
Neural Processes (NPs) are variational frameworks that aim to represent stochastic processes with deep neural networks. Despite their clear benefits in uncertainty estimation for complex distributions via data-driven priors, NPs enforce network parameter sharing between the conditional prior and posterior distributions, thereby risking the introduction of a misspecified prior. We propose Rényi Neural Processes (RNP) to relax the influence of the misspecified prior and optimize a tighter bound of the marginal likelihood. More specifically, by replacing the standard KL divergence with the Rényi divergence between the posterior and the approximated prior, we ameliorate the impact of the misspecified prior via a parameter α, so that the resulting posterior focuses more on tail samples and reduces density in overconfident regions. Our experiments show log-likelihood improvements on several existing NP families. We demonstrate the superior performance of our approach on various benchmarks including regression and image inpainting tasks, and we also validate the effectiveness of RNPs on real-world tabular regression problems.
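The core change RNPs make is in the regularizer: the KL term of the NP objective is swapped for a Rényi divergence of order α. As a point of reference, here is a minimal NumPy sketch of that divergence in the diagonal-Gaussian case, using the standard closed form for Gaussians; the function name and example values are illustrative, not the authors' code.

```python
# Rényi divergence of order alpha between two diagonal Gaussians
# q (posterior) and p (approximated prior), standard closed form.
import numpy as np

def renyi_divergence_diag_gauss(mu_q, var_q, mu_p, var_p, alpha=0.7):
    """D_alpha(q || p) for diagonal Gaussians, alpha != 1.

    Requires alpha*var_p + (1-alpha)*var_q > 0 elementwise,
    which always holds for 0 < alpha < 1.
    """
    var_a = alpha * var_p + (1.0 - alpha) * var_q   # interpolated variance
    quad = alpha * (mu_q - mu_p) ** 2 / (2.0 * var_a)
    log_term = (np.log(var_a)
                - (1.0 - alpha) * np.log(var_q)
                - alpha * np.log(var_p)) / (2.0 * (alpha - 1.0))
    return float(np.sum(quad - log_term))

# As alpha -> 1 this recovers the usual KL(q || p); alpha < 1 down-weights
# regions where a possibly misspecified prior is overconfident.
mu_q, var_q = np.zeros(2), np.ones(2)
mu_p, var_p = np.array([0.5, -0.5]), np.array([0.2, 0.2])
print(renyi_divergence_diag_gauss(mu_q, var_q, mu_p, var_p, alpha=0.7))
```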
Reconstructive Neuron Pruning for Backdoor Defense
Li, Yige, Lyu, Xixiang, Ma, Xingjun, Koren, Nodens, Lyu, Lingjuan, Li, Bo, Jiang, Yu-Gang
Deep neural networks (DNNs) have been found to be vulnerable to backdoor attacks, raising security concerns about their deployment in mission-critical applications. While existing defense methods have demonstrated promising results, it is still not clear how to effectively remove backdoor-associated neurons in backdoored DNNs. In this paper, we propose a novel defense called Reconstructive Neuron Pruning (RNP) to expose and prune backdoor neurons via an unlearning and then recovering process. Specifically, RNP first unlearns the neurons by maximizing the model's error on a small subset of clean samples and then recovers the neurons by minimizing the model's error on the same data. In RNP, unlearning operates at the neuron level while recovering operates at the filter level, forming an asymmetric reconstructive learning procedure. We show that such an asymmetric process on only a few clean samples can effectively expose and prune the backdoor neurons implanted by a wide range of attacks, achieving a new state-of-the-art defense performance. Moreover, the unlearned model at the intermediate step of our RNP can be directly used to improve other backdoor defense tasks including backdoor removal, trigger recovery, backdoor label detection, and backdoor sample detection. Code is available at https://github.com/bboylyg/RNP.
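To make the unlearn-then-recover procedure concrete, here is a toy PyTorch sketch under stated assumptions: phase one performs gradient ascent on a small clean batch, phase two freezes the weights and learns a per-channel mask by descent, and channels whose mask collapses toward zero are candidate backdoor neurons. This only illustrates the asymmetric two-step idea; the authors' actual implementation, which unlearns at the neuron level and recovers at the filter level, is at https://github.com/bboylyg/RNP.

```python
# Toy unlearn-then-recover loop on a generic classifier (illustrative only).
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10))
clean_x, clean_y = torch.randn(32, 3, 32, 32), torch.randint(0, 10, (32,))
loss_fn = nn.CrossEntropyLoss()

# Phase 1: unlearn, i.e. gradient *ascent* on the clean loss.
opt = torch.optim.SGD(model.parameters(), lr=0.01)
for _ in range(20):
    opt.zero_grad()
    (-loss_fn(model(clean_x), clean_y)).backward()  # negate to maximize
    opt.step()

# Phase 2: recover, i.e. freeze weights and learn a per-channel mask.
for p in model.parameters():
    p.requires_grad_(False)
mask = nn.Parameter(torch.ones(8))                  # one gate per channel
opt = torch.optim.SGD([mask], lr=0.1)
conv, head = model[0], model[1:]
for _ in range(50):
    opt.zero_grad()
    feats = conv(clean_x) * mask.view(1, -1, 1, 1)
    loss_fn(head(feats), clean_y).backward()
    opt.step()

# Channels the recovery refuses to restore are pruning candidates.
print("suspect channels:", (mask.detach() < 0.1).nonzero().flatten().tolist())
```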
MGR: Multi-generator Based Rationalization
Liu, Wei, Wang, Haozhao, Wang, Jun, Li, Ruixuan, Li, Xinyang, Zhang, Yuankai, Qiu, Yang
Rationalization employs a generator and a predictor to construct a self-explaining NLP model, in which the generator selects a subset of human-intelligible pieces of the input text and passes it to the predictor. However, rationalization suffers from two key challenges, spurious correlation and degeneration, where the predictor overfits to the spurious or meaningless pieces selected by the not-yet-well-trained generator and in turn deteriorates the generator. Although many methods have been proposed to address the two challenges, they are usually designed separately and do not take both of them into account. In this paper, we propose a simple yet effective method named MGR to solve the two problems simultaneously. The key idea of MGR is to employ multiple generators such that the occurrence stability of real pieces is improved and more meaningful pieces are delivered to the predictor. Empirically, we show that MGR improves the F1 score by up to 20.9% compared to state-of-the-art methods. Code is available at https://github.com/jugechengzi/Rationalization-MGR.
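A toy sketch of the multi-generator idea may help: several independently initialized generators each produce a token mask, and the predictor consumes every masked view, so pieces selected consistently across generators dominate training. The modules, shapes, and straight-through selection below are illustrative assumptions, not the released code.

```python
# Multiple generators feeding one predictor (illustrative sketch).
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.scorer = nn.Linear(dim, 1)
    def forward(self, x):                        # x: (batch, seq, dim)
        # Straight-through hard selection so gradients still flow.
        probs = torch.sigmoid(self.scorer(x))    # (batch, seq, 1)
        hard = (probs > 0.5).float()
        return hard + probs - probs.detach()     # hard forward, soft backward

dim, n_gen = 16, 3
gens = nn.ModuleList(Generator(dim) for _ in range(n_gen))
predictor = nn.Linear(dim, 2)

x = torch.randn(4, 10, dim)
# Each generator masks the input; the predictor sees every masked view.
logits = torch.stack([predictor((g(x) * x).mean(dim=1)) for g in gens])
loss = logits.mean()   # placeholder for the task loss averaged over generators
loss.backward()
```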
FR: Folded Rationalization with a Unified Encoder
Liu, Wei, Wang, Haozhao, Wang, Jun, Li, Ruixuan, Yue, Chao, Zhang, Yuankai
Conventional rationalization works generally employ a two-phase model in which a generator selects the most important pieces of the input text, followed by a predictor that makes predictions based on the selected pieces. However, such a two-phase model may incur the degeneration problem, where the predictor overfits to the noise generated by a not-yet-well-trained generator and in turn leads the generator to converge to a sub-optimal model that tends to select senseless pieces. To tackle this challenge, we propose Folded Rationalization (FR), which folds the two phases of the rationale model into one from the perspective of text semantic extraction. The key idea of FR is to employ a unified encoder between the generator and predictor, through which FR facilitates a better predictor by granting access to valuable information blocked by the generator in the traditional two-phase model, and thus yields a better generator. Empirically, we show that FR improves the F1 score by up to 10.3% compared to state-of-the-art methods. Our code is available at https://github.com/jugechengzi/FR.
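The unified-encoder idea can be sketched in a few lines: one shared encoder serves both the generator's token scorer and the predictor's classifier, rather than each phase training its own encoder. The module choices below are illustrative assumptions, not the released implementation.

```python
# Generator and predictor heads sharing one encoder (illustrative sketch).
import torch
import torch.nn as nn

dim = 16
encoder = nn.GRU(dim, dim, batch_first=True)    # shared by both phases
gen_head = nn.Linear(dim, 1)                    # scores tokens for selection
pred_head = nn.Linear(dim, 2)                   # classifies the rationale

x = torch.randn(4, 10, dim)
h, _ = encoder(x)                               # one shared encoding pass
probs = torch.sigmoid(gen_head(h))
mask = (probs > 0.5).float() + probs - probs.detach()  # straight-through
h_sel, _ = encoder(mask * x)                    # re-encode selected tokens
logits = pred_head(h_sel.mean(dim=1))
```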
Recurrent Neural Processes
Willi, Timon, Masci, Jonathan, Schmidhuber, Jürgen, Osendorfer, Christian
We extend Neural Processes (NPs) to sequential data through Recurrent NPs or RNPs, a family of conditional state space models. RNPs can learn dynamical patterns from sequential data and deal with non-stationarity. Given time series observed on fast real-world time scales but containing slow long-term variabilities, RNPs may derive appropriate slow latent time scales. They do so in an efficient manner by establishing conditional independence among subsequences of the time series. Our theoretically grounded framework for stochastic processes expands the applicability of NPs while retaining their benefits of flexibility, uncertainty estimation and favourable runtime with respect to Gaussian Processes. We demonstrate that state spaces learned by RNPs benefit predictive performance on real-world time-series data and nonlinear system identification, even in the case of limited data availability.
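One way to picture the conditional-independence structure, as a rough sketch rather than the paper's architecture: summarize each subsequence independently with a fast encoder, and let a slow recurrent latent update only once per subsequence, capturing the long-term variability.

```python
# Fast per-subsequence summaries feeding a slow latent (illustrative sketch).
import torch
import torch.nn as nn

fast_enc = nn.GRU(1, 16, batch_first=True)   # summarizes one subsequence
slow_rnn = nn.GRUCell(16, 16)                # slow latent, one step per chunk

series = torch.randn(1, 120, 1)              # (batch, time, features)
chunks = series.split(20, dim=1)             # 6 subsequences of length 20

slow_h = torch.zeros(1, 16)
for chunk in chunks:
    _, fast_h = fast_enc(chunk)              # fast summary of this chunk
    slow_h = slow_rnn(fast_h[-1], slow_h)    # slow state updates per chunk
```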
Runtime Neural Pruning
Lin, Ji, Rao, Yongming, Lu, Jiwen, Zhou, Jie
In this paper, we propose a Runtime Neural Pruning (RNP) framework which prunes a deep neural network dynamically at runtime. Unlike existing neural pruning methods, which produce a fixed pruned model for deployment, our method preserves the full ability of the original network and conducts pruning adaptively according to the input image and current feature maps. The pruning is performed in a bottom-up, layer-by-layer manner, which we model as a Markov decision process and train with reinforcement learning. The agent judges the importance of each convolutional kernel and conducts channel-wise pruning conditioned on different samples, pruning the network more aggressively when the image is easier for the task. Since the ability of the network is fully preserved, the balance point is easily adjustable according to the available resources. Our method can be applied to off-the-shelf network structures and reaches a better tradeoff between speed and accuracy, especially at large pruning rates.
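A minimal sketch of the runtime decision may help: a small agent network reads the pooled current feature maps and picks a keep ratio for the next convolution, so easier inputs execute a cheaper sub-network. The greedy policy and masking below are stand-ins; the paper trains the agent with reinforcement learning as a Markov decision process, and a real implementation would compute only the kept channels rather than zeroing the rest.

```python
# Per-sample channel pruning driven by a decision agent (illustrative sketch).
import torch
import torch.nn as nn

conv = nn.Conv2d(16, 32, 3, padding=1)
agent = nn.Linear(16, 4)                    # 4 actions = 4 keep ratios
ratios = [0.25, 0.5, 0.75, 1.0]

x = torch.randn(1, 16, 14, 14)
state = x.mean(dim=(2, 3))                  # pooled feature maps as the state
action = agent(state).argmax(dim=1).item()  # greedy action at inference
k = int(32 * ratios[action])                # channels kept for this input

mask = torch.zeros(1, 32, 1, 1)
mask[:, :k] = 1                             # keep the first k channels
out = conv(x) * mask                        # stand-in for skipping their compute
```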
Multi-Way Number Partitioning
Korf, Richard Earl (University of California, Los Angeles)
The number partitioning problem is to divide a given set of integers into a collection of subsets so that the sums of the numbers in the subsets are as nearly equal as possible. While a very efficient algorithm exists for optimal two-way partitioning, it is not nearly as effective for multi-way partitioning. We develop two new linear-space algorithms for multi-way partitioning and demonstrate their performance on three-, four-, and five-way partitioning. In each case, our algorithms outperform the previous state of the art by orders of magnitude, in one case by over six orders of magnitude. Empirical analysis of the running times of our algorithms strongly suggests that their asymptotic growth is less than that of previous algorithms. The key insight behind both of our new algorithms is that if an optimal k-way partition includes a particular subset, then optimally partitioning the numbers not in that subset k-1 ways yields an optimal k-way partition.
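The insight lends itself to a simple recursion, sketched below in Python with the makespan objective (minimize the largest subset sum) as one common reading of "as nearly equal as possible". This brute-force version only demonstrates the recursion; Korf's algorithms search the candidate first subsets far more cleverly, with bounds and in linear space.

```python
# Recursive k-way partitioning: fix a first subset, then optimally
# partition the remaining numbers k-1 ways (brute-force illustration).
from itertools import combinations

def partition(nums, k):
    """Exact k-way partition of nums (len(nums) >= k) minimizing the
    largest subset sum."""
    if k == 1:
        return [list(nums)]
    best, best_score = None, float("inf")
    rest = range(1, len(nums))         # element 0 is pinned to the first
    for r in range(len(nums)):         # subset, avoiding symmetric duplicates
        for combo in combinations(rest, r):
            chosen = {0, *combo}
            if len(nums) - len(chosen) < k - 1:
                continue               # not enough numbers left for k-1 parts
            subset = [nums[i] for i in chosen]
            others = [nums[i] for i in range(len(nums)) if i not in chosen]
            sub = partition(others, k - 1)
            score = max(sum(subset), max(sum(s) for s in sub))
            if score < best_score:
                best, best_score = [subset] + sub, score
    return best

print(partition([8, 7, 6, 5, 4], 3))   # [[8], [7, 4], [6, 5]], max sum 11
```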