As the will to deploy neural networks models on embedded systems grows, and considering the related memory footprint and energy consumption issues, finding lighter solutions to store neural networks such as weight quantization and more efficient inference methods become major research topics. Parallel to that, adversarial machine learning has risen recently with an impressive and significant attention, unveiling some critical flaws of machine learning models, especially neural networks. In particular, perturbed inputs called adversarial examples have been shown to fool a model into making incorrect predictions. In this article, we investigate the adversarial robustness of quantized neural networks under different threat models for a classical supervised image classification task. We show that quantization does not offer any robust protection, results in severe form of gradient masking and advance some hypotheses to explain it. However, we experimentally observe poor transferability capacities which we explain by quantization value shift phenomenon and gradient misalignment and explore how these results can be exploited with an ensemble-based defense.
We propose a framework to augment this collaborative model-building with per-user domain adaptation. We show that this technique improves model accuracy for all users, using both real and synthetic data, and that this improvement is much more pronounced when differential privacy bounds are imposed on the FL model. In FL, multiple parties wish to perform essentially the same task using ML, with a model structure that is agreed upon in advance. Although the initial focus of FL has been on targeting millions of mobile devices (5), the benefits of its architecture are beneficial even for enterprise settings: the number of users of an ML service may be much smaller, but privacy concerns are paramount. Each user wants the best possible classifier for their individual use, but has a limited budget for labeling their own data.
Deep Learning based models are currently dominating most state-of-the-art solutions for disease prediction. Existing works employ RNNs along with multiple levels of attention mechanisms to provide interpretability. These deep learning models, with trainable parameters running into millions, require huge amounts of compute and data to train and deploy. These requirements are sometimes so huge that they render usage of such models as unfeasible. We address these challenges by developing a simpler yet interpretable non-deep learning based model for application to EHR data. We model and showcase our work's results on the task of predicting first occurrence of a diagnosis, often overlooked in existing works. We push the capabilities of a tree based model and come up with a strong baseline for more sophisticated models. Its performance shows an improvement over deep learning based solutions (both, with and without the first-occurrence constraint) all the while maintaining interpretability.
Low bit-width integer weights and activations are very important for efficient inference, especially with respect to lower power consumption. We propose Monte Carlo methods to quantize the weights and activations of pre-trained neural networks without any re-training. By performing importance sampling we obtain quantized low bit-width integer values from full-precision weights and activations. The precision, sparsity, and complexity are easily configurable by the amount of sampling performed. Our approach, called Monte Carlo Quantization (MCQ), is linear in both time and space, with the resulting quantized, sparse networks showing minimal accuracy loss when compared to the original full-precision networks. Our method either outperforms or achieves competitive results on multiple benchmarks compared to previous quantization methods that do require additional training.
We consider the problem of designing an artificial agent capable of interacting with humans in collaborative dialogue to produce creative, engaging narratives. In this task, the goal is to establish universe details, and to collaborate on an interesting story in that universe, through a series of natural dialogue exchanges. Our model can augment any probabilistic conversational agent by allowing it to reason about universe information established and what potential next utterances might reveal. Ideally, with each utterance, agents would reveal just enough information to add specificity and reduce ambiguity without limiting the conversation. We empirically show that our model allows control over the rate at which the agent reveals information and that doing so significantly improves accuracy in predicting the next line of dialogues from movies. We close with a case-study with four professional theatre performers, who preferred interactions with our model-augmented agent over an unaugmented agent.