Oates, Tim
cuSLINK: Single-linkage Agglomerative Clustering on the GPU
Nolet, Corey J., Gala, Divye, Fender, Alex, Doijade, Mahesh, Eaton, Joe, Raff, Edward, Zedlewski, John, Rees, Brad, Oates, Tim
In this paper, we propose cuSLINK, a novel and state-of-the-art reformulation of the SLINK algorithm on the GPU which requires only $O(Nk)$ space and uses a parameter $k$ to trade off space and time. We also propose a set of novel and reusable building blocks that compose cuSLINK. These building blocks include highly optimized computational patterns for $k$-NN graph construction, spanning trees, and dendrogram cluster extraction. We show how we used our primitives to implement cuSLINK end-to-end on the GPU, further enabling a wide range of real-world data mining and machine learning applications that were once intractable. In addition to being a primary computational bottleneck in the popular HDBSCAN algorithm, the impact of our end-to-end cuSLINK algorithm spans a large range of important applications, including cluster analysis in social and computer networks, natural language processing, and computer vision. Users can obtain cuSLINK at https://docs.rapids.ai/api/cuml/latest/api/#agglomerative-clustering
Recasting Self-Attention with Holographic Reduced Representations
Alam, Mohammad Mahmudul, Raff, Edward, Biderman, Stella, Oates, Tim, Holt, James
In recent years, self-attention has become the dominant paradigm for sequence modeling in a variety of domains. However, in domains with very long sequence lengths the $\mathcal{O}(T^2)$ memory and $\mathcal{O}(T^2 H)$ compute costs can make using transformers infeasible. Motivated by problems in malware detection, where sequence lengths of $T \geq 100,000$ are a roadblock to deep learning, we re-cast self-attention using the neuro-symbolic approach of Holographic Reduced Representations (HRR). In doing so we perform the same high-level strategy of the standard self-attention: a set of queries matching against a set of keys, and returning a weighted response of the values for each key. Implemented as a ``Hrrformer'' we obtain several benefits including $\mathcal{O}(T H \log H)$ time complexity, $\mathcal{O}(T H)$ space complexity, and convergence in $10\times$ fewer epochs. Nevertheless, the Hrrformer achieves near state-of-the-art accuracy on LRA benchmarks and we are able to learn with just a single layer. Combined, these benefits make our Hrrformer the first viable Transformer for such long malware classification sequences and up to $280\times$ faster to train on the Long Range Arena benchmark. Code is available at \url{https://github.com/NeuromorphicComputationResearchProgram/Hrrformer}
RFC-Net: Learning High Resolution Global Features for Medical Image Segmentation on a Computational Budget
Saha, Sourajit, Saha, Shaswati, Gani, Md Osman, Oates, Tim, Chapman, David
Learning High-Resolution representations is essential for semantic segmentation. Convolutional neural network (CNN)architectures with downstream and upstream propagation flow are popular for segmentation in medical diagnosis. However, due to performing spatial downsampling and upsampling in multiple stages, information loss is inexorable. On the contrary, connecting layers densely on high spatial resolution is computationally expensive. In this work, we devise a Loose Dense Connection Strategy to connect neurons in subsequent layers with reduced parameters. On top of that, using a m-way Tree structure for feature propagation we propose Receptive Field Chain Network (RFC-Net) that learns high resolution global features on a compressed computational space. Our experiments demonstrates that RFC-Net achieves state-of-the-art performance on Kvasir and CVC-ClinicDB benchmarks for Polyp segmentation.
Backdoor Attack Detection in Computer Vision by Applying Matrix Factorization on the Weights of Deep Networks
Hossain, Khondoker Murad, Oates, Tim
The increasing importance of both deep neural networks (DNNs) and cloud services for training them means that bad actors have more incentive and opportunity to insert backdoors to alter the behavior of trained models. In this paper, we introduce a novel method for backdoor detection that extracts features from pre-trained DNN's weights using independent vector analysis (IVA) followed by a machine learning classifier. In comparison to other detection techniques, this has a number of benefits, such as not requiring any training data, being applicable across domains, operating with a wide range of network architectures, not assuming the nature of the triggers used to change network behavior, and being highly scalable. We discuss the detection pipeline, and then demonstrate the results on two computer vision datasets regarding image classification and object detection. Our method outperforms the competing algorithms in terms of efficiency and is more accurate, helping to ensure the safe application of deep learning and AI.
Lempel-Ziv Networks
Saul, Rebecca, Alam, Mohammad Mahmudul, Hurwitz, John, Raff, Edward, Oates, Tim, Holt, James
Sequence processing has long been a central area of machine learning research. Recurrent neural nets have been successful in processing sequences for a number of tasks; however, they are known to be both ineffective and computationally expensive when applied to very long sequences. Compression-based methods have demonstrated more robustness when processing such sequences -- in particular, an approach pairing the Lempel-Ziv Jaccard Distance (LZJD) with the k-Nearest Neighbor algorithm has shown promise on long sequence problems (up to T = 200, 000, 000 steps) involving malware classification. Unfortunately, use of LZJD is limited to discrete domains. To extend the benefits of LZJD to a continuous domain, we investigate the effectiveness of a deep-learning analog of the algorithm, the Lempel-Ziv Network. While we achieve successful proof of concept, we are unable to improve meaningfully on the performance of a standard LSTM across a variety of datasets and sequence processing tasks. In addition to presenting this negative result, our work highlights the problem of sub-par baseline tuning in newer research areas.
Deploying Convolutional Networks on Untrusted Platforms Using 2D Holographic Reduced Representations
Alam, Mohammad Mahmudul, Raff, Edward, Oates, Tim, Holt, James
Due to the computational cost of running inference for a neural network, the need to deploy the inferential steps on a third party's compute environment or hardware is common. If the third party is not fully trusted, it is desirable to obfuscate the nature of the inputs and outputs, so that the third party can not easily determine what specific task is being performed. Provably secure protocols for leveraging an untrusted party exist but are too computational demanding to run in practice. We instead explore a different strategy of fast, heuristic security that we call Connectionist Symbolic Pseudo Secrets. By leveraging Holographic Reduced Representations (HRR), we create a neural network with a pseudo-encryption style defense that empirically shows robustness to attack, even under threat models that unrealistically favor the adversary.
Interactive Hierarchical Guidance using Language
Prakash, Bharat, Waytowich, Nicholas, Oates, Tim, Mohsenin, Tinoosh
Reinforcement learning has been successful in many tasks ranging from robotic control, games, energy management etc. In complex real world environments with sparse rewards and long task horizons, sample efficiency is still a major challenge. Most complex tasks can be easily decomposed into high-level planning and low level control. Therefore, it is important to enable agents to leverage the hierarchical structure and decompose bigger tasks into multiple smaller sub-tasks. We introduce an approach where we use language to specify sub-tasks and a high-level planner issues language commands to a low level controller. The low-level controller executes the sub-tasks based on the language commands. Our experiments show that this method is able to solve complex long horizon planning tasks with limited human supervision. Using language has added benefit of interpretability and ability for expert humans to take over the high-level planning task and provide language commands if necessary.
Learning with Holographic Reduced Representations
Ganesan, Ashwinkumar, Gao, Hang, Gandhi, Sunil, Raff, Edward, Oates, Tim, Holt, James, McLean, Mark
Holographic Reduced Representations (HRR) are a method for performing symbolic AI on top of real-valued vectors \cite{Plate1995} by associating each vector with an abstract concept, and providing mathematical operations to manipulate vectors as if they were classic symbolic objects. This method has seen little use outside of older symbolic AI work and cognitive science. Our goal is to revisit this approach to understand if it is viable for enabling a hybrid neural-symbolic approach to learning as a differentiable component of a deep learning architecture. HRRs today are not effective in a differentiable solution due to numerical instability, a problem we solve by introducing a projection step that forces the vectors to exist in a well behaved point in space. In doing so we improve the concept retrieval efficacy of HRRs by over $100\times$. Using multi-label classification we demonstrate how to leverage the symbolic HRR properties to develop an output layer and loss function that is able to learn effectively, and allows us to investigate some of the pros and cons of an HRR neuro-symbolic learning approach.
Learning a Reversible Embedding Mapping using Bi-Directional Manifold Alignment
Ganesan, Ashwinkumar, Ferraro, Francis, Oates, Tim
We propose a Bi-Directional Manifold Alignment (BDMA) that learns a non-linear mapping between two manifolds by explicitly training it to be bijective. We demonstrate BDMA by training a model for a pair of languages rather than individual, directed source and target combinations, reducing the number of models by 50%. We show that models trained with BDMA in the "forward" (source to target) direction can successfully map words in the "reverse" (target to source) direction, yielding equivalent (or better) performance to standard unidirectional translation models where the source and target language is flipped. We also show how BDMA reduces the overall size of the model.
Immigration Document Classification and Automated Response Generation
Mukherjee, Sourav, Oates, Tim, DiMascio, Vince, Jean, Huguens, Ares, Rob, Widmark, David, Harder, Jaclyn
In this paper, we consider the problem of organizing supporting documents vital to U.S. work visa petitions, as well as responding to Requests For Evidence (RFE) issued by the U.S.~Citizenship and Immigration Services (USCIS). Typically, both processes require a significant amount of repetitive manual effort. To reduce the burden of mechanical work, we apply machine learning methods to automate these processes, with humans in the loop to review and edit output for submission. In particular, we use an ensemble of image and text classifiers to categorize supporting documents. We also use a text classifier to automatically identify the types of evidence being requested in an RFE, and used the identified types in conjunction with response templates and extracted fields to assemble draft responses. Empirical results suggest that our approach achieves considerable accuracy while significantly reducing processing time.