 dense neural network


Dense Neural Networks are not Universal Approximators

arXiv.org Machine Learning

We investigate the approximation capabilities of dense neural networks. While universal approximation theorems establish that sufficiently large architectures can approximate arbitrary continuous functions if there are no restrictions on the weight values, we show that dense neural networks do not possess this universality. Our argument is based on a model compression approach, combining the weak regularity lemma with an interpretation of feedforward networks as message passing graph neural networks. We consider ReLU neural networks subject to natural constraints on weights and input and output dimensions, which model a notion of dense connectivity. Within this setting, we demonstrate the existence of Lipschitz continuous functions that cannot be approximated by such networks. This highlights intrinsic limitations of neural networks with dense layers and motivates the use of sparse connectivity as a necessary ingredient for achieving true universality.


Optimization of the quantization of dense neural networks from an exact QUBO formulation

arXiv.org Artificial Intelligence

This work introduces a post-training quantization (PTQ) method for dense neural networks via a novel ADAROUND-based QUBO formulation. Using the Frobenius distance between the theoretical output and the dequantized output (before the activation function) as the objective, an explicit QUBO whose binary variables represent the rounding choice for each weight and bias is obtained. Additionally, by exploiting the structure of the coefficient QUBO matrix, the global problem can be exactly decomposed into $n$ independent subproblems of size $f+1$, which can be efficiently solved using some heuristics such as simulated annealing. The approach is evaluated on MNIST, Fashion-MNIST, EMNIST, and CIFAR-10 across integer precisions from int8 to int1 and compared with a round-to-nearest traditional quantization methodology.
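As a rough illustration of how a rounding choice can be written as a QUBO, the sketch below builds the problem for a single output neuron (its $f$ weights plus one bias, hence $f+1$ binary variables). Everything here is an assumption made for illustration, not the paper's exact formulation: the function names, the uniform scale, the use of the squared error summed over a few calibration inputs, and the brute-force search that stands in for a heuristic such as simulated annealing.

```python
from itertools import product
from math import floor


def build_qubo(weights, bias, inputs, scale):
    """Return (Q, const) so that x^T Q x + const equals the squared
    pre-activation error when parameter i is rounded down (x_i = 0)
    or up (x_i = 1) on the quantization grid with step `scale`."""
    params = list(weights) + [bias]
    rows = [list(sample) + [1.0] for sample in inputs]  # append 1 for the bias
    size = len(params)
    # Error of each parameter when it is rounded down.
    down_err = [scale * floor(p / scale) - p for p in params]
    # Output error per sample when every parameter is rounded down.
    base = [sum(a[i] * down_err[i] for i in range(size)) for a in rows]
    q = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            q[i][j] = scale ** 2 * sum(a[i] * a[j] for a in rows)
        # Linear terms sit on the diagonal because x_i * x_i == x_i.
        q[i][i] += 2.0 * scale * sum(a[i] * b for a, b in zip(rows, base))
    const = sum(b * b for b in base)
    return q, const


def qubo_energy(q, x):
    n = len(x)
    return sum(q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def direct_error(weights, bias, inputs, scale, x):
    """Squared output error computed directly, used to check the QUBO."""
    params = list(weights) + [bias]
    deq = [scale * (floor(p / scale) + xi) for p, xi in zip(params, x)]
    total = 0.0
    for sample in inputs:
        a = list(sample) + [1.0]
        total += sum(ai * (dq - p) for ai, dq, p in zip(a, deq, params)) ** 2
    return total


def solve_brute_force(q):
    """Exhaustive search; only feasible for very small subproblems."""
    return min(product((0, 1), repeat=len(q)), key=lambda x: qubo_energy(q, x))


# Hypothetical toy neuron with f = 3 inputs and three calibration samples.
weights, bias, scale = [0.37, -0.52, 0.11], 0.08, 0.25
inputs = [[1.0, 0.5, -1.0], [0.2, -0.3, 0.8], [-0.6, 0.9, 0.4]]
q, const = build_qubo(weights, bias, inputs, scale)
best = solve_brute_force(q)
```

Because each neuron's subproblem involves only its own weights and bias, a layer with $n$ neurons could in principle be handled by calling `build_qubo` $n$ times, which is the kind of decomposition the abstract describes.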


DominoSearch: Find layer-wise fine-grained N: M sparse schemes from dense neural networks - Supplementary Material

Neural Information Processing Systems

Section 2: Experimental study of a different policy with fixed N and flexible M. Section 3: Sensitivity of hyper-parameter β. In the main paper, we assume a policy with fixed M and flexible N. Furthermore, we also use a design space with N equal to a power-of-two. This is achieved by transforming the schemes of fixed M. For instance, 8:16, 4:16, 2:16 and 1:16 will be transformed as 1:2, 1:4, 1:8 and 1:16 with fixed N (1) and flexible M (2,4,8,16). Results are shown in Table 3. Figure 1 and 2 illustrate the differences between 1:2 and 2:4 with the same dense weight matrix and sparsity (i.e. Details can be found in Section 3.4 of the main paper. It consists of more than 1.2 million training images and Each image is labelled as one of 1K classes.
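To make the N:M notation concrete, here is a minimal sketch that keeps the N largest-magnitude weights in every group of M consecutive weights and reduces a fixed-M scheme to its fixed-N form. The helper names `reduce_scheme` and `nm_prune`, the magnitude-based selection, and the toy weight row are all assumptions for illustration, not the DominoSearch procedure.

```python
from math import gcd


def reduce_scheme(n, m):
    """Reduce an N:M scheme to lowest terms, e.g. 8:16 becomes 1:2."""
    g = gcd(n, m)
    return n // g, m // g


def nm_prune(weights, n, m):
    """Keep the n largest-magnitude weights in each group of m, zero the rest."""
    if len(weights) % m != 0:
        raise ValueError("number of weights must be a multiple of m")
    pruned = []
    for start in range(0, len(weights), m):
        group = weights[start:start + m]
        # Indices of the n entries with the largest absolute value.
        keep = set(sorted(range(m), key=lambda i: abs(group[i]), reverse=True)[:n])
        pruned.extend(w if i in keep else 0.0 for i, w in enumerate(group))
    return pruned


# Fixed-M schemes from the text and their fixed-N equivalents.
schemes = [reduce_scheme(n, 16) for n in (8, 4, 2, 1)]

# The same dense row pruned with 1:2 and with 2:4 (both 50% sparsity).
row = [0.9, 0.8, 0.1, -0.2, 0.3, -0.7, 0.6, 0.05]
one_two = nm_prune(row, 1, 2)
two_four = nm_prune(row, 2, 4)
```

Both results zero out half of the entries, but 1:2 must drop one weight from every pair while 2:4 can keep both weights of a strong pair, so the surviving weights differ even though the sparsity is the same.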


Software defined demodulation of multiple frequency shift keying with dense neural network for weak signal communications

arXiv.org Artificial Intelligence

In this paper we present the symbol and bit error rate performance of a weak signal digital communications system. We investigate an orthogonal multiple frequency shift keying modulation scheme with a supervised machine learning demodulation approach using a simple dense end-to-end artificial neural network. We focus on the interference immunity over an additive white Gaussian noise channel with average signal-to-noise ratios from -20 dB to 0 dB.


DominoSearch: Find layer-wise fine-grained N:M sparse schemes from dense neural networks

Neural Information Processing Systems

Neural pruning is a widely-used compression technique for Deep Neural Networks (DNNs). However, the existing N:M algorithms only address the challenge of how to train N:M sparse neural networks in a uniform fashion (i.e. To tackle this problem, we present a novel technique, DominoSearch, to find mixed N:M sparsity schemes from pre-trained dense deep neural networks to achieve higher accuracy than the uniform-sparsity scheme with equivalent complexity constraints (e.g. For instance, for the same model size with 2.1M parameters (87.5% sparsity), our layer-wise N:M sparse ResNet18 outperforms its uniform counterpart by 2.1% top-1 accuracy, on the large-scale ImageNet dataset. For the same computational complexity of 227M FLOPs, our layer-wise sparse ResNet18 outperforms the uniform one by 1.3% top-1 accuracy.


Stroke Prediction using Clinical and Social Features in Machine Learning

arXiv.org Artificial Intelligence

Every year in the United States, 800,000 individuals suffer a stroke - one person every 40 seconds, with a death occurring every four minutes. While individual factors vary, certain predictors are more prevalent in determining stroke risk. As strokes are the second leading cause of death and disability worldwide, predicting stroke likelihood based on lifestyle factors is crucial. Showing individuals their stroke risk could motivate lifestyle changes, and machine learning offers solutions to this prediction challenge. Neural networks excel at predicting outcomes based on training features like lifestyle factors; however, they are not the only option. Logistic regression models can also effectively compute the likelihood of binary outcomes based on independent variables, making them well-suited for stroke prediction. This analysis will compare both neural networks (dense and convolutional) and logistic regression models for stroke prediction, examining their pros, cons, and differences to develop the most effective predictor that minimizes false negatives.


Diagnosis of Malignant Lymphoma Cancer Using Hybrid Optimized Techniques Based on Dense Neural Networks

arXiv.org Artificial Intelligence

Lymphoma diagnosis, particularly distinguishing between subtypes, is critical for effective treatment but remains challenging due to the subtle morphological differences in histopathological images. This study presents a novel hybrid deep learning framework that combines DenseNet201 for feature extraction with a Dense Neural Network (DNN) for classification, optimized using the Harris Hawks Optimization (HHO) algorithm. The model was trained on a dataset of 15,000 biopsy images, spanning three lymphoma subtypes: Chronic Lymphocytic Leukemia (CLL), Follicular Lymphoma (FL), and Mantle Cell Lymphoma (MCL). Our approach achieved a testing accuracy of 99.33%, demonstrating significant improvements in both accuracy and model interpretability. Comprehensive evaluation using precision, recall, F1-score, and ROC-AUC underscores the model's robustness and potential for clinical adoption. This framework offers a scalable solution for improving diagnostic accuracy and efficiency in oncology.


To prune or not to prune : A chaos-causality approach to principled pruning of dense neural networks

arXiv.org Artificial Intelligence

Reducing the size of a neural network (pruning) by removing weights without impacting its performance is an important problem for resource-constrained devices. In the past, pruning was typically accomplished by ranking or penalizing weights based on criteria like magnitude and removing low-ranked weights before retraining the remaining ones. Pruning strategies may also involve removing neurons from the network in order to achieve the desired reduction in network size. We formulate pruning as an optimization problem with the objective of minimizing misclassifications by selecting specific weights. To accomplish this, we introduce the concept of chaos in learning (Lyapunov exponents) via weight updates and exploit causality to identify the causal weights responsible for misclassification. Such a pruned network maintains the original performance and retains feature explainability.
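For contrast with the approach proposed in this abstract, the conventional baseline it mentions (rank weights by magnitude, remove the lowest-ranked ones) can be sketched in a few lines. This is not the chaos-causality method; the function name `magnitude_prune`, the flat weight list, and the pruning fraction are assumptions chosen for illustration, and the retraining step is omitted.

```python
def magnitude_prune(weights, fraction):
    """Zero out the given fraction of weights with the smallest magnitude."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    num_remove = int(len(weights) * fraction)
    # Rank weight indices from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    removed = set(order[:num_remove])
    return [0.0 if i in removed else w for i, w in enumerate(weights)]


# Hypothetical flattened layer weights; prune the weakest half.
layer = [0.4, -0.05, 0.9, 0.01, -0.6, 0.2]
pruned = magnitude_prune(layer, 0.5)
```

In a real pipeline the surviving weights would then be retrained, as the abstract notes.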


Application of Tensor Neural Networks to Pricing Bermudan Swaptions

arXiv.org Artificial Intelligence

The Cheyette model is a quasi-Gaussian volatility interest rate model widely used to price interest rate derivatives such as European and Bermudan Swaptions for which Monte Carlo simulation has become the industry standard. In low dimensions, these approaches provide accurate and robust prices for European Swaptions but, even in this computationally simple setting, they are known to underestimate the value of Bermudan Swaptions when using the state variables as regressors. This is mainly due to the use of a finite number of predetermined basis functions in the regression. Moreover, in high-dimensional settings, these approaches succumb to the Curse of Dimensionality. To address these issues, deep-learning techniques have been used to solve the backward Stochastic Differential Equation associated with the value process for European and Bermudan Swaptions; however, these methods are constrained by training time and memory. To overcome these limitations, we propose leveraging Tensor Neural Networks as they can provide significant parameter savings while attaining the same accuracy as classical Dense Neural Networks. In this paper we rigorously benchmark the performance of Tensor Neural Networks and Dense Neural Networks for pricing European and Bermudan Swaptions, and we show that Tensor Neural Networks can be trained faster than Dense Neural Networks and provide more accurate and robust prices than their Dense counterparts.


Convolutional versus Dense Neural Networks: Comparing the Two Neural Networks Performance in Predicting Building Operational Energy Use Based on the Building Shape

arXiv.org Artificial Intelligence

A building's self-shading shape substantially affects the amount of direct sunlight received by the building and contributes significantly to building operational energy use, in addition to other major contributing variables, such as materials and window-to-wall ratios. Deep Learning has the potential to assist designers and engineers by efficiently predicting building energy performance. This paper assesses the applicability of two different neural network structures, Dense Neural Network (DNN) and Convolutional Neural Network (CNN), for predicting building operational energy use with respect to building shape. The comparison between the two neural networks shows that the DNN model surpasses the CNN model in performance, simplicity, and computation time. However, the image-based CNN has the benefit of utilizing architectural graphics that facilitate design communication.