 Perceptrons


A Bio-Inspired Chaos Sensor Model Based on the Perceptron Neural Network: Machine Learning Concept and Application for Computational Neuro-Science

arXiv.org Artificial Intelligence

The study presents a bio-inspired chaos sensor model based on the perceptron neural network for estimating the entropy of spike trains in neurodynamic systems. After training, the perceptron-based sensor, with 50 neurons in the hidden layer and 1 neuron at the output, approximates the fuzzy entropy of a short time series with high accuracy, with a determination coefficient of R2 ~ 0.9. The Hindmarsh-Rose spike model was used to generate time series of spike intervals and the datasets for training and testing the perceptron. The hyperparameters of the perceptron model were selected, and the sensor accuracy was estimated, using the K-block cross-validation method. Even with a single neuron in the hidden layer, the model approximates the fuzzy entropy well, with R2 ~ 0.5-0.8. In a simplified model with one neuron and equal weights in the first layer, the approximation rests on a linear transformation of the average value of the time series into the entropy value. An example of applying the chaos sensor to a spike train of action potential recordings from the L5 dorsal rootlet of a rat is provided. The bio-inspired chaos sensor model based on an ensemble of neurons can dynamically track the chaotic behavior of a spike signal and transmit this information to other parts of the neurodynamic model for further processing. The study will be useful for specialists in computational neuroscience and for creating humanoid and animal robots and bio-robots with limited resources.
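The simplified one-neuron case can be illustrated with a short sketch. It assumes a purely linear neuron (no activation function), and the weight, bias, and interval values are made up for illustration; none of them come from the study. Under that assumption, a neuron with the same weight on every input computes a linear function of the series mean.

```python
# Hypothetical sketch: one linear neuron with equal first-layer weights.
# All numeric values below are illustrative, not taken from the study.

def equal_weight_neuron(series, w, b):
    """Weighted sum with the same weight w on every input, plus bias b."""
    return sum(w * x for x in series) + b


def linear_map_of_mean(series, w, b):
    """The same output rewritten as a linear function of the series mean."""
    n = len(series)
    mean = sum(series) / n
    return (w * n) * mean + b


intervals = [0.8, 1.1, 0.9, 1.4, 1.0]  # made-up spike intervals
w, b = 0.05, 0.2                        # made-up weight and bias

# Both expressions give the same value up to floating-point rounding.
print(equal_weight_neuron(intervals, w, b))
print(linear_map_of_mean(intervals, w, b))
```

The two functions agree because w·x1 + ... + w·xn = (w·n)·mean(x), so with equal weights the neuron only ever sees the series through its average.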


GRU-D-Weibull: A Novel Real-Time Individualized Endpoint Prediction

arXiv.org Artificial Intelligence

Background: Accurate risk prediction models for individual-level endpoints (e.g., death) or time-to-endpoint are highly desirable in clinical practice. Methods: We propose a novel predictive modeling approach, GRU-D-Weibull, which models a Weibull distribution using gated recurrent units with decay (GRU-D), for real-time individualized endpoint prediction and population-level risk management using electronic health records (EHRs). Experiments: We systematically evaluated the performance and showcased the clinical utility of the proposed approach through individual-level endpoint prediction using a cohort of patients with chronic kidney disease stage 4 (CKD4). A total of 536 features including ICD/CPT codes, medications, lab tests, vital measurements, and demographics were retrieved for 6,879 CKD4 patients. Performance metrics including C-index, L1-loss, Parkes' error, and predicted survival probability at time of event were compared between GRU-D-Weibull and alternative approaches including the accelerated failure time model (AFT), XGBoost(AFT), random survival forest (RSF), and Nnet-survival. Both in-process and post-process calibrations were tested on the survival probabilities generated by GRU-D-Weibull. Results: GRU-D-Weibull demonstrated a C-index of ~0.7 at index date, which increased to ~0.77 at 4.3 years of follow-up, comparable to that of RSF. GRU-D-Weibull achieved an absolute L1-loss of ~1.1 years (sd 0.95) at CKD4 index date and a minimum of ~0.45 years (sd 0.3) at 4 years of follow-up, compared with ~1.4 years (sd 1.1) at index date and ~0.64 years (sd 0.26) at 4 years for the second-ranked RSF. Both significantly outperform competing approaches. GRU-D-Weibull constrained the predicted survival probability at time of event to a remarkably smaller and more fixed range than competing models throughout follow-up.
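For readers unfamiliar with the Weibull distribution, the sketch below shows how a survival probability and a median time-to-endpoint follow from a scale and a shape parameter. It uses the standard two-parameter Weibull form; how GRU-D-Weibull actually produces its parameters is not described here, and the parameter values are made up for illustration.

```python
import math

# Standard two-parameter Weibull survival function. The function names and
# the parameter values used below are illustrative assumptions.

def weibull_survival(t, scale, shape):
    """Probability of remaining event-free beyond time t."""
    return math.exp(-((t / scale) ** shape))


def weibull_median(scale, shape):
    """Time at which the survival probability equals 0.5."""
    return scale * math.log(2) ** (1 / shape)


scale, shape = 2.0, 1.5  # made-up parameters, time measured in years
print(weibull_survival(1.0, scale, shape))  # survival probability at 1 year
print(weibull_median(scale, shape))         # median time-to-endpoint in years
```

A model that outputs a scale and a shape per patient therefore yields both a full survival curve and a point estimate of time-to-endpoint.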


Graph Neural Networks Provably Benefit from Structural Information: A Feature Learning Perspective

arXiv.org Artificial Intelligence

Graph neural networks (GNNs) have pioneered advancements in graph representation learning, exhibiting superior feature learning and performance over multilayer perceptrons (MLPs) when handling graph inputs. However, understanding of the feature learning aspect of GNNs is still at an early stage. This study aims to bridge this gap by investigating the role of graph convolution within the context of feature learning theory in neural networks using gradient descent training. We provide a distinct characterization of signal learning and noise memorization in two-layer graph convolutional networks (GCNs), contrasting them with two-layer convolutional neural networks (CNNs). Our findings reveal that graph convolution significantly augments the benign overfitting regime over the counterpart CNNs, where signal learning surpasses noise memorization, by a factor of approximately $\sqrt{D}^{q-2}$, with $D$ denoting a node's expected degree and $q$ being the power of the ReLU activation function where $q > 2$. These findings highlight a substantial discrepancy between GNNs and MLPs in terms of feature learning and generalization capacity after gradient descent training, a conclusion further substantiated by our empirical simulations.
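To give a feel for the size of the stated factor, the sketch below evaluates $\sqrt{D}^{q-2}$ for a few values. The function name and the chosen values of D and q are illustrative assumptions; only the formula itself comes from the abstract.

```python
import math

def regime_factor(expected_degree, relu_power):
    """Evaluate sqrt(D) ** (q - 2) for expected degree D and ReLU power q > 2."""
    if relu_power <= 2:
        raise ValueError("the abstract states the result for q > 2")
    return math.sqrt(expected_degree) ** (relu_power - 2)


# Made-up example values: the factor grows with both D and q.
for d, q in [(16, 3), (16, 4), (100, 3)]:
    print(d, q, regime_factor(d, q))
```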


Application of Artificial Neural Networks for Investigation of Pressure Filtration Performance, a Zinc Leaching Filter Cake Moisture Modeling

arXiv.org Artificial Intelligence

Machine Learning (ML) is a powerful tool for material science applications. An Artificial Neural Network (ANN) is a machine learning technique that can provide high prediction accuracy. This study aimed to develop an ANN model to predict the cake moisture of the pressure filtration process in zinc production. The cake moisture was influenced by seven parameters: temperature (35 and 65 °C), solid concentration (0.2 and 0.38 g/L), pH (2, 3.5, and 5), air-blow time (2, 10, and 15 min), cake thickness (14, 20, 26, and 34 mm), pressure, and filtration time. The study conducted 288 tests using two types of fabrics: polypropylene (S1) and polyester (S2). The ANN model was evaluated by the coefficient of determination (R2), the Mean Square Error (MSE), and the Mean Absolute Error (MAE) for both datasets. The results showed R2 values of 0.88 and 0.83, MSE values of $6.243\times10^{-7}$ and $1.086\times10^{-6}$, and MAE values of 0.00056 and 0.00088 for S1 and S2, respectively. These results indicated that the ANN model could predict the cake moisture of pressure filtration in the zinc leaching process with high accuracy.
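The three evaluation metrics named above have standard definitions, sketched here in plain Python. The sample values are made up for illustration and are unrelated to the filtration data.

```python
# Standard definitions of MSE, MAE, and the coefficient of determination.
# The example values are illustrative, not measurements from the study.

def mse(y_true, y_pred):
    """Mean of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean of absolute differences between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """1 minus the ratio of residual to total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [2.0, 2.0, 3.0, 3.0]
print(mse(y_true, y_pred), mae(y_true, y_pred), r2(y_true, y_pred))
```

Note that MSE and MAE depend on the scale of the target variable, so very small values are informative only relative to the range of the measured moisture, whereas R2 is scale-free.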


Hard Sample Mining Enabled Supervised Contrastive Feature Learning for Wind Turbine Pitch System Fault Diagnosis

arXiv.org Artificial Intelligence

The efficient utilization of wind power by wind turbines relies on the ability of their pitch systems to adjust blade pitch angles in response to varying wind speeds. However, the presence of multiple health conditions in the pitch system due to long-term wear and tear makes them difficult to classify accurately, which increases the maintenance cost of wind turbines and can even damage them. This paper proposes a novel method based on hard sample mining-enabled supervised contrastive learning (HSMSCL) to address this problem. The proposed method employs cosine similarity to identify hard samples and subsequently leverages supervised contrastive learning to learn more discriminative representations by constructing hard sample pairs. Furthermore, the hard sample mining framework in the proposed method also constructs hard samples with learned representations to make the training process of the multilayer perceptron (MLP) more challenging and make it a more effective classifier. The proposed approach progressively improves the fault diagnosis model by introducing hard samples in the SCL and MLP phases, thus enhancing its performance in complex multi-class fault diagnosis tasks. To evaluate the effectiveness of the proposed method, two real datasets comprising wind turbine pitch system cog belt fracture data are utilized. The fault diagnosis performance of the proposed method is compared against existing methods, and the results demonstrate its superior performance. The proposed approach exhibits significant improvements in fault diagnosis performance, providing promising prospects for enhancing the reliability and efficiency of wind turbine pitch system fault diagnosis.
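One common way to use cosine similarity for hard sample mining is to pick, for a given anchor, the sample from a different class that looks most similar to it. The sketch below illustrates that idea only; it is an assumption about what "hard" means, not the HSMSCL procedure itself, and the function names and feature vectors are hypothetical.

```python
import math

# Hypothetical sketch of hard-negative selection by cosine similarity.
# This is NOT the paper's algorithm; names and data are illustrative.

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def hardest_negative(anchor_idx, features, labels):
    """Index of the most similar sample whose label differs from the anchor's."""
    best_idx, best_sim = None, -2.0  # cosine similarity is never below -1
    for i, (feat, label) in enumerate(zip(features, labels)):
        if label == labels[anchor_idx]:
            continue  # skip the anchor itself and same-class samples
        sim = cosine_similarity(features[anchor_idx], feat)
        if sim > best_sim:
            best_idx, best_sim = i, sim
    return best_idx


features = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.6], [0.0, 1.0]]  # made-up features
labels = [0, 0, 1, 1]                                        # made-up classes
print(hardest_negative(0, features, labels))
```

Pairs formed from an anchor and such a hard negative are the ones a contrastive loss has the most to learn from, since the two samples sit close together despite belonging to different classes.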


Functional Neural Networks: Shift invariant models for functional data with applications to EEG classification

arXiv.org Artificial Intelligence

It is desirable for statistical models to detect signals of interest independently of their position. If the data is generated by some smooth process, this additional structure should be taken into account. We introduce a new class of neural networks that are shift invariant and preserve smoothness of the data: functional neural networks (FNNs). For this, we use methods from functional data analysis (FDA) to extend multi-layer perceptrons and convolutional neural networks to functional data. We propose different model architectures, show that the models outperform a benchmark model from FDA in terms of accuracy and successfully use FNNs to classify electroencephalography (EEG) data.


Spatial Gated Multi-Layer Perceptron for Land Use and Land Cover Mapping

arXiv.org Artificial Intelligence

Convolutional Neural Networks (CNNs) are models that are utilized extensively for the hierarchical extraction of features. Vision transformers (ViTs), through the use of a self-attention mechanism, have recently achieved superior modeling of global contextual information compared to CNNs. However, to realize their image classification strength, ViTs require substantial training datasets. Where the available training data are limited, current advanced multi-layer perceptrons (MLPs) can provide viable alternatives to both deep CNNs and ViTs. In this paper, we developed the SGU-MLP, a learning algorithm that effectively uses both MLPs and spatial gating units (SGUs) for precise land use and land cover (LULC) mapping. Results illustrated the superiority of the developed SGU-MLP classification algorithm over several CNN and CNN-ViT-based models, including HybridSN, ResNet, iFormer, EfficientFormer and CoAtNet. The proposed SGU-MLP algorithm was tested through three experiments in Houston, USA, Berlin, Germany and Augsburg, Germany. The SGU-MLP classification model was found to consistently outperform the benchmark CNN and CNN-ViT-based algorithms. For example, for the Houston experiment, SGU-MLP significantly outperformed HybridSN, CoAtNet, EfficientFormer, iFormer and ResNet by approximately 15%, 19%, 20%, 21%, and 25%, respectively, in terms of average accuracy. The code will be made publicly available at https://github.com/aj1365/SGUMLP


A Lightweight and Accurate Face Detection Algorithm Based on Retinaface

arXiv.org Artificial Intelligence

Face recognition is widely used in people's daily life. The face recognition discussed in this paper is not the recognition of individual faces, but the localization and counting of faces in pictures or videos. The development of face detection algorithms can be divided into three phases, namely the early algorithms, the Adaptive Boosting framework [1], and the deep learning era. Early face recognition used a modular matching technique, in which a template image of a face is matched against various locations in the detection image to determine whether there is a face at each location. A representative work was the algorithm proposed by Rowley (Neural network-based face detection [2]), which used a 20x20 dataset to train a Multi-layer Perceptron [3] model with good accuracy, but ran slowly. In 1997, Margineantu et al. proposed a face recognition algorithm in the AdaBoost framework. The boost algorithm is an ensemble learning algorithm based on PAC (probably approximately correct) learning theory. In 2001, Viola and Jones designed a face detection algorithm [4]. It used simple Haar-like [5] features and cascaded AdaBoost classifiers to construct a detector that improved detection speed by two orders of magnitude over previous methods and maintained good accuracy. This approach is known as the VJ framework.


Fixed Inter-Neuron Covariability Induces Adversarial Robustness

arXiv.org Artificial Intelligence

The vulnerability to adversarial perturbations is a major flaw of Deep Neural Networks (DNNs) that raises questions about their reliability in real-world scenarios. On the other hand, human perception, which DNNs are supposed to emulate, is highly robust to such perturbations, indicating that there may be certain features of human perception that make it robust but are not represented in the current class of DNNs. One such feature is that the activity of biological neurons is correlated and the structure of this correlation tends to be rather rigid over long spans of time, even if it hampers performance and learning. We hypothesize that integrating such constraints on the activations of a DNN would improve its adversarial robustness, and, to test this hypothesis, we have developed the Self-Consistent Activation (SCA) layer, which comprises neurons whose activations are consistent with each other, as they conform to a fixed, but learned, covariability pattern. When evaluated on image and sound recognition tasks, the models with an SCA layer achieved high accuracy and exhibited significantly greater robustness than multi-layer perceptron models to state-of-the-art Auto-PGD adversarial attacks without being trained on adversarially perturbed data.


Quadruple-star systems are not always nested triples: a machine learning approach to dynamical stability

arXiv.org Artificial Intelligence

The dynamical stability of quadruple-star systems has traditionally been treated as a problem involving two 'nested' triples which constitute a quadruple. In this novel study, we employed a machine learning algorithm, the multi-layer perceptron (MLP), to directly classify 2+2 and 3+1 quadruples based on their stability (or long-term boundedness). The training data sets for the classification, comprising $5\times10^5$ quadruples each, were integrated using the highly accurate direct $N$-body code MSTAR. We also carried out a limited parameter space study of zero-inclination systems to directly compare quadruples to triples. We found that both our quadruple MLP models perform better than a 'nested' triple MLP approach, which is especially significant for 3+1 quadruples. The classification accuracies for the 2+2 MLP and 3+1 MLP models are 94% and 93% respectively, while the scores for the 'nested' triple approach are 88% and 66% respectively. This is a crucial implication for quadruple population synthesis studies. Our MLP models, which are very simple and almost instantaneous to implement, are available on GitHub, along with Python3 scripts to access them.