Perceptrons
Do You Want To Know How Perceptron Algorithm works Internally
If you are new to the field of Deep Learning, I encourage you to read my previous article about Understand Deep Leaning with Simple exercise-PyTorch which will give you a precise understanding of how neural networks works in general. This article is a more deep dive into the internal working of Neuron/Perceptron which is the building block of Deep Learning Neural Networks architecture. A human brain has billions of neurons. Neurons are interconnected nerve cells in the human brain that are involved in the processing and transmitting chemical and electrical signals. Dendrites are branches that receive information from other neurons.
CU-UD: text-mining drug and chemical-protein interactions with ensembles of BERT-based models
Karabulut, Mehmet Efruz, Vijay-Shanker, K., Peng, Yifan
Identifying the relations between chemicals and proteins is an important text mining task. BioCreative VII track 1 DrugProt task aims to promote the development and evaluation of systems that can automatically detect relations between chemical compounds/drugs and genes/proteins in PubMed abstracts. In this paper, we describe our submission, which is an ensemble system, including multiple BERT-based language models. We combine the outputs of individual models using majority voting and multilayer perceptron. Our system obtained 0.7708 in precision and 0.7770 in recall, for an F1 score of 0.7739, demonstrating the effectiveness of using ensembles of BERT-based language models for automatically detecting relations between chemicals and proteins. Our code is available at https://github.com/bionlplab/drugprot_bcvii.
Detecting Fake Points of Interest from Location Data
Bashir, Syed Raza, Misic, Vojislav
The pervasiveness of GPS-enabled mobile devices and the widespread use of location-based services have resulted in the generation of massive amounts of geo-tagged data. In recent times, the data analysis now has access to more sources, including reviews, news, and images, which also raises questions about the reliability of Point-of-Interest (POI) data sources. While previous research attempted to detect fake POI data through various security mechanisms, the current work attempts to capture the fake POI data in a much simpler way. The proposed work is focused on supervised learning methods and their capability to find hidden patterns in location-based data. The ground truth labels are obtained through real-world data, and the fake data is generated using an API, so we get a dataset with both the real and fake labels on the location data. The objective is to predict the truth about a POI using the Multi-Layer Perceptron (MLP) method. In the proposed work, MLP based on data classification technique is used to classify location data accurately. The proposed method is compared with traditional classification and robust and recent deep neural methods. The results show that the proposed method is better than the baseline methods.
Singular learning of deep multilayer perceptrons for EEG-based emotion recognition
Human emotion recognition is an important issue in human-computer interactions and electroencephalograph (EEG) has been widely applied to emotion recognition due to its high reliability. In recent years, methods based deep learning technology have reached the state of art performance in EEG-based emotion recognition. However, there exist singularities in the parameter space of deep neural networks, which may dramatically slow down the training process. It is very worthy to investigate the specific influence of singularities when applying deep neural networks to EEG-based emotion recognition. In this paper, we mainly focus on this problem, and analyse the singular learning dynamics of deep multilayer perceptrons theoretically and numerically. The results can help us to design better algorithms to overcome the serious influence of singularities in deep neural networks for EEG-based emotion recognition.
Mixed-Integer Optimization with Constraint Learning
Maragno, Donato, Wiberg, Holly, Bertsimas, Dimitris, Birbil, S. Ilker, Hertog, Dick den, Fajemisin, Adejuyigbe
We establish a broad methodological foundation for mixed-integer optimization with learned constraints. We propose an end-to-end pipeline for data-driven decision making in which constraints and objectives are directly learned from data using machine learning, and the trained models are embedded in an optimization formulation. We exploit the mixed-integer optimization-representability of many machine learning methods, including linear models, decision trees, ensembles, and multi-layer perceptrons. The consideration of multiple methods allows us to capture various underlying relationships between decisions, contextual variables, and outcomes. We also characterize a decision trust region using the convex hull of the observations, to ensure credible recommendations and avoid extrapolation. We efficiently incorporate this representation using column generation and clustering. In combination with domain-driven constraints and objective terms, the embedded models and trust region define a mixed-integer optimization problem for prescription generation. We implement this framework as a Python package (OptiCL) for practitioners. We demonstrate the method in both chemotherapy optimization and World Food Programme planning. The case studies illustrate the benefit of the framework in generating high-quality prescriptions, the value added by the trust region, the incorporation of multiple machine learning methods, and the inclusion of multiple learned constraints.
Binary perceptron: efficient algorithms can find solutions in a rare well-connected cluster
Abbe, Emmanuel, Li, Shuangping, Sly, Allan
It was recently shown that almost all solutions in the symmetric binary perceptron are isolated, even at low constraint densities, suggesting that finding typical solutions is hard. In contrast, some algorithms have been shown empirically to succeed in finding solutions at low density. This phenomenon has been justified numerically by the existence of subdominant and dense connected regions of solutions, which are accessible by simple learning algorithms. In this paper, we establish formally such a phenomenon for both the symmetric and asymmetric binary perceptrons. We show that at low constraint density (equivalently for overparametrized perceptrons), there exists indeed a subdominant connected cluster of solutions with almost maximal diameter, and that an efficient multiscale majority algorithm can find solutions in such a cluster with high probability, settling in particular an open problem posed by Perkins-Xu '21. In addition, even close to the critical threshold, we show that there exist clusters of linear diameter for the symmetric perceptron, as well as for the asymmetric perceptron under additional assumptions.
Free Probability, Newton lilypads and Jacobians of neural networks
Chhaibi, Reda, Daouda, Tariq, Kahn, Ezechiel
Gradient descent during the learning process of a neural network can be subject to many instabilities. The spectral density of the Jacobian is a key component for analyzing robustness. Following the works of Pennington et al., such Jacobians are modeled using free multiplicative convolutions from Free Probability Theory. We present a reliable and very fast method for computing the associated spectral densities. This method has a controlled and proven convergence. Our technique is based on an adaptative Newton-Raphson scheme, by finding and chaining basins of attraction: the Newton algorithm finds contiguous lilypad-like basins and steps from one to the next, heading towards the objective. We demonstrate the applicability of our method by using it to assess how the learning process is affected by network depth, layer widths and initialization choices: empirically, final test losses are very correlated to our Free Probability metrics.
DASentimental: Detecting depression, anxiety and stress in texts via emotional recall, cognitive networks and machine learning
Fatima, Asra, Ying, Li, Hills, Thomas, Stella, Massimo
Most current affect scales and sentiment analysis on written text focus on quantifying valence (sentiment) -- the most primary dimension of emotion. However, emotions are broader and more complex than valence. Distinguishing negative emotions of similar valence could be important in contexts such as mental health. This project proposes a semi-supervised machine learning model (DASentimental) to extract depression, anxiety and stress from written text. First, we trained the model to spot how sequences of recalled emotion words by $N=200$ individuals correlated with their responses to the Depression Anxiety Stress Scale (DASS-21). Within the framework of cognitive network science, we model every list of recalled emotions as a walk over a networked mental representation of semantic memory, with emotions connected according to free associations in people's memory. Among several tested machine learning approaches, we find that a multilayer perceptron neural network trained on word sequences and semantic network distances can achieve state-of-art, cross-validated predictions for depression ($R = 0.7$), anxiety ($R = 0.44$) and stress ($R = 0.52$). Though limited by sample size, this first-of-its-kind approach enables quantitative explorations of key semantic dimensions behind DAS levels. We find that semantic distances between recalled emotions and the dyad "sad-happy" are crucial features for estimating depression levels but are less important for anxiety and stress. We also find that semantic distance of recalls from "fear" can boost the prediction of anxiety but it becomes redundant when the "sad-happy" dyad is considered. Adopting DASentimental as a semi-supervised learning tool to estimate DAS in text, we apply it to a dataset of 142 suicide notes. We conclude by discussing key directions for future research enabled by artificial intelligence detecting stress, anxiety and depression.
From Weighted Conditionals of Multilayer Perceptrons to Gradual Argumentation and Back
A fuzzy multipreference semantics has been recently proposed for weighted conditional knowledge bases, and used to develop a logical semantics for Multilayer Perceptrons, by regarding a deep neural network (after training) as a weighted conditional knowledge base. This semantics, in its different variants, suggests some gradual argumentation semantics, which are related to the family of the gradual semantics studied by Amgoud and Doder. The relationships between weighted conditional knowledge bases and MLPs extend to the proposed gradual semantics to capture the stationary states of MPs, in agreement with previous results on the relationship between argumentation frameworks and neural networks. The paper also suggests a simple way to extend the proposed semantics to deal attacks/supports by a boolean combination of arguments, based on the fuzzy semantics of weighted conditionals, as well as an approach for defeasible reasoning over a weighted argumentation graph, building on the proposed gradual semantics.
SVM and ANN based Classification of EMG signals by using PCA and LDA
Basak, Hritam, Roy, Alik, Lahiri, Jeet Bandhu, Bose, Sayantan, Patra, Soumyadeep
In recent decades, biomedical signals have been used for communication in Human-Computer Interfaces (HCI) for medical applications; an instance of these signals are the myoelectric signals (MES), which are generated in the muscles of the human body as unidimensional patterns. Because of this, the methods and algorithms developed for pattern recognition in signals can be applied for their analyses once these signals have been sampled and turned into electromyographic (EMG) signals. Additionally, in recent years, many researchers have dedicated their efforts to studying prosthetic control utilizing EMG signal classification, that is, by logging a set of MES in a proper range of frequencies to classify the corresponding EMG signals. The feature classification can be carried out on the time domain or by using other domains such as the frequency domain (also known as the spectral domain), time scale, and time-frequency, amongst others. One of the main methods used for pattern recognition in myoelectric signals is the Support Vector Machines (SVM) technique whose primary function is to identify an n-dimensional hyperplane to separate a set of input feature points into different classes. This technique has the potential to recognize complex patterns and on several occasions, it has proven its worth when compared to other classifiers such as Artificial Neural Network (ANN), Linear Discriminant Analysis (LDA), and Principal Component Analysis(PCA). The key concepts underlying the SVM are (a) the hyperplane separator; (b) the kernel function; (c) the optimal separation hyperplane; and (d) a soft margin (hyperplane tolerance).