Performance Analysis
Degradation Prediction of Semiconductor Lasers using Conditional Variational Autoencoder
Abdelli, Khouloud, Griesser, Helmut, Neumeyr, Christian, Hohenleitner, Robert, Pachnicke, Stephan
Semiconductor lasers have been rapidly evolving to meet the demands of next-generation optical networks. This imposes much more stringent requirements on the laser reliability, which are dominated by degradation mechanisms (e.g., sudden degradation) limiting the semiconductor laser lifetime. Physics-based approaches are often used to characterize the degradation behavior analytically, yet explicit domain knowledge and accurate mathematical models are required. Building such models can be very challenging due to a lack of a full understanding of the complex physical processes inducing the degradation under various operating conditions. To overcome the aforementioned limitations, we propose a new data-driven approach, extracting useful insights from the operational monitored data to predict the degradation trend without requiring any specific knowledge or using any physical model. The proposed approach is based on an unsupervised technique, a conditional variational autoencoder, and validated using vertical-cavity surface-emitting laser (VCSEL) and tunable edge emitting laser reliability data. The experimental results confirm that our model (i) achieves a good degradation prediction and generalization performance by yielding an F1 score of 95.3%, (ii) outperforms several baseline ML based anomaly detection techniques, and (iii) helps to shorten the aging tests by early predicting the failed devices before the end of the test and thereby saving costs
Machine Learning Workflow to Explain Black-box Models for Early Alzheimer's Disease Classification Evaluated for Multiple Datasets
Bloch, Louise, Friedrich, Christoph M.
Purpose: Hard-to-interpret Black-box Machine Learning (ML) were often used for early Alzheimer's Disease (AD) detection. Methods: To interpret eXtreme Gradient Boosting (XGBoost), Random Forest (RF), and Support Vector Machine (SVM) black-box models a workflow based on Shapley values was developed. All models were trained on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset and evaluated for an independent ADNI test set, as well as the external Australian Imaging and Lifestyle flagship study of Ageing (AIBL), and Open Access Series of Imaging Studies (OASIS) datasets. Shapley values were compared to intuitively interpretable Decision Trees (DTs), and Logistic Regression (LR), as well as natural and permutation feature importances. To avoid the reduction of the explanation validity caused by correlated features, forward selection and aspect consolidation were implemented. Results: Some black-box models outperformed DTs and LR. The forward-selected features correspond to brain areas previously associated with AD. Shapley values identified biologically plausible associations with moderate to strong correlations with feature importances. The most important RF features to predict AD conversion were the volume of the amygdalae, and a cognitive test score. Good cognitive test performances and large brain volumes decreased the AD risk. The models trained using cognitive test scores significantly outperformed brain volumetric models ($p<0.05$). Cognitive Normal (CN) vs. AD models were successfully transferred to external datasets. Conclusion: In comparison to previous work, improved performances for ADNI and AIBL were achieved for CN vs. Mild Cognitive Impairment (MCI) classification using brain volumes. The Shapley values and the feature importances showed moderate to strong correlations.
Can Ensemble of Classifiers Provide Better Recognition Results in Packaging Activity?
Sakib, A. H. M. Nazmus, Basak, Promit, Uddin, Syed Doha, Tasin, Shahamat Mustavi, Ahad, Md Atiqur Rahman
Skeleton-based Motion Capture (MoCap) systems have been widely used in the game and film industry for mimicking complex human actions for a long time. MoCap data has also proved its effectiveness in human activity recognition tasks. However, it is a quite challenging task for smaller datasets. The lack of such data for industrial activities further adds to the difficulties. In this work, we have proposed an ensemble-based machine learning methodology that is targeted to work better on MoCap datasets. The experiments have been performed on the MoCap data given in the Bento Packaging Activity Recognition Challenge 2021. Bento is a Japanese word that resembles lunch-box. Upon processing the raw MoCap data at first, we have achieved an astonishing accuracy of 98% on 10-fold Cross-Validation and 82% on Leave-One-Out-Cross-Validation by using the proposed ensemble model.
The Shape of Learning Curves: a Review
Learning curves provide insight into the dependence of a learner's generalization performance on the training set size. This important tool can be used for model selection, to predict the effect of more training data, and to reduce the computational complexity of model training and hyperparameter tuning. This review recounts the origins of the term, provides a formal definition of the learning curve, and briefly covers basics such as its estimation. Our main contribution is a comprehensive overview of the literature regarding the shape of learning curves. We discuss empirical and theoretical evidence that supports well-behaved curves that often have the shape of a power law or an exponential. We consider the learning curves of Gaussian processes, the complex shapes they can display, and the factors influencing them. We draw specific attention to examples of learning curves that are ill-behaved, showing worse learning performance with more training data. To wrap up, we point out various open problems that warrant deeper empirical and theoretical investigation. All in all, our review underscores that learning curves are surprisingly diverse and no universal model can be identified.
Leveraging Siamese Networks for One-Shot Intrusion Detection Model
Hindy, Hanan, Tachtatzis, Christos, Atkinson, Robert, Brosset, David, Bures, Miroslav, Andonovic, Ivan, Michie, Craig, Bellekens, Xavier
The use of supervised Machine Learning (ML) to enhance Intrusion Detection Systems has been the subject of significant research. Supervised ML is based upon learning by example, demanding significant volumes of representative instances for effective training and the need to re-train the model for every unseen cyber-attack class. However, retraining the models in-situ renders the network susceptible to attacks owing to the time-window required to acquire a sufficient volume of data. Although anomaly detection systems provide a coarse-grained defence against unseen attacks, these approaches are significantly less accurate and suffer from high false-positive rates. Here, a complementary approach referred to as 'One-Shot Learning', whereby a limited number of examples of a new attack-class is used to identify a new attack-class (out of many) is detailed. The model grants a new cyber-attack classification without retraining. A Siamese Network is trained to differentiate between classes based on pairs similarities, rather than features, allowing to identify new and previously unseen attacks. The performance of a pre-trained model to classify attack-classes based only on one example is evaluated using three datasets. Results confirm the adaptability of the model in classifying unseen attacks and the trade-off between performance and the need for distinctive class representation.
Lightweight 3D Convolutional Neural Network for Schizophrenia diagnosis using MRI Images and Ensemble Bagging Classifier
Patro, P Supriya, Goel, Tripti, VaraPrasad, S A, Tanveer, M, Murugan, R
Structural alterations have been thoroughly investigated in the brain during the early onset of schizophrenia (SCZ) with the development of neuroimaging methods. The objective of the paper is an efficient classification of SCZ in 2 different classes: Cognitive Normal (CN), and SCZ using magnetic resonance imaging (MRI) images. This paper proposed a lightweight 3D convolutional neural network (CNN) based framework for SCZ diagnosis using MRI images. In the proposed model, lightweight 3D CNN is used to extract both spatial and spectral features simultaneously from 3D volume MRI scans, and classification is done using an ensemble bagging classifier. Ensemble bagging classifier contributes to preventing overfitting, reduces variance, and improves the model's accuracy. The proposed algorithm is tested on datasets taken from three benchmark databases available as open-source: MCICShare, COBRE, and fBRINPhase-II. These datasets have undergone preprocessing steps to register all the MRI images to the standard template and reduce the artifacts. The model achieves the highest accuracy 92.22%, sensitivity 94.44%, specificity 90%, precision 90.43%, recall 94.44%, F1-score 92.39% and G-mean 92.19% as compared to the current state-of-the-art techniques. The performance metrics evidenced the use of this model to assist the clinicians for automatic accurate diagnosis of SCZ.
Learning the shape of protein micro-environments with a holographic convolutional neural network
Pun, Michael N., Ivanov, Andrew, Bellamy, Quinn, Montague, Zachary, LaMont, Colin, Bradley, Philip, Otwinowski, Jakub, Nourmohammad, Armita
Proteins play a central role in biology from immune recognition to brain activity. While major advances in machine learning have improved our ability to predict protein structure from sequence, determining protein function from structure remains a major challenge. Here, we introduce Holographic Convolutional Neural Network (H-CNN) for proteins, which is a physically motivated machine learning approach to model amino acid preferences in protein structures. H-CNN reflects physical interactions in a protein structure and recapitulates the functional information stored in evolutionary data. H-CNN accurately predicts the impact of mutations on protein function, including stability and binding of protein complexes. Our interpretable computational model for protein structure-function maps could guide design of novel proteins with desired function.
Design and Validation of a Portable Machine Learning-Based Electronic Nose
Volatile organic compounds (VOCs) are chemicals emitted by various groups, such as foods, bacteria, and plants. While there are specific pathways and biological features significantly related to such VOCs, detection of these is achieved mostly by human odor testing or high-end methods such as gas chromatographyโmass spectrometry that can analyze the gaseous component. However, odor characterization can be quite helpful in the rapid classification of some samples in sufficient concentrations. Lower-cost metal-oxide gas sensors have the potential to allow the same type of detection with less training required. Here, we report a portable, battery-powered electronic nose system that utilizes multiple metal-oxide gas sensors and machine learning algorithms to detect and classify VOCs. An in-house circuit was designed with ten metal-oxide sensors and voltage dividers; an STM32 microcontroller was used for data acquisition with 12-bit analog-to-digital conversion. For classification of target samples, a supervised machine learning algorithm such as support vector machine (SVM) was applied to classify the VOCs based on the measurement results. The coefficient of variation (standard deviation divided by mean) of 8 of the 10 sensors stayed below 10%, indicating the excellent repeatability of these sensors. As a proof of concept, four different types of wine samples and three different oil samples were classified, and the training model reported 100% and 98% accuracy based on the confusion matrix analysis, respectively. When the trained model was challenged against new sets of data, sensitivity and specificity of 98.5% and 98.6% were achieved for the wine test and 96.3% and 93.3% for the oil test, respectively, when the SVM classifier was used. These results suggest that the metal-oxide sensors are suitable for usage in food authentication applications.
Logits are predictive of network type
We show that it is possible to predict which deep network has generated a given logit vector with accuracy well above chance. We utilize a number of networks on a dataset, initialized with random weights or pretrained weights, as well as fine-tuned networks. A classifier is then trained on the logit vectors of the trained set of this dataset to map the logit vector to the network index that has generated it. The classifier is then evaluated on the test set of the dataset. Results are better with randomly initialized networks, but also generalize to pretrained networks as well as fine-tuned ones. Classification accuracy is higher using unnormalized logits than normalized ones. We find that there is little transfer when applying a classifier to the same networks but with different sets of weights. In addition to help better understand deep networks and the way they encode uncertainty, we anticipate our finding to be useful in some applications (e.g. tailoring an adversarial attack for a certain type of network). Code is available at https://github.com/aliborji/logits.
Forecasting User Interests Through Topic Tag Predictions in Online Health Communities
Adishesha, Amogh Subbakrishna, Jakielaszek, Lily, Azhar, Fariha, Zhang, Peixuan, Honavar, Vasant, Ma, Fenglong, Belani, Chandra, Mitra, Prasenjit, Huang, Sharon Xiaolei
The increasing reliance on online communities for healthcare information by patients and caregivers has led to the increase in the spread of misinformation, or subjective, anecdotal and inaccurate or non-specific recommendations, which, if acted on, could cause serious harm to the patients. Hence, there is an urgent need to connect users with accurate and tailored health information in a timely manner to prevent such harm. This paper proposes an innovative approach to suggesting reliable information to participants in online communities as they move through different stages in their disease or treatment. We hypothesize that patients with similar histories of disease progression or course of treatment would have similar information needs at comparable stages. Specifically, we pose the problem of predicting topic tags or keywords that describe the future information needs of users based on their profiles, traces of their online interactions within the community (past posts, replies) and the profiles and traces of online interactions of other users with similar profiles and similar traces of past interaction with the target users. The result is a variant of the collaborative information filtering or recommendation system tailored to the needs of users of online health communities. We report results of our experiments on an expert curated data set which demonstrate the superiority of the proposed approach over the state of the art baselines with respect to accurate and timely prediction of topic tags (and hence information sources of interest).