Accuracy
Local Rose Breeds Detection System Using Transfer Learning Techniques
Farha, Amena Begum, Hakim, Md. Azizul, Khatun, Mst. Eshita
Detecting a flower's breed and providing details about it, along with suggested cultivation and care practices, is important for flower cultivation, breed development, and the flower business. Among the local flowers of Bangladesh, the rose is one of the most popular and most in demand, and roses are sought after not only in Bangladesh but throughout the world. Roses also have many uses beyond decoration. Because roses are in such demand in the flower business, rose breed detection is essential. However, unlike the classification of different flowers, breed detection within a particular flower has received no notable attention. In this research, we propose a model to detect rose breeds from images using transfer learning techniques. Image resources for this kind of flower classification work are scarce, so we needed a large dataset to train our model. We used 1939 raw images of five different breeds and, through augmentation, generated 9306 images for the training dataset and 388 images for the testing dataset used to validate the model. We applied four transfer learning models: Inception V3, ResNet50, Xception, and VGG16. Among these, VGG16 achieved the highest accuracy, 99%. According to our study, this is the first publicly available work on breed detection of a particular flower.
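The three image counts in the abstract can be checked against each other with a little arithmetic. The sketch below shows that they are consistent with an 80/20 split of the raw images followed by sixfold augmentation of the training portion only. Both the split fraction and the augmentation factor are assumptions inferred from the numbers; the abstract does not state them.

```python
# Only RAW_IMAGES and the two resulting totals come from the abstract.
RAW_IMAGES = 1939
TEST_FRACTION = 0.20       # assumption, not stated in the abstract
AUGMENTATION_FACTOR = 6    # assumption, not stated in the abstract

test_images = round(RAW_IMAGES * TEST_FRACTION)     # held-out images
train_raw = RAW_IMAGES - test_images                # raw training images
train_augmented = train_raw * AUGMENTATION_FACTOR   # after augmentation

print(test_images, train_raw, train_augmented)
```

Under these assumed values the computed totals match the 388 testing images and 9306 training images reported above.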
DeLag: Using Multi-Objective Optimization to Enhance the Detection of Latency Degradation Patterns in Service-based Systems
Traini, Luca, Cortellessa, Vittorio
Performance debugging in production is a fundamental activity in modern service-based systems. The diagnosis of performance issues is often time-consuming, since it requires thorough inspection of large volumes of traces and performance indices. In this paper we present DeLag, a novel automated search-based approach for diagnosing performance issues in service-based systems. DeLag identifies subsets of requests that show, in the combination of their Remote Procedure Call execution times, symptoms of potentially relevant performance issues. We call such symptoms Latency Degradation Patterns. DeLag simultaneously searches for multiple latency degradation patterns while optimizing precision, recall and latency dissimilarity. Experimentation on 700 datasets of requests generated from two microservice-based systems shows that our approach provides better and more stable effectiveness than three state-of-the-art approaches and general purpose machine learning clustering algorithms. DeLag is more effective than all baseline techniques in at least one case study (with p ≤ 0.05 and non-negligible effect size). Moreover, DeLag outperforms in terms of efficiency the second and the third most effective baseline techniques on the largest datasets used in our evaluation (up to 22%).
In order to support this fast-paced release cycle, IT organizations often employ several ... Unfortunately, frequent software releases often hamper the ability to deliver high quality software [3]. For example, widely used performance assurance techniques, ... Also, given the complexity of these systems and their workloads [6], it is often unfeasible to proactively detect performance issues in a testing environment [7].
... issue, and initial understanding, scoping and localization are among the most time-consuming phases during debugging. ... service-based systems [9], [10], [11], [12], [13], [14], [15], the reduction of the manual effort and the time needed is still critical. ... rely on pattern mining to spot patterns in trace attributes (e.g., request size, response size, RPCs execution times)
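To make the precision and recall objectives concrete, here is a minimal sketch of how a single candidate pattern could be scored. It assumes that a pattern is a conjunction of per-RPC execution-time ranges and that "degraded" requests are those whose latency meets a chosen threshold. The names `matches`, `precision_recall`, the RPC names, and all numbers are hypothetical; this is not DeLag's implementation and it omits the latency dissimilarity objective and the search itself.

```python
from typing import Dict, List, Tuple

Request = Dict[str, float]                 # RPC name -> execution time, plus "latency"
Pattern = Dict[str, Tuple[float, float]]   # RPC name -> [low, high) execution-time range


def matches(request: Request, pattern: Pattern) -> bool:
    """A request matches when every constrained RPC time falls in its range."""
    return all(low <= request[rpc] < high for rpc, (low, high) in pattern.items())


def precision_recall(requests: List[Request], pattern: Pattern,
                     latency_threshold: float) -> Tuple[float, float]:
    """Score a pattern against the requests whose latency is degraded."""
    selected = [r for r in requests if matches(r, pattern)]
    degraded = [r for r in requests if r["latency"] >= latency_threshold]
    true_pos = [r for r in selected if r["latency"] >= latency_threshold]
    precision = len(true_pos) / len(selected) if selected else 0.0
    recall = len(true_pos) / len(degraded) if degraded else 0.0
    return precision, recall


# Hypothetical traces (milliseconds).
requests = [
    {"auth": 10, "db": 200, "latency": 230},
    {"auth": 12, "db": 210, "latency": 240},
    {"auth": 11, "db": 20, "latency": 40},
    {"auth": 180, "db": 25, "latency": 220},
    {"auth": 5, "db": 150, "latency": 160},
]
pattern = {"db": (150, 300)}   # "slow database call"
precision, recall = precision_recall(requests, pattern, latency_threshold=200)
print(precision, recall)
```

In this toy data the pattern selects three requests, two of which are degraded, and misses the one degraded request caused by a slow `auth` call, which is the kind of case a second pattern would have to cover.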
ViralVectors: Compact and Scalable Alignment-free Virome Feature Generation
Ali, Sarwan, Chourasia, Prakash, Tayebi, Zahra, Bello, Babatunde, Patterson, Murray
The amount of sequencing data for SARS-CoV-2 is several orders of magnitude larger than for any other virus. This will continue to grow geometrically for SARS-CoV-2, and other viruses, as many countries heavily finance genomic surveillance efforts. Hence, we need methods for processing large amounts of sequence data to allow for effective yet timely decision-making. Such data will come from heterogeneous sources: aligned, unaligned, or even unassembled raw nucleotide or amino acid sequencing reads pertaining to the whole genome or regions (e.g., spike) of interest. In this work, we propose ViralVectors, a compact feature vector generation from virome sequencing data that allows effective downstream analysis. Such generation is based on minimizers, a type of lightweight "signature" of a sequence, used traditionally in assembly and read mapping; to our knowledge, this is the first use of minimizers in this way. We validate our approach on different types of sequencing data: (a) 2.5M SARS-CoV-2 spike sequences (to show scalability); (b) 3K Coronaviridae spike sequences (to show robustness to more genomic variability); and (c) 4K raw WGS read sets taken from nasal-swab PCR tests (to show the ability to process unassembled reads). Our results show that ViralVectors outperforms current benchmarks in most classification and clustering tasks.
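As an illustration of the minimizer idea, the sketch below takes the minimizer of a k-mer to be its lexicographically smallest substring of length m and counts minimizers over all k-mers of a sequence to form a fixed-length vector. This is a simplified, forward-strand-only variant written under that assumption; the function names and the parameters `k` and `m` are hypothetical, and the abstract does not specify the exact construction used by ViralVectors.

```python
from collections import Counter
from itertools import product
from typing import List


def minimizer(kmer: str, m: int) -> str:
    """Lexicographically smallest length-m substring of the k-mer."""
    return min(kmer[i:i + m] for i in range(len(kmer) - m + 1))


def minimizer_vector(sequence: str, k: int, m: int,
                     alphabet: str = "ACGT") -> List[int]:
    """Count the minimizer of every k-mer; one slot per possible m-mer."""
    counts = Counter(
        minimizer(sequence[i:i + k], m)
        for i in range(len(sequence) - k + 1)
    )
    return [counts["".join(p)] for p in product(alphabet, repeat=m)]


vec = minimizer_vector("ACGTAC", k=4, m=2)
print(vec)
```

The vector length depends only on the alphabet size and `m`, not on the sequence length, which is what makes this kind of representation compact and usable on unaligned input.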
BS-GAT: Behavior Similarity Based Graph Attention Network for Network Intrusion Detection
Wang, Yalu, Han, Zhijie, Li, Jie, He, Xin
With the development of the Internet of Things (IoT), network intrusion detection is becoming more complex and extensive. It is essential to investigate an intelligent, automated, and robust network intrusion detection method. Graph neural network-based network intrusion detection methods have been proposed. However, further study is needed because the graph construction in existing methods does not fully adapt to the characteristics of practical network intrusion datasets. To address this issue, this paper proposes a graph neural network algorithm based on behavior similarity (BS-GAT) using a graph attention network. First, a novel graph construction method is developed using behavior similarity, by analyzing the characteristics of practical datasets. The data flows are treated as nodes in the graph, and the behavior rules of nodes are used as edges, constructing a graph with a relatively uniform number of neighbors for each node. Then, the edge behavior relationship weights are incorporated into the graph attention network to exploit both the relationships between data flows and the structural information of the graph, which improves the performance of network intrusion detection. Finally, experiments are conducted on the latest datasets to evaluate the performance of the proposed behavior similarity based graph attention network for network intrusion detection. The results show that the proposed method is effective and has superior performance compared to existing solutions.
Human Face Detection in Visual Scenes
We present a neural network-based face detection system. A retinally connected neural network examines small windows of an image, and decides whether each window contains a face. The system arbitrates between multiple networks to improve performance over a single network. We use a bootstrap algorithm for training, which adds false detections into the training set as training progresses. This eliminates the difficult task of manually selecting non-face training examples, which must be chosen to span the entire space of non-face images.
Reducing multiclass to binary by coupling probability estimates
This paper presents a method for obtaining class membership probability estimates for multiclass classification problems by coupling the probability estimates produced by binary classifiers. This is an extension for arbitrary code matrices of a method due to Hastie and Tibshirani for pairwise coupling of probability estimates. Experimental results with Boosted Naive Bayes show that our method produces calibrated class membership probability estimates, while having similar classification accuracy as loss-based decoding, a method for obtaining the most likely class that does not generate probability estimates.
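For orientation, the sketch below implements the pairwise coupling iteration of Hastie and Tibshirani that this paper extends, not the extension to arbitrary code matrices. It assumes equal weights for all class pairs and takes as input a matrix `r` where `r[i][j]` estimates P(class i | class i or j). The function name and the example probabilities are hypothetical.

```python
from typing import List


def pairwise_coupling(r: List[List[float]], n_iter: int = 200) -> List[float]:
    """Couple pairwise estimates r[i][j] ~ p_i / (p_i + p_j) into class
    probabilities, assuming every pair has equal weight."""
    k = len(r)
    p = [1.0 / k] * k
    for _ in range(n_iter):
        for i in range(k):
            observed = sum(r[i][j] for j in range(k) if j != i)
            implied = sum(p[i] / (p[i] + p[j]) for j in range(k) if j != i)
            p[i] *= observed / implied      # pull p_i toward the pairwise data
            total = sum(p)
            p = [x / total for x in p]      # keep a valid distribution
    return p


# Pairwise estimates generated from known (hypothetical) probabilities.
true_p = [0.5, 0.3, 0.2]
r = [[true_p[i] / (true_p[i] + true_p[j]) if i != j else 0.0
      for j in range(3)] for i in range(3)]
coupled = pairwise_coupling(r)
print(coupled)
```

When the pairwise estimates are mutually consistent, as in this example, the iteration recovers the probabilities they were generated from; with real binary classifiers the estimates are usually inconsistent and the result is a compromise.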
On Discriminative vs. Generative Classifiers: A comparison of logistic regression and naive Bayes
We compare discriminative and generative learning as typified by logistic regression and naive Bayes. We show, contrary to a widely held belief that discriminative classifiers are almost always to be preferred, that there can often be two distinct regimes of performance as the training set size is increased, one in which each algorithm does better. This stems from the observation, which is borne out in repeated experiments, that while discriminative learning has lower asymptotic error, a generative classifier may also approach its (higher) asymptotic error much faster.
Prodding the ROC Curve: Constrained Optimization of Classifier Performance
When designing a two-alternative classifier, one ordinarily aims to maximize the classifier's ability to discriminate between members of the two classes. We describe a situation in a real-world business application of machine-learning prediction in which an additional constraint is placed on the nature of the solution: that the classifier achieve a specified correct acceptance or correct rejection rate (i.e., that it achieve a fixed accuracy on members of one class or the other). Our domain is predicting churn in the telecommunications industry. Churn refers to customers who switch from one service provider to another. We propose four algorithms for training a classifier subject to this domain constraint, and present results showing that each algorithm yields a reliable improvement in performance.
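The constraint itself can be illustrated with a simple post-hoc baseline: given classifier scores, pick the highest decision threshold that still accepts a target fraction of the positive class, then see what correct rejection rate results. This is only a sketch of the constraint and is not one of the four training algorithms the paper proposes; the function name, scores, and target rate are hypothetical.

```python
import math
from typing import List


def threshold_for_acceptance_rate(positive_scores: List[float],
                                  target_rate: float) -> float:
    """Highest threshold such that at least target_rate of the positive
    examples score at or above it."""
    ranked = sorted(positive_scores, reverse=True)
    needed = max(math.ceil(target_rate * len(ranked)), 1)
    return ranked[needed - 1]


# Hypothetical scores for churners (positives) and non-churners (negatives).
churner_scores = [0.9, 0.8, 0.6, 0.4, 0.2]
non_churner_scores = [0.5, 0.3, 0.1, 0.05]

threshold = threshold_for_acceptance_rate(churner_scores, target_rate=0.8)
correct_acceptance = sum(s >= threshold for s in churner_scores) / len(churner_scores)
correct_rejection = sum(s < threshold for s in non_churner_scores) / len(non_churner_scores)
print(threshold, correct_acceptance, correct_rejection)
```

With the acceptance rate pinned, the quantity left to improve is the rejection rate on the other class, which is where training with the constraint in mind can help over thresholding an unconstrained classifier.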
Fast and Robust Classification using Asymmetric AdaBoost and a Detector Cascade
This paper develops a new approach for extremely fast detection in domains where the distribution of positive and negative examples is highly skewed (e.g., face detection). In such domains a cascade of simple classifiers, each trained to achieve high detection rates and modest false positive rates, can yield a final detector with many desirable features, including high detection rates, very low false positive rates, and fast performance. Achieving extremely high detection rates, rather than low error, is not a task typically addressed by machine learning algorithms. We propose a new variant of AdaBoost as a mechanism for training the simple classifiers used in the cascade. The final face detection system can process 15 frames per second and achieves over 90% detection with a false positive rate of 1 in 1,000,000.
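The reason modest per-stage false positive rates suffice is that a window must pass every stage, so the rates multiply across the cascade. The sketch below works this through under the simplifying assumption that the per-stage rates are measured on the examples that reach each stage. The stage count and per-stage rates are hypothetical values chosen for illustration, not figures from the paper.

```python
import math
from typing import List, Tuple


def cascade_rates(stages: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Overall (detection rate, false positive rate) of a cascade given
    per-stage (detection rate, false positive rate) pairs."""
    detection = math.prod(d for d, _ in stages)
    false_positive = math.prod(f for _, f in stages)
    return detection, false_positive


# Hypothetical cascade: 10 stages, each keeping 99% of faces and
# passing 25% of the non-faces it sees.
stages = [(0.99, 0.25)] * 10
detection, false_positive = cascade_rates(stages)
print(detection, false_positive)
```

The same multiplication explains why each stage needs a very high detection rate: a small loss per stage compounds just as quickly as the false positive rate shrinks.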
AUC Optimization vs. Error Rate Minimization
The area under an ROC curve (AUC) is a criterion used in many applications to measure the quality of a classification algorithm. However, the objective function optimized in most of these algorithms is the error rate and not the AUC value. We give a detailed statistical analysis of the relationship between the AUC and the error rate, including the first exact expression of the expected value and the variance of the AUC for a fixed error rate. Our results show that the average AUC is monotonically increasing as a function of the classification accuracy, but that the standard deviation for uneven distributions and higher error rates is noticeable. Thus, algorithms designed to minimize the error rate may not lead to the best possible AUC values.
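A small example shows how two classifiers can share an error rate yet differ in AUC. The sketch computes the AUC as the fraction of positive-negative pairs ranked correctly (ties count one half) and the error rate at a fixed threshold of 0.5. The scores are hypothetical and the threshold is an assumption made for the illustration.

```python
from typing import List


def auc(pos: List[float], neg: List[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def error_rate(pos: List[float], neg: List[float], threshold: float = 0.5) -> float:
    """Share of examples misclassified when predicting positive at >= threshold."""
    errors = sum(p < threshold for p in pos) + sum(n >= threshold for n in neg)
    return errors / (len(pos) + len(neg))


# Hypothetical scores from two classifiers on the same six examples.
pos_a, neg_a = [0.9, 0.8, 0.4], [0.6, 0.3, 0.2]
pos_b, neg_b = [0.9, 0.8, 0.1], [0.95, 0.3, 0.2]

print(error_rate(pos_a, neg_a), auc(pos_a, neg_a))
print(error_rate(pos_b, neg_b), auc(pos_b, neg_b))
```

Both classifiers make two errors out of six, but the second one places its mistakes at the extremes of the ranking, so far fewer pairs are ordered correctly and its AUC is much lower.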