Performance Analysis
Revisiting Process versus Product Metrics: a Large Scale Analysis
Majumder, Suvodeep, Mody, Pranav, Menzies, Tim
Numerous methods can build predictive models from software data. However, what methods and conclusions should we endorse as we move from analytics in-the-small (dealing with a handful of projects) to analytics in-the-large (dealing with hundreds of projects)? To answer this question, we recheck prior small-scale results (about process versus product metrics for defect prediction and the granularity of metrics) using 722,471 commits from 700 GitHub projects. We find that some analytics in-the-small conclusions still hold when scaling up to analytics in-the-large. For example, like prior work, we see that process metrics are better predictors for defects than product metrics (best process/product-based learners respectively achieve recalls of 98%/44% and AUCs of 95%/54%, median values). That said, we warn that it is unwise to trust metric importance results from analytics in-the-small studies since those change dramatically when moving to analytics in-the-large. Also, when reasoning in-the-large about hundreds of projects, it is better to use predictions from multiple models (since single model predictions can become confused and exhibit a high variance).
Bootstrapping Concept Formation in Small Neural Networks
Tamosiunaite, Minija, Kulvicius, Tomas, Wörgötter, Florentin
The question of how neural systems (of humans) can perform reasoning is still far from being solved. We posit that the process of forming Concepts is a fundamental step required for this. We argue that, first, Concepts are formed as closed representations, which are then consolidated by relating them to each other. Here we present a model system (agent) with a small neural network that uses realistic learning rules and receives only feedback from the environment in which the agent performs virtual actions. Initially, the actions of the agent are reflexive. In the process of learning, statistical regularities in the input lead to the formation of neuronal pools representing relations between the entities observed by the agent from its artificial world. This information then influences the behavior of the agent via feedback connections, replacing the initial reflex with an action driven by these relational representations. We hypothesize that the neuronal pools representing relational information can be considered as primordial Concepts, which may in a similar way be present in some pre-linguistic animals, too. We argue that systems such as this can help formalize the discussion about what constitutes Concepts and serve as a starting point for constructing artificial cogitating systems.
Task-Aware Meta Learning-based Siamese Neural Network for Classifying Obfuscated Malware
Zhu, Jinting, Jang-Jaccard, Julian, Singh, Amardeep, Watters, Paul A., Camtepe, Seyit
Malware authors apply different obfuscation techniques to the generic feature of malware (i.e., the unique malware signature) to create new variants that avoid detection. Existing Siamese Neural Network (SNN) based malware detection methods fail to correctly classify different malware families when similar generic features are shared across multiple malware variants, resulting in high false-positive rates. To address this issue, we propose a novel Task-Aware Meta Learning-based Siamese Neural Network that is resilient against obfuscated malware while able to detect malware when trained with one or a few training samples. Using entropy features of each malware signature alongside image features as task inputs, our task-aware meta learner generates the parameters for the feature layers to more accurately adjust the feature embedding for different malware families. In addition, our model utilizes meta-learning with the extracted features of a pre-trained network (e.g., VGG-16) to avoid the bias typically associated with a model trained with a limited number of training samples. Our proposed approach is highly effective in recognizing unique malware signatures, thus correctly classifying malware samples that belong to the same malware family even in the presence of obfuscation techniques applied to malware. Our experimental results, validated with N-way on N-shot learning, show that our model is highly effective, with classification accuracy exceeding 91% compared to other similar methods.
Intel open-sources ControlFlag tool to find errors in code
Intel Labs has big plans for a software tool called ControlFlag that uses artificial intelligence to scan through code and pick out errors. One of those plans, perhaps way out in the future, is to bake it into chip packages as a last line of defense against faulty code. This could make the information flow on communications channels safer and more efficient. Last week Intel open-sourced the tool – dubbed ControlFlag – to software developers. The software pores over lines of code and points out errors that developers can then fix.
How to Measure the Success of a Recommendation System?
Recommender systems are used in a variety of domains, from e-commerce to social media, to offer personalized recommendations to customers. The benefit of recommendations for customers, such as reduced information overload, has been a hot topic of research. However, it's unclear how and to what extent recommender systems produce commercial value. It's challenging to create a reliable product suggestion system, and defining what it means to be reliable is just as challenging.
Where were my keys? -- Aggregating Spatial-Temporal Instances of Objects for Efficient Retrieval over Long Periods of Time
Idrees, Ifrah, Hasan, Zahid, Reiss, Steven P., Tellex, Stefanie
Robots equipped with situational awareness can help humans efficiently find their lost objects by leveraging spatial and temporal structure. Existing approaches to video and image retrieval do not take into account the unique constraints imposed by a moving camera with a partial view of the environment. We present a Detection-based 3-level hierarchical Association approach, D3A, to create an efficient queryable spatial-temporal representation of unique object instances in an environment. D3A performs online incremental and hierarchical learning to identify keyframes that best represent the unique objects in the environment. These keyframes are learned based on both spatial and temporal features, and once identified, their corresponding spatial-temporal information is organized in a key-value database. D3A allows for a variety of query patterns such as querying for objects with/without the following: 1) specific attributes, 2) spatial relationships with other objects, and 3) time slices. For a given set of 150 queries, D3A returns a small set of candidate keyframes (which occupy only 0.17% of the total sensory data) with 81.98% mean accuracy in 11.7 ms. This is 47x faster and 33% more accurate than a baseline that naively stores the object matches (detections) in the database without associating spatial-temporal information.
Fast Projection onto the Capped Simplex with Applications to Sparse Regression in Bioinformatics
Ang, Andersen, Ma, Jianzhu, Liu, Nianjun, Huang, Kun, Wang, Yijie
We consider the problem of projecting a vector onto the so-called k-capped simplex, which is a hyper-cube cut by a hyperplane. For an n-dimensional input vector with bounded elements, we found that a simple algorithm based on Newton's method is able to solve the projection problem to high precision with a complexity of roughly O(n), which is a much lower computational cost than that of the existing sorting-based methods proposed in the literature. We provide a theory for partial explanation and justification of the method. We demonstrate that the proposed algorithm can produce a solution of the projection problem with high precision on large-scale datasets, and that the algorithm is able to significantly outperform the state-of-the-art methods in terms of runtime (about 6-8 times faster than commercial software with respect to CPU time for input vectors with 1 million variables or more). We further illustrate the effectiveness of the proposed algorithm on solving sparse regression in a bioinformatics problem. Empirical results on the GWAS dataset (with 1,500,000 single-nucleotide polymorphisms) show that, when using the proposed method to accelerate the Projected Quasi-Newton (PQN) method, the accelerated PQN algorithm is able to handle huge-scale regression problems and is more efficient (about 3-6 times faster) than the current state-of-the-art methods.
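To make the projection problem concrete, here is a minimal sketch. It assumes the k-capped simplex is the set {x : 0 <= x_i <= 1, sum(x) = k}, for which the Euclidean projection has the form x_i = clip(y_i - lam, 0, 1) for a scalar shift lam. The sketch finds lam by plain bisection, which is a swapped-in technique chosen for simplicity and is not the Newton-based algorithm of the abstract. The function name project_capped_simplex and its parameters are hypothetical.

```python
import numpy as np


def project_capped_simplex(y, k, tol=1e-10, max_iter=200):
    """Project y onto {x : 0 <= x <= 1, sum(x) = k} (assumed definition).

    Uses bisection on the scalar shift lam; this is NOT the Newton-based
    method described in the abstract, only an illustrative stand-in.
    """
    y = np.asarray(y, dtype=float)
    if not 0 <= k <= y.size:
        raise ValueError("k must lie between 0 and len(y)")

    # sum(clip(y - lam, 0, 1)) is continuous and non-increasing in lam.
    lo = y.min() - 1.0  # every entry clips to 1, so the sum is n >= k
    hi = y.max()        # every entry clips to 0, so the sum is 0 <= k
    lam = 0.5 * (lo + hi)
    for _ in range(max_iter):
        lam = 0.5 * (lo + hi)
        total = np.clip(y - lam, 0.0, 1.0).sum()
        if abs(total - k) <= tol:
            break
        if total > k:
            lo = lam  # sum too large: shift further down
        else:
            hi = lam  # sum too small: shift less
    return np.clip(y - lam, 0.0, 1.0)


y = np.array([0.9, 0.8, 0.1, -0.5, 2.0])
x = project_capped_simplex(y, k=2)
```

Each bisection step costs one pass over the vector, so this sketch avoids sorting, but it makes no claim to match the runtimes reported above.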
Neo: Generalizing Confusion Matrix Visualization to Hierarchical and Multi-Output Labels
Görtler, Jochen, Hohman, Fred, Moritz, Dominik, Wongsuphasawat, Kanit, Ren, Donghao, Nair, Rahul, Kirchner, Marc, Patel, Kayur
The confusion matrix, a ubiquitous visualization for helping people evaluate machine learning models, is a tabular layout that compares predicted class labels against actual class labels over all data instances. We conduct formative research with machine learning practitioners at a large technology company and find that conventional confusion matrices do not support more complex data-structures found in modern-day applications, such as hierarchical and multi-output labels. To express such variations of confusion matrices, we design an algebra that models confusion matrices as probability distributions. [...] We demonstrate Neo's utility with three case studies that help people better understand model performance and reveal hidden confusions. Machine learning is a complex, iterative design and development practice [4, 24], where the goal is to generate a learned model that generalizes to unseen data inputs. One critical step is model evaluation, testing and inspecting a model's performance on held-out test sets of data with known labels. A ubiquitous visualization used for model evaluation, particularly for classification models, is the confusion matrix: a tabular layout that compares a predicted class label against the actual class label for each class over all data instances. [...] predicted class labels (synonymously, these can be flipped via a matrix transpose). These visualizations are introduced in many machine learning courses and are simultaneously used in practice to show what pairs of classes a model confuses. Succinctly, confusion matrices are [...] Confusion matrices show a visual proxy for accuracy (e.g., entries on the diagonal of the matrix), which alone has been shown to be insufficient for many evaluations [39]. Furthermore, the diagonal of a confusion matrix often contains many more [...]
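As a small illustration of the tabular layout described above, the sketch below counts (actual, predicted) pairs into a matrix, reads accuracy off the diagonal, and divides by the total count to get a joint distribution over the pairs. That normalization is only one simple reading of "confusion matrices as probability distributions" and is not Neo's algebra. The labels are made-up toy data, and the function name confusion_matrix is hypothetical.

```python
import numpy as np


def confusion_matrix(actual, predicted, n_classes):
    """Count matrix with rows = actual class, columns = predicted class."""
    counts = np.zeros((n_classes, n_classes), dtype=int)
    # np.add.at accumulates correctly even when index pairs repeat.
    np.add.at(counts, (actual, predicted), 1)
    return counts


# Toy labels, invented for illustration only.
actual = np.array([0, 0, 1, 1, 2, 2, 2, 1])
predicted = np.array([0, 1, 1, 1, 2, 0, 2, 2])

counts = confusion_matrix(actual, predicted, n_classes=3)
accuracy = np.trace(counts) / counts.sum()  # diagonal = correct predictions
joint = counts / counts.sum()               # P(actual = i, predicted = j)
```

Taking counts.T swaps the roles of rows and columns, which is the transpose mentioned in the text.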
How to draw a ROC curve for a multi-class dataset?
Say I have a multi-class dataset and would like to draw the ROC curve for one of its classes (e.g. ...). scikit-learn has a handy implementation that calculates the TPR and FPR, and another function that computes the AUC for you. You can apply these to your data by looping through the classes and treating each one on its own, with all other classes counted as negative. The code below was inspired by the scikit-learn page on this topic. For this exercise, I will generate some synthetic sample data, and for the predictions I will draw a vector from a random uniform distribution.
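A minimal sketch of that one-vs-rest loop, assuming scikit-learn's roc_curve and auc are the two functions meant above. The sample count, number of classes, and seed are arbitrary choices for illustration, and because the scores are random the AUCs should land near 0.5.

```python
import numpy as np
from sklearn.metrics import auc, roc_curve
from sklearn.preprocessing import label_binarize

rng = np.random.default_rng(0)  # arbitrary seed
n_samples, n_classes = 300, 3   # arbitrary sizes for the synthetic data

# Synthetic ground truth and random "predicted scores", one column per class.
y_true = rng.integers(0, n_classes, size=n_samples)
y_score = rng.uniform(size=(n_samples, n_classes))

# One indicator column per class: 1 for that class, 0 for all others.
y_onehot = label_binarize(y_true, classes=list(range(n_classes)))

roc = {}
for c in range(n_classes):
    # Treat class c as positive and every other class as negative.
    fpr, tpr, _ = roc_curve(y_onehot[:, c], y_score[:, c])
    roc[c] = (fpr, tpr, auc(fpr, tpr))
```

To draw the curve for a single class, plot the stored pair, for example with matplotlib's plt.plot(fpr, tpr), and add the diagonal from (0, 0) to (1, 1) as the chance line.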
Signal to Noise Ratio Loss Function
Ghobadzadeh, Ali, Lashkari, Amir
This work proposes a new loss function targeting classification problems, utilizing a source of information overlooked by cross entropy loss. First, we derive a series of the tightest upper and lower bounds for the probability of a random variable in a given interval. Second, a lower bound is proposed for the probability of a true positive for a parametric classification problem, where the form of the probability density function (pdf) of the data is given. A closed form for finding the optimal function of unknowns is derived to maximize the probability of true positives. Finally, for the case where the pdf of the data is unknown, we apply the proposed bounds to find a lower bound on the probability of true positives and an upper bound on the probability of false positives, and optimize them using a loss function obtained by combining the bounds. We demonstrate that the resultant loss function is a function of the signal to noise ratio both within and across logits. We empirically evaluate our proposals to show their benefit for classification problems.