 Accuracy


Machine Learning Basics with Naive Bayes

#artificialintelligence

After researching the different algorithms associated with Machine Learning, I've found that there is an abundance of great material showing you how to use certain algorithms in a specific language. However, what's usually missing is a simple mathematical explanation of how the algorithm works. This may not be possible in every case without a strong mathematical background, but where it is possible I know I would find it useful. This post requires just basic mathematics knowledge and an interest in data science and machine learning. I will be talking about Naive Bayes as a classifier, explaining in simple terms how it works and when you might use it.
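As a rough illustration of the idea, here is a minimal sketch of a Naive Bayes classifier for categorical features. It is not taken from the post: the function names, the toy weather data, and the Laplace smoothing parameter `alpha` are all assumptions made for the example. The classifier scores each class by its prior probability multiplied by the probability of each feature value given that class, treating the features as independent.

```python
from collections import Counter, defaultdict
import math


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    vocab = defaultdict(set)             # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            vocab[i].add(value)
    return class_counts, value_counts, vocab


def predict(model, row, alpha=1.0):
    """Return the class with the highest log posterior (up to a constant)."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        # log prior: how common the class is in the training data
        score = math.log(count / total)
        for i, value in enumerate(row):
            # Laplace-smoothed likelihood of this feature value given the class
            numerator = value_counts[(i, label)][value] + alpha
            denominator = count + alpha * len(vocab[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: (outlook, temperature) -> did it rain later?
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")]
labels = ["no", "no", "yes", "yes"]
model = train_naive_bayes(rows, labels)
prediction = predict(model, ("rainy", "cool"))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied together, and the smoothing keeps a feature value unseen for a class from forcing that class's probability to zero.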


Sparsity-driven weighted ensemble classifier

arXiv.org Machine Learning

In this letter, a novel weighted ensemble classifier is proposed that improves classification accuracy and minimizes the number of classifiers. The ensemble weight finding problem is modeled as a cost function with the following terms: (a) a data fidelity term aiming to decrease the misclassification rate, (b) a sparsity term aiming to decrease the number of classifiers, and (c) a non-negativity constraint on the weights of the classifiers. The proposed cost function is non-convex and hard to solve; thus, convex relaxation techniques and novel approximations are employed to obtain a numerically efficient solution. The proposed method achieves better or similar performance compared to state-of-the-art classifier ensemble methods, while using a lower number of classifiers.
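The abstract does not give the cost function itself, so the following is only a generic sketch of how the three listed terms are commonly combined. All symbols are assumptions: $H$ is a matrix whose columns hold the outputs of the $M$ individual classifiers, $y$ is the label vector, $w$ is the weight vector, and $\lambda$ trades accuracy against sparsity. The letter's actual formulation and approximations may differ.

```latex
% Non-convex form: squared-error data fidelity, l0 sparsity, non-negative weights
\min_{w \in \mathbb{R}^{M}} \;
  \underbrace{\lVert H w - y \rVert_2^2}_{\text{data fidelity}}
  + \lambda \underbrace{\lVert w \rVert_0}_{\text{sparsity}}
  \quad \text{subject to } w \ge 0

% A standard convex relaxation replaces the l0 term with the l1 norm
\min_{w \in \mathbb{R}^{M}} \;
  \lVert H w - y \rVert_2^2 + \lambda \lVert w \rVert_1
  \quad \text{subject to } w \ge 0
```

Classifiers whose weight is driven to zero drop out of the ensemble, which is how a sparsity term can reduce the number of classifiers used.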


Inherent Trade-Offs in the Fair Determination of Risk Scores

arXiv.org Machine Learning

Recent discussion in the public sphere about algorithmic classification has involved tension between competing notions of what it means for a probabilistic classification to be fair to different groups. We formalize three fairness conditions that lie at the heart of these debates, and we prove that except in highly constrained special cases, there is no method that can satisfy these three conditions simultaneously. Moreover, even satisfying all three conditions approximately requires that the data lie in an approximate version of one of the constrained special cases identified by our theorem. These results suggest some of the ways in which key notions of fairness are incompatible with each other, and hence provide a framework for thinking about the trade-offs between them.


When an AI machine studied declassified State Department cables, it found secrets that should have been confidential

#artificialintelligence

The U.S. State Department generates some two billion e-mails every year. A significant fraction of these contain sensitive or secret information and so have to be classified, a process that is time-consuming and costly. In 2015 alone, it spent $16 billion to protect classified information. But the reliability of this process of classification is unclear. Nobody knows whether the rules for classifying information are applied consistently and reliably.


Classifier comparison using precision

arXiv.org Machine Learning

Newly proposed models are often compared to the state of the art using statistical significance testing. Literature is scarce for classifier comparison using metrics other than accuracy. We present a survey of statistical methods that can be used for classifier comparison using precision, accounting for inter-precision correlation arising from the use of the same dataset. Comparisons are made using per-class precision, and methods are presented to test the global null hypothesis of an overall model comparison. Comparisons are extended to multiple multi-class classifiers and to models using cross validation or its variants. A partial Bayesian update to precision is introduced for when the population prevalence of a class is known. Applications to comparing deep architectures are studied.
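For readers unfamiliar with the metric, per-class precision is the fraction of predictions for a class that are correct. The sketch below shows only that basic quantity on made-up labels; it does not implement any of the statistical tests surveyed in the paper, and the function name and data are assumptions for illustration.

```python
from collections import Counter


def per_class_precision(y_true, y_pred):
    """Precision per class: correct predictions of c / all predictions of c."""
    predicted = Counter(y_pred)
    correct = Counter(p for t, p in zip(y_true, y_pred) if t == p)
    # Only classes that were predicted at least once have a defined precision
    return {c: correct[c] / predicted[c] for c in predicted}


# Hypothetical labels for a three-class problem
y_true = ["a", "a", "b", "b", "c"]
y_pred = ["a", "b", "b", "b", "c"]
precision = per_class_precision(y_true, y_pred)
```

When two classifiers are evaluated on the same test set, their per-class precisions are computed from overlapping examples, which is the source of the correlation the abstract says must be accounted for.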


Extending Detection with Forensic Information

arXiv.org Machine Learning

For over a quarter century, security-relevant detection has been driven by models learned from input features collected from real or simulated environments. An artifact (e.g., network event, potential malware sample, suspicious email) is deemed malicious or non-malicious based on its similarity to the learned model at run-time. However, the training of the models has been historically limited to only those features available at run time. In this paper, we consider an alternate model construction approach that trains models using forensic "privileged" information--features available at training time but not at runtime--to improve the accuracy and resilience of detection systems. In particular, we adapt and extend recent advances in knowledge transfer, model influence, and distillation to enable the use of forensic data in a range of security domains. Our empirical study shows that privileged information increases detection precision and recall over a system with no privileged information: we observe up to 7.7% relative decrease in detection error for fast-flux bot detection, 8.6% for malware traffic detection, 7.3% for malware classification, and 16.9% for face recognition. We explore the limitations and applications of different privileged information techniques in detection systems. Such techniques open the door to systems that can integrate forensic data directly into detection models, and therein provide a means to fully exploit the information available about past security-relevant events.


Pose-Selective Max Pooling for Measuring Similarity

arXiv.org Artificial Intelligence

In this paper, we deal with two challenges for measuring the similarity of the subject identities in practical video-based face recognition - the variation of the head pose in uncontrolled environments and the computational expense of processing videos. Since the frame-wise feature mean is unable to characterize the pose diversity among frames, we define and preserve the overall pose diversity and closeness in a video. Then, identity will be the only source of variation across videos since the pose varies even within a single video. Instead of simply using all the frames, we select those faces whose pose point is closest to the centroid of the K-means cluster containing that pose point. Then, we represent a video as a bag of frame-wise deep face features while the number of features has been reduced from hundreds to K. Since the video representation can well represent the identity, now we measure the subject similarity between two videos as the max correlation among all possible pairs in the two bags of features. On the official 5,000 video-pairs of the YouTube Face dataset for face verification, our algorithm achieves a comparable performance with VGG-face that averages over deep features of all frames. Other vision tasks can also benefit from the generic idea of employing geometric cues to improve the descriptiveness of deep features.
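The final matching step described above, taking the maximum correlation over all pairs of features from two bags, can be sketched as follows. This is an illustration only: cosine similarity is assumed as the correlation measure, the two-dimensional vectors stand in for deep face features, and the K-means frame selection that builds each bag is not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def video_similarity(bag_a, bag_b):
    """Max pooling over the similarities of all cross-bag feature pairs."""
    return max(cosine(u, v) for u in bag_a for v in bag_b)


# Hypothetical bags of K = 2 frame-wise features per video
bag_a = [(1.0, 0.0), (0.0, 1.0)]
bag_b = [(0.0, 2.0), (1.0, 1.0)]
similarity = video_similarity(bag_a, bag_b)
```

Because each bag holds only K features rather than hundreds of frames, the number of pairs to compare is K squared, which keeps the exhaustive comparison cheap.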


Dual Teaching: A Practical Semi-supervised Wrapper Method

arXiv.org Machine Learning

Semi-supervised wrapper methods are concerned with building effective supervised classifiers from partially labeled data. Though previous works have succeeded in some fields, it is still difficult to apply semi-supervised wrapper methods in practice because the assumptions those methods rely on tend to be unrealistic. For practical use, this paper proposes a novel semi-supervised wrapper method, Dual Teaching, whose assumptions are easy to satisfy. Dual Teaching adopts two external classifiers to estimate the false positives and false negatives of the base learner. Only if the recall of every external classifier is greater than zero and the sum of the precisions is greater than one will Dual Teaching train a base learner from partially labeled data as effectively as a classifier trained on fully labeled data. The effectiveness of Dual Teaching is demonstrated in both theory and practice.


WWE Survivor Series 2016: Multiple Title Changes Likely For PPV?

International Business Times

WWE Survivor Series 2016 could be the impetus for major changes on "Monday Night Raw" and "SmackDown." While the WWE Universal Championship and WWE World Championship won't be defended at the pay-per-view, the event on Nov. 20 could feature multiple title changes. It appears that the Intercontinental Championship and Cruiserweight Championship will be the only two belts on the line at the PPV. The stipulation added to both title matches indicates that two new champions have a good chance to emerge in Toronto. When WWE's brand split became official with the draft on July 19, the IC Title became exclusive to "SmackDown."


Bias in ML, and Teaching AI

#artificialintelligence

Yesterday I gave a super duper high-level 12-minute presentation about some issues of bias in AI. I should emphasize (if it's not clear) that this is something I am not an expert in; most of what I know comes from reading great papers by other people (there is a completely non-academic sample at the end of this post). This blog post is a variant of that presentation. Structure: most of the images below are prompts for talking points, which are generally written below the corresponding image. I think I managed to link all the images to the original source (let me know if I missed one!).

Automated Decision Making is Part of Our Lives

To me, AI is largely the study of automated decision making, and the investment therein has been growing at a dramatic rate. The last time I taught this class was in 2012. The amount that's changed since then is incredible.