 Performance Analysis


Artificial Intelligence Colonoscopy System Shows Promise

#artificialintelligence

Laird Harrison writes about science, health and culture. His work has appeared in national magazines, in newspapers, on public radio and on websites. He is at work on a novel about alternate realities in physics. Harrison teaches writing at the Writers Grotto.


Drowning in Data

#artificialintelligence

In 1945 the volume of human knowledge doubled every 25 years. Now, it doubles every 12 hours [1]. With our collective computational power rapidly increasing, vast amounts of data and our ability to assimilate them have seeded unprecedentedly fertile ground for innovation. Healthtech companies are rapidly sprouting from data-ridden soil at exponential rates. Cell-free DNA companies, once a rarity, are becoming ubiquitous. The genomics landscape, once dominated by the few, is being inundated by a slew of competitors. Grandiose claims of being able to diagnose 50 different cancers from a single blood sample, or to use AI to best dermatologists, radiologists, pathologists, etc., are being made at alarming rates. Accordingly, it's imperative to know how to assess these claims as fact or fiction, particularly when such claimants may employ "statistical misdirection". In this addition to "The Insider's Guide to Translational Medicine" we disarm perpetrators of statistical warfare of their greatest ...


Imbalanced Data? Stop Using ROC-AUC and Use AUPRC Instead

#artificialintelligence

The Receiver Operating Characteristic -- Area Under the Curve (ROC-AUC) measure is widely used to assess the performance of binary classifiers. However, sometimes, it is more appropriate to evaluate your classifier based on measuring the Area Under the Precision-Recall Curve (AUPRC). We will present a detailed comparison between these two measures, accompanied by empirical results and graphical illustrations. Scikit-learn experiments are also available in a corresponding notebook. I'll assume you're familiar with precision and recall and the elements of the confusion matrix (TP, FN, FP, TN).
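As a rough illustration of why the two measures can disagree on imbalanced data, the sketch below computes both on a toy ranking. It is not taken from the article's notebook: the data are invented, the function names are hypothetical, and average precision is used here as one common estimator of AUPRC, assuming all scores are distinct.

```python
def roc_auc(y_true, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Mean of the precision measured at the rank of each positive example.

    Assumes distinct scores; with ties the result depends on sort order.
    """
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            precision_sum += true_positives / rank
    return precision_sum / true_positives


# Toy data (an assumption for illustration): 100 examples, only 2 positives,
# which the classifier places at ranks 5 and 10.
y_true = [1 if rank in (5, 10) else 0 for rank in range(1, 101)]
scores = [(100 - rank) / 100 for rank in range(1, 101)]

auc = roc_auc(y_true, scores)
ap = average_precision(y_true, scores)
print(f"ROC-AUC = {auc:.3f}, average precision = {ap:.3f}")
```

On this toy ranking the ROC-AUC is high because each positive outranks most of the 98 negatives, while the average precision is low because the few negatives ranked above each positive already outnumber the positives retrieved.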


Ensembles for Uncertainty Estimation: Benefits of Prior Functions and Bootstrapping

arXiv.org Machine Learning

In machine learning, an agent needs to estimate uncertainty to efficiently explore and adapt and to make effective decisions. A common approach to uncertainty estimation maintains an ensemble of models. In recent years, several approaches have been proposed for training ensembles, and conflicting views prevail with regards to the importance of various ingredients of these approaches. In this paper, we aim to address the benefits of two ingredients -- prior functions and bootstrapping -- which have come into question. We show that prior functions can significantly improve an ensemble agent's joint predictions across inputs and that bootstrapping affords additional benefits if the signal-to-noise ratio varies across inputs. Our claims are justified by both theoretical and experimental results.


Spam Detection Using BERT

arXiv.org Artificial Intelligence

Abstract: Emails and SMS messages are the most popular tools in today's communications, and as the number of email and SMS users increases, the amount of spam increases as well. Spam is any kind of unwanted, unsolicited digital communication that gets sent out in bulk; spam emails and SMS messages cause major resource wastage by unnecessarily flooding network links. Although most spam originates with advertisers looking to push their products, some is much more malicious in intent, such as phishing emails that aim to trick victims into giving up sensitive information like website logins or credit card details. To counter spam, much research and effort has gone into building spam detectors that can classify messages and emails as spam or ham. In this research we build a spam detector using a pre-trained BERT model that classifies emails and messages by understanding their context. We trained our spam detector on multiple corpora, namely the SMS collection corpus, Enron corpus, SpamAssassin corpus, Ling-Spam corpus and SMS spam collection corpus; our spam detector performance was 98.62%, 97.83%, 99.13% and 99.28% respectively.


Identifying Cyber Threats Before They Happen: Deep Learning

#artificialintelligence

Crypto.com, Microsoft, NVidia, and Okta all got hacked this year. In some hacks, attackers are looking to take data, while some are just trying things out. Either way, it is in the interest of companies to patch up the holes in their security systems as more attackers are learning to take advantage of them. The project I am working on now is one to prevent cyber threats like these from happening. When a company is hacked, there is a lot at stake.


Never mind the metrics -- what about the uncertainty? Visualising confusion matrix metric distributions

arXiv.org Machine Learning

There are strong incentives to build models that demonstrate outstanding predictive performance on various datasets and benchmarks. We believe these incentives risk a narrow focus on models and on the performance metrics used to evaluate and compare them -- resulting in a growing body of literature to evaluate and compare metrics. This paper strives for a more balanced perspective on classifier performance metrics by highlighting their distributions under different models of uncertainty and showing how this uncertainty can easily eclipse differences in the empirical performance of classifiers. We begin by emphasising the fundamentally discrete nature of empirical confusion matrices and show how binary matrices can be meaningfully represented in a three dimensional compositional lattice, whose cross-sections form the basis of the space of receiver operating characteristic (ROC) curves. We develop equations, animations and interactive visualisations of the contours of performance metrics within (and beyond) this ROC space, showing how some are affected by class imbalance. We provide interactive visualisations that show the discrete posterior predictive probability mass functions of true and false positive rates in ROC space, and how these relate to uncertainty in performance metrics such as Balanced Accuracy (BA) and the Matthews Correlation Coefficient (MCC). Our hope is that these insights and visualisations will raise greater awareness of the substantial uncertainty in performance metric estimates that can arise when classifiers are evaluated on empirical datasets and benchmarks, and that classification model performance claims should be tempered by this understanding.
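For readers less familiar with the two metrics named in this abstract, here is a minimal sketch of Balanced Accuracy and the Matthews Correlation Coefficient computed from a binary confusion matrix. It uses the standard textbook definitions, not the paper's code; the function names and the example counts are assumptions made for illustration.

```python
import math


def balanced_accuracy(tp, fn, fp, tn):
    """Mean of the true positive rate and the true negative rate."""
    tpr = tp / (tp + fn)
    tnr = tn / (tn + fp)
    return (tpr + tnr) / 2


def matthews_corrcoef(tp, fn, fp, tn):
    """Matthews Correlation Coefficient from the four confusion matrix cells.

    Returns 0.0 when any marginal is empty (a common convention,
    adopted here as an assumption).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts: 10 positives and 90 negatives in the test set.
tp, fn, fp, tn = 8, 2, 10, 80
ba = balanced_accuracy(tp, fn, fp, tn)
mcc = matthews_corrcoef(tp, fn, fp, tn)
print(f"BA = {ba:.3f}, MCC = {mcc:.3f}")
```

Because the counts are small integers, moving a single example between cells shifts both metrics noticeably, which is the kind of discreteness the abstract draws attention to.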


CS 229 - Machine Learning Tips and Tricks Cheatsheet

#artificialintelligence

In the context of binary classification, here are the main metrics that are important to track in order to assess the performance of the model. Confusion matrix: The confusion matrix is used to get a more complete picture when assessing the performance of a model. ROC: The receiver operating characteristic curve, also noted ROC, is the plot of TPR versus FPR obtained by varying the threshold. Once the model has been chosen, it is trained on the entire dataset and tested on the unseen test set. Cross-validation: Cross-validation, also noted CV, is a method that is used to select a model that does not rely too much on the initial training set.
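The idea of tracing an ROC curve by varying the threshold can be sketched as follows. This is an illustrative, unoptimized version; the function name and the toy labels and scores are assumptions, not material from the cheatsheet.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, one per threshold, from strictest to loosest.

    An example is predicted positive when its score is >= the threshold.
    The curve starts at (0, 0), where nothing is predicted positive.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


# Toy labels and classifier scores (hypothetical).
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(y_true, scores)
for fpr, tpr in points:
    print(f"FPR = {fpr:.2f}, TPR = {tpr:.2f}")
```

Lowering the threshold can only add predicted positives, so both coordinates are non-decreasing along the list and the last point is always (1, 1).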


Using AI to Identify Automobiles in Hollywood Cinema

#artificialintelligence

Cars are central to the cinema in a variety of ways. While the railroad and trains were prominent during the silent era -- and in the westerns that continued to be produced well into the 1970s -- automobiles offer greater freedom of movement than trains do and thus offer greater cinematic possibilities. So extensive is this relationship that the car chase has almost become a mini-genre unto itself. Yet film scholars have not dedicated any work to exploring this subject in depth. But we can start by examining the relationship between cinema and transportation more broadly.


Can Occult Invasive Disease in Ductal Carcinoma In Situ Be Predicted Using Computer-extracted Mammographic Features? - PubMed

#artificialintelligence

Rationale and objectives: This study aimed to determine whether mammographic features assessed by radiologists and using computer algorithms are prognostic of occult invasive disease for patients showing ductal carcinoma in situ (DCIS) only in core biopsy. Materials and methods: In this retrospective study, we analyzed data from 99 subjects with DCIS (74 pure DCIS, 25 DCIS with occult invasion). We developed a computer-vision algorithm capable of extracting 113 features from magnification views in mammograms and combining these features to predict whether a DCIS case will be upstaged to invasive cancer at the time of definitive surgery. In comparison, we also built predictive models based on physician-interpreted features, which included histologic features extracted from biopsy reports and Breast Imaging Reporting and Data System-related mammographic features assessed by two radiologists. The generalization performance was assessed using leave-one-out cross validation with the receiver operating characteristic curve analysis.
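The evaluation scheme described here, leave-one-out cross validation followed by ROC analysis, can be sketched in general terms as below. This is not the study's pipeline: the nearest-centroid scorer, the single toy feature, and all function names are hypothetical stand-ins chosen to keep the example self-contained.

```python
def fit_centroids(xs, ys):
    """Toy 'model': the mean feature value of each class (hypothetical stand-in)."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return sum(pos) / len(pos), sum(neg) / len(neg)


def centroid_score(model, x):
    """Higher score means x lies closer to the positive centroid."""
    mu_pos, mu_neg = model
    return abs(x - mu_neg) - abs(x - mu_pos)


def leave_one_out_scores(xs, ys):
    """Score each case with a model fitted on all the other cases."""
    held_out = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        model = fit_centroids(train_x, train_y)
        held_out.append(centroid_score(model, xs[i]))
    return held_out


def roc_auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented, cleanly separable toy data; NOT data from the study.
features = [0.1, 0.2, 0.3, 0.4, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 0, 1, 1, 1]

scores = leave_one_out_scores(features, labels)
auc = roc_auc(labels, scores)
print(f"Leave-one-out ROC-AUC = {auc:.2f}")
```

Each case is scored by a model that never saw it, and the pooled held-out scores feed a single ROC analysis. With small samples such as the 99 subjects mentioned above, this keeps every case available for both fitting and testing.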