AITopics | Performance Analysis

Performance Analysis

Treatment Policy Learning in Multiobjective Settings with Fully Observed Outcomes

Boominathan, Soorajnath, Oberst, Michael, Zhou, Helen, Kanjilal, Sanjat, Sontag, David

arXiv.org Machine Learning, Aug-12-2020

In several medical decision-making problems, such as antibiotic prescription, laboratory testing can provide precise indications for how a patient will respond to different treatment options. This enables us to "fully observe" all potential treatment outcomes, but while present in historical data, these results are infeasible to produce in real-time at the point of the initial treatment decision. Moreover, treatment policies in these settings often need to trade off between multiple competing objectives, such as effectiveness of treatment and harmful side effects. We present, compare, and evaluate three approaches for learning individualized treatment policies in this setting: First, we consider two indirect approaches, which use predictive models of treatment response to construct policies optimal for different trade-offs between objectives. Second, we consider a direct approach that constructs such a set of policies without intermediate models of outcomes. Using a medical dataset of Urinary Tract Infection (UTI) patients, we show that all approaches learn policies that achieve strictly better performance on all outcomes than clinicians, while also trading off between different objectives. We demonstrate additional benefits of the direct approach, including flexibly incorporating other goals such as deferral to physicians on simple cases.

artificial intelligence, data mining, machine learning, (18 more...)

2006.00927

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Coming to grips with actual false positive and false negative rates - Ai

#artificialintelligence, Aug-11-2020, 05:50:33 GMT

While $12.7 billion of this figure goes to another merchant when a customer is turned away, it must be noted that false declines "are also making for a less efficient digital economy". This is because "$7.6 billion of potential spending never came about as the shopper lost interest". In the same report, a senior industry executive pointed out that revisiting risk appetite is vital. Also, a "lot of sins can be hidden in the name of fraud prevention, because fraud teams aren't always incentivised to have a very rigorous statistical measure of false positives and false negatives". "Many companies just don't want to get on the MasterCard and Visa chargeback programmes, and that's the guiding principle."
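To make the "rigorous statistical measure" mentioned above concrete, the following is a minimal sketch of how false positive and false negative rates are computed from confusion counts. The function name and the example counts are hypothetical and are not taken from the report.

```python
def error_rates(tp, fp, tn, fn):
    """Return (false_positive_rate, false_negative_rate) from confusion counts.

    Here a "positive" is a transaction flagged as fraud.
    """
    # FPR: share of legitimate transactions that were wrongly declined.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    # FNR: share of fraudulent transactions that were wrongly approved.
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr


# Hypothetical counts, for illustration only.
fpr, fnr = error_rates(tp=80, fp=300, tn=9600, fn=20)
print(f"false positive rate: {fpr:.4f}")
print(f"false negative rate: {fnr:.4f}")
```

Reporting both rates together matters because a fraud team can lower one simply by raising the other.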

artificial intelligence, false negative rate, machine learning, (4 more...)

Industry: Information Technology (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Semantic Clone Detection via Probabilistic Software Modeling

Thaller, Hannes, Linsbauer, Lukas, van Bladel, Brent, Egyed, Alexander

arXiv.org Artificial Intelligence, Aug-11-2020

Semantic clone detection is the process of finding program elements with similar or equal runtime behavior. An example is detecting the semantic equality between the recursive and iterative implementations of the factorial computation. Semantic clone detection is the de facto technical boundary of clone detectors. This boundary has been tested in recent years with interesting new approaches. This work contributes a semantic clone detection approach that detects clones with 0% syntactic similarity. We present Semantic Clone Detection via Probabilistic Software Modeling (SCD-PSM) as a stable and precise solution to semantic clone detection. PSM builds a probabilistic model of a program that is capable of evaluating and generating runtime data. SCD-PSM leverages this model and its model elements to find behaviorally equal model elements. This behavioral equality is then generalized to semantic equality of the original program elements. It uses the likelihood between model elements as a distance metric. Then, it employs the likelihood ratio significance test to decide whether this distance is significant, given a pre-specified and controllable false-positive rate. The outputs of SCD-PSM are pairs of program elements (i.e., methods), their distance, and a decision on whether they are clones. SCD-PSM yields excellent results with a Matthews Correlation Coefficient greater than 0.9. These results are obtained on classical semantic clone detection problems such as detecting recursive and iterative versions of an algorithm, but also on complex problems used in coding competitions.
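For readers unfamiliar with the metric cited above, here is a minimal sketch of the standard Matthews Correlation Coefficient for a binary clone / not-clone decision. The function name and the counts in the usage line are hypothetical. This is the textbook formula, not code from the paper.

```python
import math


def matthews_corrcoef(tp, fp, tn, fn):
    """Matthews Correlation Coefficient from binary confusion counts.

    Ranges from -1 (total disagreement) through 0 (chance level)
    to +1 (perfect prediction).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return numerator / denominator if denominator else 0.0


# Hypothetical counts: 45 clone pairs found, 3 missed, 2 false alarms.
print(round(matthews_corrcoef(tp=45, fp=2, tn=50, fn=3), 3))
```

Unlike plain accuracy, the coefficient stays informative when clone pairs are much rarer than non-clone pairs.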

artificial intelligence, clone, machine learning, (15 more...)

2008.04891

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(10 more...)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Measles Rash Identification Using Residual Deep Convolutional Neural Network

Glock, Kimberly, Napier, Charlie, Louie, Andre, Gary, Todd, Gigante, Joseph, Schaffner, William, Wang, Qingguo

arXiv.org Artificial Intelligence, Aug-11-2020

Measles is extremely contagious and is one of the leading causes of vaccine-preventable illness and death in developing countries, claiming more than 100,000 lives each year. Measles was declared eliminated in the US in 2000 due to decades of successful vaccination against the disease. As a result, an increasing number of US healthcare professionals and members of the public have never seen the disease. Unfortunately, measles resurged in the US in 2019 with 1,282 confirmed cases. To assist in diagnosing measles, we collected more than 1,300 images of a variety of skin conditions, with which we trained a residual deep convolutional neural network to distinguish measles rash from other skin conditions, with the aim of creating a phone application in the future. On our image dataset, our model reaches a classification accuracy of 95.2%, sensitivity of 81.7%, and specificity of 97.1%, indicating the model is effective in facilitating an accurate detection of measles to help contain measles outbreaks.
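The three reported figures are linked by the identity accuracy = p × sensitivity + (1 − p) × specificity, where p is the share of positive (measles) images in the evaluation data. The sketch below solves for p. It assumes all three figures were measured on the same image set and ignores rounding in the reported values, so the result is only a rough implied estimate, not a number stated by the authors. The function name is hypothetical.

```python
def implied_positive_share(accuracy, sensitivity, specificity):
    """Solve accuracy = p*sensitivity + (1 - p)*specificity for p."""
    if specificity == sensitivity:
        raise ValueError("share is not identifiable when the two rates are equal")
    return (specificity - accuracy) / (specificity - sensitivity)


# Figures as reported in the abstract above.
p = implied_positive_share(accuracy=0.952, sensitivity=0.817, specificity=0.971)
print(f"implied share of measles images: {p:.3f}")
```

Under these assumptions the positive class is a small minority of the images, which is why overall accuracy sits much closer to specificity than to sensitivity.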

artificial intelligence, machine learning, rash, (16 more...)

doi: 10.1109/BigData52589.2021.9671333

2005.09112

Country:

North America > United States > Tennessee > Davidson County > Nashville (0.04)
Asia (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Feature Ranking for Semi-supervised Learning

Petković, Matej, Džeroski, Sašo, Kocev, Dragi

arXiv.org Machine Learning, Aug-10-2020

The data made available for analysis are becoming more and more complex along several directions: high dimensionality, number of examples and the number of labels per example. This poses a variety of challenges for the existing machine learning methods: coping with datasets that have a large number of examples described in a high-dimensional space, where not all examples have labels provided. For example, when investigating the toxicity of chemical compounds, there are a lot of compounds available that can be described with information-rich high-dimensional representations, but not all of the compounds have information on their toxicity. To address these challenges, we propose semi-supervised learning of feature ranking. The feature rankings are learned in the context of classification and regression as well as in the context of structured output prediction (multi-label classification, hierarchical multi-label classification and multi-target regression). To the best of our knowledge, this is the first work that treats the task of feature ranking within the semi-supervised structured output prediction context. More specifically, we propose two approaches that are based on tree ensembles and the Relief family of algorithms. The extensive evaluation across 38 benchmark datasets reveals the following: Random Forests perform the best for the classification-like tasks, while for the regression-like tasks Extra-PCTs perform the best; Random Forests are the most efficient method considering induction times across all tasks; and semi-supervised feature rankings outperform their supervised counterparts across a majority of the datasets from the different tasks.

artificial intelligence, classification, machine learning, (15 more...)

2008.03937

Country:

North America > United States (0.14)
Europe > Slovenia > Central Slovenia > Municipality of Ljubljana > Ljubljana (0.04)
Europe > United Kingdom > Wales > Ceredigion > Aberystwyth (0.04)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.67)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Hybrid Discriminative-Generative Training via Contrastive Learning

Liu, Hao, Abbeel, Pieter

arXiv.org Machine Learning, Aug-10-2020

Contrastive learning and supervised learning have both seen significant progress and success. However, thus far they have largely been treated as two separate objectives, brought together only by having a shared neural network. In this paper we show that through the perspective of hybrid discriminative-generative training of energy-based models we can make a direct connection between contrastive learning and supervised learning. Beyond presenting this unified view, we show our specific choice of approximation of the energy-based loss outperforms the existing practice in terms of classification accuracy of WideResNet on CIFAR-10 and CIFAR-100. It also leads to improved performance on robustness, out-of-distribution detection, and calibration.

artificial intelligence, inductive learning, machine learning, (16 more...)

2007.0907

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.89)
(2 more...)

Generalized and Scalable Optimal Sparse Decision Trees

Lin, Jimmy, Zhong, Chudi, Hu, Diane, Rudin, Cynthia, Seltzer, Margo

arXiv.org Machine Learning, Aug-10-2020

Decision tree optimization is notoriously difficult from a computational perspective but essential for the field of interpretable machine learning. Despite efforts over the past 40 years, only recently have optimization breakthroughs been made that have allowed practical algorithms to find optimal decision trees. These new techniques have the potential to trigger a paradigm shift where it is possible to construct sparse decision trees to efficiently optimize a variety of objective functions without relying on greedy splitting and pruning heuristics that often lead to suboptimal solutions. The contribution in this work is to provide a general framework for decision tree optimization that addresses the two significant open problems in the area: treatment of imbalanced data and fully optimizing over continuous variables. We present techniques that produce optimal decision trees over a variety of objectives including F-score, AUC, and partial area under the ROC convex hull. We also introduce a scalable algorithm that produces provably optimal results in the presence of continuous variables and speeds up decision tree construction by several orders of magnitude relative to the state of the art.
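As background for one of the objectives listed above, the sketch below computes AUC directly from its probabilistic definition: the chance that a randomly chosen positive example is scored above a randomly chosen negative one, with ties counted as one half. The function name and the sample scores are hypothetical. This illustrates the metric only and is unrelated to the paper's tree optimization algorithm.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (O(P*N) time)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical scores: 5 of the 6 positive/negative pairs are ordered correctly.
example_auc = auc([0.9, 0.8, 0.7, 0.4, 0.3], [1, 1, 0, 1, 0])
print(round(example_auc, 4))  # 0.8333
```

Because the value depends only on how positives rank against negatives, it does not change with the class ratio, which makes it a common objective for imbalanced data.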

artificial intelligence, leaves, machine learning, (15 more...)

2006.0869

Country:

North America > United States > Illinois (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre:

Research Report (0.63)
Workflow (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

How to Evaluate the Performance of Your Machine Learning Model

#artificialintelligence, Aug-9-2020, 07:10:51 GMT

Let me start with a very simple example. Robin and Sam both started preparing for an entrance exam for engineering college. They shared a room and put in an equal amount of hard work while solving numerical problems. They both studied almost the same hours for the entire year and appeared for the final exam. Surprisingly, Robin cleared it but Sam did not.

artificial intelligence, machine learning, probability score, (14 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Network Medicine Framework for Identifying Drug Repurposing Opportunities for COVID-19

Gysi, Deisy Morselli, Valle, Ítalo Do, Zitnik, Marinka, Ameli, Asher, Gan, Xiao, Varol, Onur, Ghiassian, Susan Dina, Patten, JJ, Davey, Robert, Loscalzo, Joseph, Barabási, Albert-László

arXiv.org Machine Learning, Aug-9-2020

The current pandemic has highlighted the need for methodologies that can quickly and reliably prioritize clinically approved compounds for their potential effectiveness for SARS-CoV-2 infections. In the past decade, network medicine has developed and validated multiple predictive algorithms for drug repurposing, exploiting the sub-cellular network-based relationship between a drug's targets and disease genes. Here, we deployed algorithms relying on artificial intelligence, network diffusion, and network proximity, tasking each of them to rank 6,340 drugs for their expected efficacy against SARS-CoV-2. To test the predictions, we used as ground truth 918 drugs that had been experimentally screened in VeroE6 cells, and the list of drugs under clinical trial, which captures the medical community's assessment of drugs with potential COVID-19 efficacy. We find that while most algorithms offer predictive power for these ground truth data, no single method offers consistently reliable outcomes across all datasets and metrics. This prompted us to develop a multimodal approach that fuses the predictions of all algorithms, showing that a consensus among the different predictive methods consistently exceeds the performance of the best individual pipelines. We find that 76 of the 77 drugs that successfully reduced viral infection do not bind the proteins targeted by SARS-CoV-2, indicating that these drugs rely on network-based actions that cannot be identified using docking-based strategies. These advances offer a methodological pathway to identify repurposable drugs for future pathogens and neglected diseases underserved by the costs and extended timeline of de novo drug development.
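The idea of fusing several ranked lists into a consensus can be illustrated with a Borda count, a simple rank aggregation scheme. This is a stand-in chosen for brevity: the abstract does not say which fusion rule the authors use, and the function name and drug identifiers below are hypothetical.

```python
def borda_fuse(rankings):
    """Fuse several best-first rankings of the same items into one ordering.

    Each ranking awards n points to its top item, n - 1 to the next,
    and so on; items are then sorted by total points.
    """
    totals = {}
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            totals[item] = totals.get(item, 0) + (n - position)
    # Sort by descending points; break ties alphabetically for determinism.
    return sorted(totals, key=lambda item: (-totals[item], item))


# Three hypothetical pipelines ranking three hypothetical drugs.
pipelines = [
    ["drug_a", "drug_b", "drug_c"],
    ["drug_b", "drug_a", "drug_c"],
    ["drug_a", "drug_c", "drug_b"],
]
consensus = borda_fuse(pipelines)
print(consensus)  # ['drug_a', 'drug_b', 'drug_c']
```

A consensus of this kind rewards items that rank consistently well across methods, so one pipeline's outlier has limited influence.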

pipeline, prediction, protein, (17 more...)

2004.07229

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Asia > China > Hubei Province > Wuhan (0.04)
North America > United States > Virginia > Manassas (0.04)
(9 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Step by step guide to explaining your ML project during a data science interview.

#artificialintelligence, Aug-8-2020, 03:30:32 GMT

This is Part 2 of the Interview Question series that I recently started. In Part 1, we talked about another important data science interview question pertaining to scaling your ML model. Be sure to check that out! A typical open-ended question that often comes up during interviews (both first and second round) is related to your personal (or side) projects. And trust me when I say this, this question is the best thing that can happen to you during an interview.

artificial intelligence, dataset, machine learning, (17 more...)

Genre: Personal > Interview (1.00)

Industry:

Information Technology > Services (0.69)
Health & Medicine > Therapeutic Area (0.48)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.31)
