AITopics

2007.0907

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.89)

arXiv.org Machine Learning Aug-10-2020

Generalized and Scalable Optimal Sparse Decision Trees

Lin, Jimmy, Zhong, Chudi, Hu, Diane, Rudin, Cynthia, Seltzer, Margo

Decision tree optimization is notoriously difficult from a computational perspective but essential for the field of interpretable machine learning. Despite efforts over the past 40 years, only recently have optimization breakthroughs been made that have allowed practical algorithms to find optimal decision trees. These new techniques have the potential to trigger a paradigm shift where it is possible to construct sparse decision trees to efficiently optimize a variety of objective functions without relying on greedy splitting and pruning heuristics that often lead to suboptimal solutions. The contribution of this work is to provide a general framework for decision tree optimization that addresses the two significant open problems in the area: treatment of imbalanced data and fully optimizing over continuous variables. We present techniques that produce optimal decision trees over a variety of objectives including F-score, AUC, and partial area under the ROC convex hull. We also introduce a scalable algorithm that produces provably optimal results in the presence of continuous variables and speeds up decision tree construction by several orders of magnitude relative to the state of the art.

artificial intelligence, leaves, machine learning

2006.0869

Country:

North America > United States > Illinois (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre:

Research Report (0.63)
Workflow (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

#artificialintelligence Aug-9-2020, 07:10:51 GMT

How to Evaluate the Performance of Your Machine Learning Model

Let me start with a very simple example. Robin and Sam both started preparing for an entrance exam for an engineering college. They shared a room and put in an equal amount of hard work solving numerical problems. They studied for almost the same number of hours throughout the year and both took the final exam. Surprisingly, Robin cleared it but Sam did not.

artificial intelligence, machine learning, probability score

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

arXiv.org Machine Learning Aug-9-2020

Network Medicine Framework for Identifying Drug Repurposing Opportunities for COVID-19

Gysi, Deisy Morselli, Valle, Ítalo Do, Zitnik, Marinka, Ameli, Asher, Gan, Xiao, Varol, Onur, Ghiassian, Susan Dina, Patten, JJ, Davey, Robert, Loscalzo, Joseph, Barabási, Albert-László

The current pandemic has highlighted the need for methodologies that can quickly and reliably prioritize clinically approved compounds for their potential effectiveness for SARS-CoV-2 infections. In the past decade, network medicine has developed and validated multiple predictive algorithms for drug repurposing, exploiting the sub-cellular network-based relationship between a drug's targets and disease genes. Here, we deployed algorithms relying on artificial intelligence, network diffusion, and network proximity, tasking each of them to rank 6,340 drugs for their expected efficacy against SARS-CoV-2. To test the predictions, we used as ground truth 918 drugs that had been experimentally screened in VeroE6 cells, and the list of drugs under clinical trial, that capture the medical community's assessment of drugs with potential COVID-19 efficacy. We find that while most algorithms offer predictive power for these ground truth data, no single method offers consistently reliable outcomes across all datasets and metrics. This prompted us to develop a multimodal approach that fuses the predictions of all algorithms, showing that a consensus among the different predictive methods consistently exceeds the performance of the best individual pipelines. We find that 76 of the 77 drugs that successfully reduced viral infection do not bind the proteins targeted by SARS-CoV-2, indicating that these drugs rely on network-based actions that cannot be identified using docking-based strategies. These advances offer a methodological pathway to identify repurposable drugs for future pathogens and neglected diseases underserved by the costs and extended timeline of de novo drug development.

pipeline, prediction, protein

2004.07229

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Asia > China > Hubei Province > Wuhan (0.04)
North America > United States > Virginia > Manassas (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Rostami, Mehrdad, Berahmand, Kamal, Forouzandeh, Saman

A Novel Community Detection Based Genetic Algorithm for Feature Selection

arXiv.org Machine Learning Aug-8-2020

Feature selection is an essential data preprocessing stage in data mining. Its core principle is to pick a subset of the available features by excluding features with almost no predictive information as well as highly correlated, redundant features. In the past several years, a variety of meta-heuristic methods have been introduced to eliminate redundant and irrelevant features as much as possible from high-dimensional datasets. One of the main disadvantages of present meta-heuristic based approaches is that they often neglect the correlation between the selected features. In this article, the authors propose a genetic algorithm for feature selection based on community detection, which works in three steps. In the first step, the feature similarities are calculated. In the second step, the features are grouped into clusters by community detection algorithms. In the third step, features are picked by a genetic algorithm with a new community-based repair operation. The presented approach was evaluated on nine benchmark classification problems. The authors also compared its performance with that of four existing feature selection algorithms. The findings indicate that the new approach consistently yields improved classification accuracy.

evolutionary algorithm, machine learning, selection

2008.03543

Country:

Oceania > New Zealand > North Island > Waikato (0.04)
Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
Oceania > Australia > Queensland > Brisbane (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.36)

arXiv.org Machine Learning Aug-8-2020

Description and Discussion on DCASE2020 Challenge Task2: Unsupervised Anomalous Sound Detection for Machine Condition Monitoring

Koizumi, Yuma, Kawaguchi, Yohei, Imoto, Keisuke, Nakamura, Toshiki, Nikaido, Yuki, Tanabe, Ryo, Purohit, Harsh, Suefusa, Kaori, Endo, Takashi, Yasuda, Masahiro, Harada, Noboru

In this paper, we present the task description and discuss the results of the DCASE 2020 Challenge Task 2: Unsupervised Detection of Anomalous Sounds for Machine Condition Monitoring. The goal of anomalous sound detection (ASD) is to identify whether the sound emitted from a target machine is normal or anomalous. The main challenge of this task is to detect unknown anomalous sounds under the condition that only normal sound samples have been provided as training data. We have designed this challenge as the first benchmark of ASD research, which includes a large-scale dataset, evaluation metrics, and a simple baseline system. We received 117 submissions from 40 teams, and several novel approaches have been developed as a result of this challenge. On the basis of the analysis of the evaluation results, we discuss two new approaches and their problems.

data mining, detection, machine learning

2006.05822

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.16)
North America > United States (0.05)
North America > Cuba > Holguín Province > Holguín (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.95)
Information Technology > Data Science > Data Mining > Anomaly Detection (0.74)

#artificialintelligence Aug-7-2020, 10:15:57 GMT

ROC Curve and AUC -- Explained

The ROC (receiver operating characteristic) curve and AUC (area under the curve) are performance measures that provide a comprehensive evaluation of classification models. AUC turns the ROC curve into a numeric representation of performance for a binary classifier. AUC is the area under the ROC curve and takes a value between 0 and 1. AUC indicates how successful a model is at separating positive and negative classes. Before going into detail, let's first explain the confusion matrix and how different threshold values change its outcome. A confusion matrix is not a metric to evaluate a model, but it provides insight into the predictions.
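
To make the threshold effect concrete, here is a minimal Python sketch that is not taken from the article. It assumes that a score at or above the threshold is predicted positive; the function names `confusion_counts` and `pairwise_auc` and the toy labels and scores are illustrative only. The second function uses the standard reading of AUC as the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as half.

```python
def confusion_counts(y_true, scores, threshold):
    """Return (tp, fp, tn, fn); scores >= threshold are predicted positive."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def pairwise_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Assumes at least one positive and one negative label.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): two negatives followed by two positives.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

print(confusion_counts(y_true, scores, 0.3))  # (2, 1, 1, 0)
print(confusion_counts(y_true, scores, 0.5))  # (1, 0, 2, 1)
print(pairwise_auc(y_true, scores))           # 0.75
```

Raising the threshold from 0.3 to 0.5 removes the false positive but introduces a false negative, which is the trade-off that the ROC curve summarizes across all thresholds.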

artificial intelligence, machine learning, threshold value

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Stierle, Matthias, Weinzierl, Sven, Harl, Maximilian, Matzner, Martin

A Technique for Determining Relevance Scores of Process Activities using Graph-based Neural Networks

arXiv.org Artificial Intelligence Aug-7-2020

A central role in process improvement is played by the process analyst [2], who is responsible for 'monitoring, measuring, and providing feedback on the performance of a business process' [3, p.45]. The ongoing implementation of information systems in organisations, along with the subsequently enhanced availability of event log data, has enabled process analysts to discover as-is models of processes with process mining with relative ease [4]. However, the crucial challenge lies in identifying potential areas for process improvements (i.e., process analysis) with respect to a strategic goal [5]; this requires analytical capabilities such as Pareto or root cause analysis [2]. A business process can be defined as a 'completely closed, timely, and logical sequence of activities' [6, p.3] that realises an outcome valuable to a customer [7]. The effectiveness (i.e., customer value) and efficiency (e.g., timely, logical sequence, resource utilisation) of a business process are monitored using key performance indicators (KPIs) as aggregated measures of process outcomes; in the context of BPM, these are often referred to as process performance indicators (PPIs) [8]. Thus, to improve a business process, it is essential for a process analyst to understand the relevance of individual process activities in terms of their impact on the dimensions expressed by these performance measures.

artificial intelligence, data mining, machine learning

2008.0311

Country:

Europe > Netherlands > North Brabant > Eindhoven (0.04)
Europe > Germany > Bavaria > Middle Franconia > Nuremberg (0.04)

Genre:

Workflow (1.00)
Research Report > New Finding (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

#artificialintelligence Aug-6-2020, 20:09:10 GMT

ROC Curve in Machine Learning

The Receiver Operating Characteristic (ROC) curve is a popular tool used with binary classifiers. It is very similar to the precision/recall curve, but instead of plotting precision versus recall, the ROC curve plots the true positive rate (TPR, another name for recall) against the false positive rate (FPR). The FPR is the ratio of negative instances that are incorrectly classified as positive. It is equal to 1 minus the true negative rate (TNR), which is the ratio of negative instances that are correctly classified as negative.
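
The definitions above can be sketched in a few lines of Python. This is an illustrative sketch rather than code from the article: the names `rates` and `roc_points`, the toy data, and the convention that a score at or above the threshold counts as a positive prediction are all assumptions.

```python
def rates(y_true, scores, threshold):
    """Return (tpr, fpr, tnr); scores >= threshold are predicted positive.

    Assumes y_true contains at least one positive (1) and one negative (0).
    """
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    positives = sum(y_true)
    negatives = len(y_true) - positives
    tpr = tp / positives                # recall
    fpr = fp / negatives                # negatives wrongly flagged as positive
    tnr = (negatives - fp) / negatives  # negatives correctly left as negative
    return tpr, fpr, tnr


def roc_points(y_true, scores):
    """Return (fpr, tpr) points, one per distinct score used as a threshold."""
    points = [(0.0, 0.0)]  # a threshold above every score predicts no positives
    for threshold in sorted(set(scores), reverse=True):
        tpr, fpr, _ = rates(y_true, scores, threshold)
        points.append((fpr, tpr))
    return points


# Toy data (hypothetical): two negatives followed by two positives.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

tpr, fpr, tnr = rates(y_true, scores, 0.4)
print(tpr, fpr, tnr)               # 0.5 0.5 0.5
print(roc_points(y_true, scores))
# [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
```

Each point pairs the FPR on the horizontal axis with the TPR on the vertical axis, so lowering the threshold moves the curve from (0, 0) toward (1, 1).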

artificial intelligence, classifier, machine learning

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

#artificialintelligence Aug-6-2020, 16:46:00 GMT

Precision and Recall in Machine Learning

In Machine Learning, Precision and Recall are the two most important metrics for Model Evaluation. Precision is the percentage of the results returned by your model that are relevant. Recall is the percentage of all relevant results that your machine learning algorithm classifies correctly. In this article, I will show you how you can apply Precision and Recall to evaluate the performance of your Machine Learning model. See Full Article -- thecleverprogrammer.com.
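
As a rough illustration of these two definitions (a sketch under assumptions, not the code from the linked article), precision and recall can be computed from binary labels as follows. The function name `precision_recall` and the toy labels are hypothetical, and the sketch assumes the model predicts at least one positive and the data contains at least one positive.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp)  # share of predicted positives that are relevant
    recall = tp / (tp + fn)     # share of relevant items that were found
    return precision, recall


# Toy labels (hypothetical): four relevant items, three predicted positive.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 1, 1, 0, 0, 0]

precision, recall = precision_recall(y_true, y_pred)
print(round(precision, 3), recall)  # 0.667 0.5
```

Here two of the three positive predictions are correct (precision 2/3), while only two of the four relevant items are retrieved (recall 1/2).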

artificial intelligence, machine learning, precision and recall

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)