AITopics | Performance Analysis

Two-Sample Testing for Event Impacts in Time Series

arXiv.org Machine Learning, Jan-31-2020

In many application domains, time series are monitored to detect extreme events like technical faults, natural disasters, or disease outbreaks. Unfortunately, it is often non-trivial to select both a time series that is informative about events and a powerful detection algorithm: detection may fail because the detection algorithm is not suitable, or because there is no shared information between the time series and the events of interest. In this work, we thus propose a non-parametric statistical test for shared information between a time series and a series of observed events. Our test allows identifying time series that carry information on event occurrences without committing to a specific event detection methodology. In a nutshell, we test for divergences of the value distributions of the time series at increasing lags after event occurrences with a multiple two-sample testing approach. In contrast to related tests, our approach is applicable for time series over arbitrary domains, including multivariate numeric, strings or graphs. We perform a large-scale simulation study to show that it outperforms or is on par with related tests on our task for univariate time series. We also demonstrate the real-world applicability of our approach on datasets from social media and smart home environments.
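The idea of comparing value distributions at increasing lags after events can be sketched as follows. This is only an illustration under loud assumptions: the abstract does not state which two-sample statistic or multiple-testing correction the authors use, so the sketch swaps in a permutation test on the difference in means with a Bonferroni correction, which only works for univariate numeric series. The names `permutation_p_value` and `event_impact_test` and the synthetic data are hypothetical.

```python
import random


def permutation_p_value(sample_a, sample_b, n_permutations=1000, seed=0):
    """Two-sample permutation test on the absolute difference in means."""
    rng = random.Random(seed)
    n_a = len(sample_a)
    observed = abs(sum(sample_a) / n_a - sum(sample_b) / len(sample_b))
    pooled = list(sample_a) + list(sample_b)
    n_b = len(pooled) - n_a
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            hits += 1
    # The add-one correction keeps the p-value strictly positive.
    return (hits + 1) / (n_permutations + 1)


def event_impact_test(series, event_times, max_lag, alpha=0.05):
    """Run one two-sample test per lag; return (lag, p_value, significant)."""
    n_tests = max_lag + 1
    results = []
    for lag in range(n_tests):
        # Values observed exactly `lag` steps after any event.
        post_idx = {t + lag for t in event_times if t + lag < len(series)}
        post = [series[i] for i in sorted(post_idx)]
        rest = [series[i] for i in range(len(series)) if i not in post_idx]
        p_value = permutation_p_value(post, rest)
        # Bonferroni correction across the lags (an assumption of this sketch).
        results.append((lag, p_value, p_value < alpha / n_tests))
    return results


# Synthetic example (assumed data): events shift the series one step later.
rng = random.Random(42)
series = [rng.gauss(0.0, 1.0) for _ in range(200)]
event_times = list(range(5, 200, 10))
for t in event_times:
    series[t + 1] += 3.0

results = event_impact_test(series, event_times, max_lag=3)
for lag, p_value, significant in results:
    print(lag, round(p_value, 4), significant)
```

In this toy setup the shift is injected at lag 1, so that lag should be flagged. A test that handles strings or graphs, as the abstract describes, would need a different statistic than the difference in means.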

health & medicine, immunology, time series, (17 more...)

arXiv.org Machine Learning

2001.1193

Country:

North America > United States > New Jersey (0.28)
Europe > Germany (0.14)

Genre: Research Report > Experimental Study (0.69)

Industry:

Energy > Oil & Gas (0.46)
Health & Medicine > Epidemiology (0.34)

Technology:

Information Technology > Data Science (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.73)

Causal Structure Discovery from Distributions Arising from Mixtures of DAGs

Saeed, Basil, Panigrahi, Snigdha, Uhler, Caroline

arXiv.org Machine Learning, Jan-31-2020

We consider distributions arising from a mixture of causal models, where each model is represented by a directed acyclic graph (DAG). We provide a graphical representation of such mixture distributions and prove that this representation encodes the conditional independence relations of the mixture distribution. We then consider the problem of structure learning based on samples from such distributions. Since the mixing variable is latent, we consider causal structure discovery algorithms such as FCI that can deal with latent variables. We show that such algorithms recover a "union" of the component DAGs and can identify variables whose conditional distributions vary across the component DAGs. We demonstrate our results on synthetic and real data, showing that the inferred graph identifies nodes that vary between the different mixture components. As an immediate application, we demonstrate how retrieval of this causal information can be used to cluster samples according to each mixture component.

causal structure discovery, dag, graph, (12 more...)

arXiv.org Machine Learning

2001.1194

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
(2 more...)

Genre: Research Report (0.83)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.68)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.35)

Facebook Ads Monitor: An Independent Auditing System for Political Ads on Facebook

Silva, Márcio, de Oliveira, Lucas Santos, Andreou, Athanasios, de Melo, Pedro Olmo Vaz, Goga, Oana, Benevenuto, Fabrício

arXiv.org Artificial Intelligence, Jan-31-2020

The 2016 United States presidential election was marked by the abuse of targeted advertising on Facebook. Concerned that the same kind of abuse could happen in the 2018 Brazilian elections, we designed and deployed an independent auditing system to monitor political ads on Facebook in Brazil. To do that, we first adapted a browser plugin to gather ads from the timelines of volunteers using Facebook. We convinced more than 2000 volunteers to help our project and install our tool. Then, we used a Convolutional Neural Network (CNN) to detect political Facebook ads using word embeddings. To evaluate our approach, we manually labeled a collection of 10k ads as political or non-political and then provide an in-depth evaluation of the proposed approach for identifying political ads by comparing it with classic supervised machine learning methods. Finally, we deployed a real system that shows the ads identified as related to politics. We noticed that not all political ads we detected were present in the Facebook Ad Library for political ads. Our results emphasize the importance of enforcement mechanisms for declaring political ads and the need for independent auditing platforms.

advertiser, facebook, political ad, (15 more...)

arXiv.org Artificial Intelligence

2001.10581

Country:

Asia > Russia (0.14)
Europe > Germany (0.14)
Asia > South Korea (0.14)
(12 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Media > News (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Law (1.00)
(3 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Early detection of eye disease assisted by Deep Learning

#artificialintelligence, Jan-30-2020, 12:17:21 GMT

The eye disease diabetic retinopathy is the fastest-growing cause of blindness in the world. There are over 100 million people in the world with diabetes, and ideally they would be screened each year for this degenerative eye condition. It is fully treatable if caught early; however, if it is not detected, you could suffer partial or full vision loss. Graders look for scattered hemorrhages and micro-aneurysms to grade the image. This is quite subjective, yet a grade of two (2) means come back in a year, whereas a grade of three (3) means come to the clinic right away.

diabetic retinopathy, early detection, eye disease, (11 more...)

#artificialintelligence

Country: Asia > India (0.06)

Industry:

Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.55)

Better Multi-class Probability Estimates for Small Data Sets

Alasalmi, Tuomo, Suutala, Jaakko, Koskimäki, Heli, Röning, Juha

arXiv.org Machine Learning, Jan-30-2020

Many classification applications require accurate probability estimates in addition to good class separation, but classifiers are often designed with only the latter in mind. Calibration is the process of improving probability estimates by post-processing, but commonly used calibration algorithms work poorly on small data sets and assume the classification task to be binary. Both of these restrictions limit their real-world applicability. The previously introduced Data Generation and Grouping algorithm alleviates the problem posed by small data sets, and in this article we demonstrate that it can also be applied to multi-class problems, which addresses the other limitation. Our experiments show that calibration error can be decreased using the proposed approach and that the additional computational cost is acceptable.
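To make "calibration error" concrete, here is a minimal sketch of one common way to measure it for multi-class predictions: a binned expected calibration error (ECE) on the top-label confidence. The abstract does not say which error measure the authors report, so this choice, the function name `expected_calibration_error`, and the toy data are assumptions. The sketch measures calibration only; it does not implement the Data Generation and Grouping algorithm.

```python
def expected_calibration_error(probabilities, labels, n_bins=10):
    """Binned ECE on the top-label confidence of multi-class predictions.

    probabilities: list of per-class probability lists, one per sample.
    labels: list of true class indices.
    """
    bins = [[] for _ in range(n_bins)]
    for probs, label in zip(probabilities, labels):
        confidence = max(probs)
        predicted = probs.index(confidence)
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(confidence * n_bins), n_bins - 1)
        bins[idx].append((confidence, predicted == label))

    total = len(labels)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_confidence = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, correct in members if correct) / len(members)
        # Weight each bin's confidence/accuracy gap by its share of samples.
        ece += len(members) / total * abs(avg_confidence - accuracy)
    return ece


# Toy predictions for a three-class problem (assumed data).
probabilities = [
    [0.90, 0.05, 0.05],
    [0.90, 0.05, 0.05],
    [0.60, 0.30, 0.10],
    [0.20, 0.20, 0.60],
]
labels = [0, 1, 0, 2]
print(expected_calibration_error(probabilities, labels))
```

A value of 0 would mean that, within every bin, average confidence matches observed accuracy. With small data sets the bins hold few samples, which is one reason such estimates are noisy.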

binary problem, calibration, classifier, (15 more...)

arXiv.org Machine Learning

2001.11242

Country:

Europe > Finland > Northern Ostrobothnia > Oulu (0.05)
Oceania > Australia > Tasmania (0.04)
North America > United States > California > Orange County > Irvine (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.31)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.31)

Fase-AL -- Adaptation of Fast Adaptive Stacking of Ensembles for Supporting Active Learning

Ortiz-Díaz, Agustín Alejandro, Baldo, Fabiano, Mariño, Laura María Palomino, Cabrera, Alberto Verdecia

arXiv.org Artificial Intelligence, Jan-30-2020

Classification algorithms for mining data streams have been extensively studied in recent years. However, many of these algorithms are designed for supervised learning, which requires labeled instances, and labeling data is costly and time-consuming. Because of this, alternative learning paradigms have been proposed to reduce the cost of the labeling process without significant loss of model performance. Active learning is one of these paradigms; its main objective is to build classification models that request the lowest possible number of labeled examples while achieving adequate levels of accuracy. This work presents the FASE-AL algorithm, which induces classification models with non-labeled instances using active learning. FASE-AL is based on the algorithm Fast Adaptive Stacking of Ensembles (FASE). FASE is an ensemble algorithm that detects and adapts the model when the input data stream has concept drift. FASE-AL was compared with four different active learning strategies found in the literature. Real and synthetic databases were used in the experiments. The algorithm achieves promising results in terms of the percentage of correctly classified instances.

algorithm, concept drift, fase, (17 more...)

arXiv.org Artificial Intelligence

2001.11466

Country:

South America > Brazil > Santa Catarina (0.05)
South America > Brazil > Pernambuco > Recife (0.04)
North America > United States > New York (0.04)
(5 more...)

Genre: Research Report (0.64)

Industry: Education > Educational Setting (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Extreme Algorithm Selection With Dyadic Feature Representation

Tornede, Alexander, Wever, Marcel, Hüllermeier, Eyke

arXiv.org Machine Learning, Jan-29-2020

Algorithm selection (AS) deals with selecting an algorithm from a fixed set of candidate algorithms most suitable for a specific instance of an algorithmic problem, e.g., choosing solvers for SAT problems. Benchmark suites for AS usually comprise candidate sets consisting of at most tens of algorithms, whereas in combined algorithm selection and hyperparameter optimization problems the number of candidates becomes intractable, which impedes learning effective meta-models and thus requires costly online performance evaluations. Therefore, here we propose the setting of extreme algorithm selection (XAS), where we consider fixed sets of thousands of candidate algorithms, facilitating meta learning. We assess the applicability of state-of-the-art AS techniques to the XAS setting and propose approaches leveraging a dyadic feature representation in which both problem instances and algorithms are described. We find the latter to improve significantly over the current state of the art in various metrics.

algorithm, candidate algorithm, dataset, (15 more...)

arXiv.org Machine Learning

2001.10741

Country: Europe > Germany (0.04)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

How to Evaluate Your Machine Learning Models with Python Code!

#artificialintelligence, Jan-28-2020, 00:06:19 GMT

You've finally built your machine learning model to predict future prices of Bitcoin so that you can become a multi-billionaire. But how do you know that the model you created is any good? In this article, I'm going to talk about several ways you can evaluate your machine learning model, with code provided! More specifically, I'm going to cover several metrics, starting with R Squared, which measures the proportion of variance in the dependent variable that is explained by the independent variables.
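The article's own code is not reproduced in this excerpt, so the following is a minimal standard-library sketch of R Squared rather than the author's implementation. It uses the usual definition, one minus the ratio of the residual sum of squares to the total sum of squares. The function name `r_squared` and the sample values are illustrative assumptions.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: variation the model fails to explain.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: variation around the mean of the targets.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative values (assumed data).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
print(r_squared(y_true, y_pred))
```

A model that always predicts the mean of the targets scores 0 and a perfect model scores 1. The value can be negative when the model is worse than predicting the mean.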

confusion matrix, independent variable, machine learning model, (12 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.81)

Interpretable Machine Learning Model for Early Prediction of Mortality in Elderly Patients with Multiple Organ Dysfunction Syndrome (MODS): a Multicenter Retrospective Study and Cross Validation

Liu, Xiaoli, Hu, Pan, Mao, Zhi, Kuo, Po-Chih, Li, Peiyao, Liu, Chao, Hu, Jie, Li, Deyu, Cao, Desen, Mark, Roger G., Celi, Leo Anthony, Zhang, Zhengbo, Zhou, Feihu

arXiv.org Machine Learning, Jan-28-2020

Background: Elderly patients with MODS have a high risk of death and poor prognosis. The performance of current scoring systems assessing the severity of MODS and its mortality remains unsatisfactory. This study aims to develop an interpretable and generalizable model for early mortality prediction in elderly patients with MODS. Methods: The MIMIC-III, eICU-CRD and PLAGH-S databases were employed for model generation and evaluation. We used the eXtreme Gradient Boosting model with the SHapley Additive exPlanations method to conduct early and interpretable predictions of patients' hospital outcome. Three types of data source combinations and five typical evaluation indexes were adopted to develop a generalizable model. Findings: The interpretable model, with optimal performance developed by using the MIMIC-III and eICU-CRD datasets, was separately validated on the MIMIC-III, eICU-CRD and PLAGH-S datasets (no overlap with the training set). The performances of the model in predicting hospital mortality as validated by the three datasets were: AUC of 0.858, sensitivity of 0.834 and specificity of 0.705; AUC of 0.849, sensitivity of 0.763 and specificity of 0.784; and AUC of 0.838, sensitivity of 0.882 and specificity of 0.691, respectively. Comparisons of AUC between this model and baseline models with MIMIC-III dataset validation showed superior performance of this model; in addition, comparisons of AUC between this model and commonly used clinical scores showed significantly better performance of this model. Interpretation: The interpretable machine learning model developed in this study using fused datasets with large sample sizes was robust and generalizable. This model outperformed the baseline models and several clinical scores for early prediction of mortality in elderly ICU patients. The interpretative nature of this model provided clinicians with the ranking of mortality risk features.
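For readers unfamiliar with the three reported measures, the sketch below shows how AUC, sensitivity and specificity are typically computed from binary labels and predicted risk scores. It is not the study's evaluation code: the function names, the 0.5 decision threshold and the toy data are assumptions, and the abstract does not state which threshold the authors used.

```python
def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity (true positive rate) and specificity (true negative rate)."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a random positive case is scored above a
    random negative case, with ties counted as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data (assumed): 1 = died in hospital, 0 = survived.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
print(auc(labels, scores), sensitivity_specificity(labels, scores))
```

AUC is independent of any threshold, whereas sensitivity and specificity trade off against each other as the threshold moves.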

interpretable machine learning model, multiple organ dysfunction syndrome, retrospective study and cross validation, (4 more...)

arXiv.org Machine Learning

2001.10977

Genre: Research Report (0.69)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (0.40)

Binary Classification from Positive Data with Skewed Confidence

Shinoda, Kazuhiko, Kaji, Hirotaka, Sugiyama, Masashi

arXiv.org Machine Learning, Jan-28-2020

Positive-confidence (Pconf) classification [Ishida et al., 2018] is a promising weakly-supervised learning method which trains a binary classifier only from positive data equipped with confidence. However, in practice, the confidence may be skewed by bias arising in the annotation process. The Pconf classifier cannot be properly learned with skewed confidence, and consequently the classification performance may deteriorate. In this paper, we introduce a parameterized model of the skewed confidence and propose a method for selecting the hyperparameter that cancels out the negative impact of skewed confidence, under the assumption that the misclassification rate of positive samples is available as prior knowledge. We demonstrate the effectiveness of the proposed method through a synthetic experiment with simple linear models and benchmark problems with neural network models. We also apply our method to drivers' drowsiness prediction to show that it works well on a real-world problem where confidence is obtained from manual annotation.
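As background, the baseline Pconf objective of Ishida et al. (2018) can be sketched as below: each positive sample with confidence r contributes its own loss plus the loss of the opposite label, weighted by (1 - r) / r. This sketch covers the baseline only, with the empirical risk given up to a constant factor. It does not implement the skew model or the hyperparameter selection proposed in this paper, and the logistic loss, function names and sample values are assumptions for illustration.

```python
import math


def logistic_loss(margin):
    """Logistic loss log(1 + exp(-margin)), computed in a stable way."""
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def pconf_empirical_risk(scores, confidences):
    """Baseline Pconf risk over positive samples only.

    scores: classifier outputs g(x) for the positive samples.
    confidences: r(x) = p(y = +1 | x) for each sample, in (0, 1].
    """
    total = 0.0
    for g_x, r in zip(scores, confidences):
        if not 0.0 < r <= 1.0:
            raise ValueError("confidence must lie in (0, 1]")
        # Low-confidence positives also act as weighted negative examples.
        total += logistic_loss(g_x) + (1.0 - r) / r * logistic_loss(-g_x)
    return total / len(scores)


# Illustrative classifier outputs and confidences (assumed data).
scores = [2.0, 0.5, -0.3]
confidences = [0.95, 0.70, 0.40]
print(pconf_empirical_risk(scores, confidences))
```

Because the weight (1 - r) / r grows quickly as r approaches 0, a systematic bias in the annotated confidences changes the objective noticeably, which is consistent with the sensitivity to skew that the abstract describes.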

classification, experiment, pconf classification, (12 more...)

arXiv.org Machine Learning

2001.10642

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.89)