AITopics

1905.11681

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report > Experimental Study (0.95)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Yates, Darren, Islam, Md Zahidul, Gao, Junbin

DataLearner: A Data Mining and Knowledge Discovery Tool for Android Smartphones and Tablets

arXiv.org Machine LearningJun-9-2019

Smartphones have become the ultimate'personal' computer, yet despite this, general-purpose data mining and knowledge discovery tools for mobile devices are surprisingly rare. DataLearner is a new data mining application designed specifically for Android devices that imports the Weka data mining engine and augments it with algorithms developed by Charles Sturt University. Moreover, DataLearner can be expanded with additional algorithms. Combined, DataLearner delivers 40 classification, clustering and association rule mining algorithms for model training and evaluation without need for cloud computing resources or network connectivity. It provides the same classification accuracy as PCs and laptops, while doing so with acceptable processing speed and consuming negligible battery life. With its ability to provide easy-to-use data mining on a phone-size screen, DataLearner is a new portable, self-contained data mining tool for remote, personalised and educational applications alike. DataLearner features four elements - this paper, the app available on Google Play, the GPL3-licensed source code on GitHub and a short video on YouTube.

artificial intelligence, data mining, machine learning, (18 more...)

1906.03773

Country:

Oceania > New Zealand > North Island > Waikato (0.05)
Oceania > Australia > New South Wales > Sydney (0.04)

Genre: Research Report (0.64)

Industry:

Health & Medicine (0.93)
Information Technology > Software (0.49)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.70)
Information Technology > Data Science > Data Mining > Knowledge Discovery (0.63)
(2 more...)

arXiv.org Machine LearningJun-8-2019

ML-LOO: Detecting Adversarial Examples with Feature Attribution

Yang, Puyudi, Chen, Jianbo, Hsieh, Cho-Jui, Wang, Jane-Ling, Jordan, Michael I.

Deep neural networks obtain state-of-the-art performance on a series of tasks. However, they are easily fooled by adding a small adversarial perturbation to input. The perturbation is often human imperceptible on image data. We observe a significant difference in feature attributions of adversarially crafted examples from those of original ones. Based on this observation, we introduce a new framework to detect adversarial examples through thresholding a scale estimate of feature attribution scores. Furthermore, we extend our method to include multi-layer feature attributions in order to tackle the attacks with mixed confidence levels. Through vast experiments, our method achieves superior performances in distinguishing adversarial examples from popular attack methods on a variety of real data sets among state-of-the-art detection methods. In particular, our method is able to detect adversarial examples of mixed confidence levels, and transfer between different attacking methods.

artificial intelligence, auc, machine learning, (17 more...)

1906.03499

Country: North America > United States > California (0.93)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Tai, Xiao Hui, Frisoli, Kayla

Benchmarking Minimax Linkage

Minimax linkage was first introduced by Ao et al. [3] in 2004, as an alternative to standard linkage methods used in hierarchical clustering. Minimax linkage relies on distances to a prototype for each cluster; this prototype can be thought of as a representative object in the cluster, hence improving the interpretability of clustering results. Bien and Tibshirani analyzed properties of this method in 2011 [2], popularizing the method within the statistics community. Additionally, they performed comparisons of minimax linkage to standard linkage methods, making use of five data sets and two different evaluation metrics (distance to prototype and misclassification rate). In an effort to expand upon their work and evaluate minimax linkage more comprehensively, our benchmark study focuses on thorough method evaluation via multiple performance metrics on several well-described data sets. We also make all code and data publicly available through an R package, for full reproducibility. Similarly to [2], we find that minimax linkage often produces the smallest maximum minimax radius of all linkage methods, meaning that minimax linkage produces clusters where objects in a cluster are tightly clustered around their prototype. This is true across a range of values for the total number of clusters (k). However, this is not always the case, and special attention should be paid to the case when k is the true known value. For true k, minimax linkage does not always perform the best in terms of all the evaluation metrics studied, including maximum minimax radius. This paper was motivated by the IFCS Cluster Benchmarking Task Force's call for clustering benchmark studies and the white paper [5], which put forth guidelines and principles for comprehensive benchmarking in clustering. Our work is designed to be a neutral benchmark study of minimax linkage.

artificial intelligence, linkage, machine learning, (13 more...)

1906.03336

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Missouri (0.04)
North America > United States > Maryland > Montgomery County > Gaithersburg (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.71)

Ensemble Pruning via Margin Maximization

Martinez, Waldyn

Ensemble models refer to methods that combine a typically large number of classifiers into a compound prediction. The output of an ensemble method is the result of fitting a base-learning algorithm to a given data set, and obtaining diverse answers by reweighting the observations or by resampling them using a given probabilistic selection. A key challenge of using ensembles in large-scale multidimensional data lies in the complexity and the computational burden associated with them. The models created by ensembles are often difficult, if not impossible, to interpret and their implementation requires more computational power than single classifiers. Recent research effort in the field has concentrated in reducing ensemble size, while maintaining their predictive accuracy. We propose a method to prune an ensemble solution by optimizing its margin distribution, while increasing its diversity. The proposed algorithm results in an ensemble that uses only a fraction of the original classifiers, with improved or similar generalization performance. We analyze and test our method on both synthetic and real data sets. The simulations show that the proposed method compares favorably to the original ensemble solutions and to other existing ensemble pruning methodologies.

algorithm, classifier, ensemble, (16 more...)

1906.03247

Country:

North America > United States > Wisconsin (0.04)
North America > United States > Ohio > Butler County > Oxford (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.97)
(2 more...)

Sun, Yang-Wen, Papagiannouli, Katerina, Spokoiny, Vladmir

Online Graph-Based Change-Point Detection for High Dimensional Data

Online change-point detection (OCPD) is important for application in various areas such as finance, biology, and the Internet of Things (IoT). However, OCPD faces major challenges due to high-dimensionality, and it is still rarely studied in literature. In this paper, we propose a novel, online, graph-based, change-point detection algorithm to detect change of distribution in low- to high-dimensional data. We introduce a similarity measure, which is derived from the graph-spanning ratio, to test statistically if a change occurs. Through numerical study using artificial online datasets, our data-driven approach demonstrates high detection power for high-dimensional data, while the false alarm rate (type I error) is controlled at a nominal significant level. In particular, our graph-spanning approach has desirable power with small and multiple scanning window, which allows timely detection of change-point in the online setting.

artificial intelligence, detection, machine learning, (16 more...)

1906.03001

Genre: Research Report (0.64)

Industry: Information Technology > Smart Houses & Appliances (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Lift Up and Act! Classifier Performance in Resource-Constrained Applications

Shmueli, Galit

Classification tasks are common across many fields and applications where the decision maker's action is limited by resource constraints. In direct marketing only a subset of customers is contacted; scarce human resources limit the number of interviews to the most promising job candidates; limited donated organs are prioritized to those with best fit. In such scenarios, performance measures such as the classification matrix, ROC analysis, and even ranking metrics such as AUC measures outcomes different from the action of interest. At the same time, gains and lift that do measure the relevant outcome are rarely used by machine learners. In this paper we define resource-constrained classifier performance as a task distinguished from classification and ranking. We explain how gains and lift can lead to different algorithm choices and discuss the effect of class distribution.

artificial intelligence, classifier, machine learning, (18 more...)

1906.03374

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Awasthi, Pranjal, Kleindessner, Matthäus, Morgenstern, Jamie

Effectiveness of Equalized Odds for Fair Classification under Imperfect Group Information

Most approaches for ensuring or improving a model's fairness with respect to a protected attribute (such as race or gender) assume access to the true value of the protected attribute for every data point. In many scenarios, however, perfect knowledge of the protected attribute is unrealistic. In this paper, we ask to what extent fairness interventions can be effective even with imperfect information about the protected attribute. In particular, we study this question in the context of the prominent equalized odds method of Hardt et al. (2016). We claim that as long as the perturbation of the protected attribute is somewhat moderate, one should still run equalized odds if one would run it knowing the true protected attribute: the bias of the classifier that we obtain using the perturbed attribute is smaller than the bias of the original classifier, and its error is not larger than the error of the equalized odds classifier obtained when working with the true protected attribute.

classifier, data mining, machine learning, (18 more...)

1906.03284

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Gurina, Ekaterina, Klyuchnikov, Nikita, Zaytsev, Alexey, Romanenkova, Evgenya, Antipova, Ksenia, Simon, Igor, Makarov, Victor, Koroteev, Dmitry

Failures detection at directional drilling using real-time analogues search

arXiv.org Machine LearningJun-6-2019

One of the main challenges in the construction of oil and gas wells is the need to detect and avoid abnormal situations, which can lead to accidents. Accidents have some indicators that help to find them during the drilling process. In this article, we present a data-driven model trained on historical data from drilling accidents that can detect different types of accidents using real-time signals. The results show that using the time-series comparison, based on aggregated statistics and gradient boosting classification, it is possible to detect an anomaly and identify its type by comparing current measurements while drilling with the stored ones from the database of accidents.

accident, artificial intelligence, upstream oil & gas, (15 more...)

1906.02667

Country: Europe > Russia (0.28)

Genre: Research Report (0.70)

Industry: Energy > Oil & Gas > Upstream (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.95)
Information Technology > Security & Privacy (0.93)
(3 more...)

arXiv.org Machine LearningJun-6-2019

Globally-Aware Multiple Instance Classifier for Breast Cancer Screening

Shen, Yiqiu, Wu, Nan, Phang, Jason, Park, Jungkyu, Kim, Gene, Moy, Linda, Cho, Kyunghyun, Geras, Krzysztof J.

Deep learning models designed for visual classification tasks on natural images have become prevalent in medical image analysis. However, medical images differ from typical natural images in many ways, such as significantly higher resolutions and smaller regions of interest. Moreover, both the global structure and local details play important roles in medical image analysis tasks. To address these unique properties of medical images, we propose a neural network that is able to classify breast cancer lesions utilizing information from both a global saliency map and multiple local patches. The proposed model outperforms the ResNet-based baseline and achieves radiologist-level performance in the interpretation of screening mammography. Although our model is trained only with image-level labels, it is able to generate pixel-level saliency maps that provide localization of possible malignant findings.

artificial intelligence, machine learning, prediction, (19 more...)

1906.02846

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)