AITopics

2005.13037

Country:

North America > United States > California > San Diego County > San Diego (0.06)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)

Hand, David J., Christen, Peter, Kirielle, Nishadi

F*: An Interpretable Transformation of the F-measure

arXiv.org Artificial Intelligence, Jul-31-2020

The F-measure is widely used to assess the performance of classification algorithms. However, some researchers find it lacking in intuitive interpretation, questioning the appropriateness of combining two aspects of performance as conceptually distinct as precision and recall, and also questioning whether the harmonic mean is the best way to combine them. To ease this concern, we describe a simple transformation of the F-measure, which we call F* (F-star), which has an immediate practical interpretation.
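
The abstract does not spell out the transformation. As a minimal sketch, assuming the definition F* = F / (2 - F) from the paper, F* can be computed from confusion-matrix counts and checked against TP / (TP + FP + FN), the proportion of cases that are either truly positive or predicted positive and are classified correctly. The function names and the example counts are illustrative, not taken from the source.

```python
def f_measure(tp, fp, fn):
    """F-measure (harmonic mean of precision and recall) from raw counts."""
    return 2 * tp / (2 * tp + fp + fn)


def f_star(tp, fp, fn):
    """Assumed transformation F* = F / (2 - F)."""
    f = f_measure(tp, fp, fn)
    return f / (2 - f)


def relevant_correct_fraction(tp, fp, fn):
    """TP / (TP + FP + FN): correct share of the 'relevant' cases."""
    return tp / (tp + fp + fn)


# Hypothetical counts, chosen only for illustration.
tp, fp, fn = 60, 20, 20
print(f_measure(tp, fp, fn))
print(f_star(tp, fp, fn))
print(relevant_correct_fraction(tp, fp, fn))
```

Under this definition the last two printed values agree, which is what gives F* its direct interpretation. The sketch does not guard against the degenerate case where all three counts are zero.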

artificial intelligence, f-measure, machine learning, (16 more...)

2008.00103

Country:

Oceania > Australia > Australian Capital Territory > Canberra (0.05)
Europe > United Kingdom > England > Greater London > London (0.05)

Genre: Research Report (0.65)

Industry: Health & Medicine (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

arXiv.org Artificial Intelligence, Jul-30-2020

New approach to MPI program execution time prediction

Chupakhin, A., Kolosov, A., Smeliansky, R., Antonenko, V., Ishelev, G.

The problem of predicting the execution time of MPI programs on a given set of computer installations is considered. This problem arises when orchestrating and provisioning a virtual infrastructure in a cloud computing environment over a heterogeneous network of computer installations: supercomputers or clusters of servers (e.g. mini data centers). One of the key criteria for the effectiveness of the cloud computing environment is the time a program spends inside the environment. This time consists of the waiting time in the queue and the execution time on the selected physical computer installation, to which the computational resource of the virtual infrastructure is dynamically mapped. One component of this problem is estimating the execution time of MPI programs on a given set of computer installations, which is necessary to choose the order and place of program execution properly. The article proposes two new approaches to the program execution time prediction problem. The first groups computer installations using the Pearson correlation coefficient. The second is based on vector representations of computer installations and MPI programs, so-called embeddings. The embedding technique is actively used in recommendation systems, for example for goods (Amazon), articles (arXiv.org), and videos (YouTube, Netflix). The article shows how the embedding technique helps to predict the execution time of an MPI program on a given set of computer installations.
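
The abstract does not describe how the Pearson-based grouping is carried out. The following is a rough sketch of one possible reading, assuming each installation is described by the run times of the same benchmark programs and that installations are greedily grouped when their run-time vectors correlate above a threshold. The installation names, timings, threshold, and greedy strategy are all hypothetical and are not the authors' method.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def group_installations(times, threshold=0.95):
    """Greedy grouping: join the first group whose representative correlates
    with this installation at or above the threshold, else start a new group."""
    groups = []
    for name, vector in times.items():
        for group in groups:
            if pearson(times[group[0]], vector) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Hypothetical run times (seconds) of four programs on three installations.
times = {
    "cluster_a": [10.0, 20.0, 30.0, 40.0],
    "cluster_b": [11.0, 19.0, 32.0, 41.0],
    "cluster_c": [40.0, 10.0, 35.0, 12.0],
}
print(group_installations(times))
```

Within a group, a program's run time on one installation could then serve as a predictor for the others. The sketch assumes no run-time vector is constant, since a constant vector makes the correlation undefined.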

artificial intelligence, cloud computing, machine learning, (15 more...)

2007.15338

Country:

Asia > Russia (0.14)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.05)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Media (0.68)
Information Technology > Services (0.68)

Technology:

Information Technology > Communications (1.00)
Information Technology > Cloud Computing (1.00)
Information Technology > Architecture (0.93)
(3 more...)

Ayed, Fadhel, Stella, Lorenzo, Januschowski, Tim, Gasthaus, Jan

Anomaly Detection at Scale: The Case for Deep Distributional Time Series Models

This paper introduces a new methodology for detecting anomalies in time series data, with a primary application to monitoring the health of (micro-) services and cloud resources. The main novelty in our approach is that instead of modeling time series consisting of real values or vectors of real values, we model time series of probability distributions over real values (or vectors). This extension to time series of probability distributions allows the technique to be applied to the common scenario where the data is generated by requests coming in to a service, which is then aggregated at a fixed temporal frequency. Our method is amenable to streaming anomaly detection and scales to monitoring for anomalies on millions of time series. We show the superior accuracy of our method on synthetic and public real-world data. On the Yahoo Webscope data set, we outperform the state of the art in 3 out of 4 data sets and we show that we outperform popular open-source anomaly detection tools by up to 17% average improvement for a real-world data set.

anomaly, series, time series, (16 more...)

2007.15541

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
Europe > Germany > Berlin (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)

Practical Detection of Trojan Neural Networks: Data-Limited and Data-Free Cases

Wang, Ren, Zhang, Gaoyuan, Liu, Sijia, Chen, Pin-Yu, Xiong, Jinjun, Wang, Meng

When the training data are maliciously tampered with, the predictions of the resulting deep neural network (DNN) can be manipulated by an adversary, in what is known as a Trojan attack (or poisoning backdoor attack). The lack of robustness of DNNs against Trojan attacks could significantly harm real-life machine learning (ML) systems in downstream applications, raising widespread concern about their trustworthiness. In this paper, we study the problem of Trojan network (TrojanNet) detection in the data-scarce regime, where only the weights of a trained DNN are accessed by the detector. We first propose a data-limited TrojanNet detector (TND) for the case where only a few data samples are available for TrojanNet detection. We show that an effective data-limited TND can be established by exploring connections between Trojan attacks and prediction-evasion adversarial attacks, including per-sample attacks as well as all-sample universal attacks. In addition, we propose a data-free TND, which can detect a TrojanNet without accessing any data samples. We show that such a TND can be built by leveraging the internal response of hidden neurons, which exhibits the Trojan behavior even at random noise inputs. The effectiveness of our proposals is evaluated by extensive experiments under different model architectures and datasets including CIFAR-10, GTSRB, and ImageNet.

artificial intelligence, machine learning, perturbation, (18 more...)

2007.15802

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Nepal (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Kocanaogullari, Aziz, Akcakaya, Murat, Erdogmus, Deniz

Stopping Criterion Design for Recursive Bayesian Classification: Analysis and Decision Geometry

Systems that are based on recursive Bayesian updates for classification limit the cost of evidence collection through certain stopping/termination criteria and accordingly enforce decision making. Conventionally, two termination criteria are used, based on pre-defined thresholds over (i) the maximum of the state posterior distribution and (ii) the state posterior uncertainty. In this paper, we propose a geometric interpretation of the state posterior progression and accordingly provide a point-by-point analysis of the disadvantages of using such conventional termination criteria. For example, through the proposed geometric interpretation we show that confidence thresholds defined over the maximum of the state posteriors suffer from stiffness that results in unnecessary evidence collection, whereas uncertainty-based thresholding methods are fragile to the number of categories and terminate prematurely if some state candidates are already discovered to be unfavorable. Moreover, both types of termination methods neglect the evolution of posterior updates. We then propose a new stopping/termination criterion with a geometrical insight to overcome the limitations of these conventional methods and provide a comparison in terms of decision accuracy and speed. We validate our claims using simulations and real experimental data obtained through a brain-computer interface typing system.
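
The two conventional criteria named in the abstract can be sketched as follows. This is only an illustration, assuming a discrete state space, Shannon entropy as the uncertainty measure, and hypothetical threshold values and likelihoods. It does not implement the new criterion the paper proposes.

```python
from math import log


def bayes_update(prior, likelihoods):
    """One recursive Bayesian step: posterior is proportional to prior * likelihood."""
    unnormalized = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def entropy(distribution):
    """Shannon entropy in nats (assumed uncertainty measure)."""
    return -sum(q * log(q) for q in distribution if q > 0)


def classify(prior, evidence, conf_threshold=0.9, entropy_threshold=0.5):
    """Collect evidence until either conventional stopping criterion fires."""
    posterior = list(prior)
    for step, likelihoods in enumerate(evidence, start=1):
        posterior = bayes_update(posterior, likelihoods)
        if max(posterior) >= conf_threshold:      # criterion (i)
            return step, "confidence", posterior
        if entropy(posterior) <= entropy_threshold:  # criterion (ii)
            return step, "uncertainty", posterior
    return len(evidence), "exhausted", posterior


# Hypothetical example: three states, identical evidence at every step.
prior = [1 / 3, 1 / 3, 1 / 3]
evidence = [[0.6, 0.3, 0.1]] * 10
step, reason, posterior = classify(prior, evidence)
print(step, reason, [round(p, 3) for p in posterior])
```

With these illustrative numbers the uncertainty threshold fires before the confidence threshold is reached, because one candidate state has already become very unlikely. Which criterion fires first depends entirely on the chosen thresholds.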

artificial intelligence, data mining, machine learning, (21 more...)

2007.15568

Country:

Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(6 more...)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)
(2 more...)

Label-Leaks: Membership Inference Attack with Label

Li, Zheng, Zhang, Yang

Machine learning (ML) has made tremendous progress during the past decade and ML models have been deployed in many real-world applications. However, recent research has shown that ML models are vulnerable to attacks against their underlying training data. One major attack in this field is membership inference, the goal of which is to determine whether a data sample is part of the training set of a target machine learning model. So far, most membership inference attacks against ML classifiers leverage the posteriors returned by the target model as their input. However, empirical results show that these attacks can be easily mitigated if the target model only returns the predicted label instead of posteriors. In this paper, we perform a systematic investigation of membership inference attacks when the target model only provides the predicted label. We name our attack the label-only membership inference attack. We focus on two adversarial settings and propose different attacks, namely a transfer-based attack and a perturbation-based attack. The transfer-based attack follows the intuition that if a locally established shadow model is similar enough to the target model, then the adversary can leverage the shadow model's information to predict a target sample's membership. The perturbation-based attack relies on adversarial perturbation techniques to modify the target sample to a different class and uses the magnitude of the perturbation to judge whether it is a member or not. This is based on the intuition that a member sample is harder to perturb to a different class than a non-member sample. Extensive experiments over 6 different datasets demonstrate that both of our attacks achieve strong performance. This further demonstrates the severity of membership privacy risks of machine learning models.

artificial intelligence, machine learning, target model, (12 more...)

2007.15528

Country:

North America > Canada > Ontario > Toronto (0.04)
Asia (0.04)

Genre: Research Report > New Finding (0.48)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)

Sawada, Azusa, Kaneko, Eiji, Sagi, Kazutoshi

Trade-offs in Top-k Classification Accuracies on Losses for Deep Learning

This paper presents an experimental analysis of trade-offs in top-k classification accuracies across losses for deep learning and proposes a novel top-k loss. The commonly used cross entropy (CE) is not guaranteed to optimize top-k prediction without infinite training data and model complexity. The objective is to clarify when CE sacrifices top-k accuracy to optimize top-1 prediction, and to design a loss that improves top-k accuracy under such conditions. Our novel loss is basically CE modified by grouping temporal top-k classes as a single class. To obtain a robust decision boundary, we introduce an adaptive transition from normal CE to our loss, and thus call it top-k transition loss. Our experiments demonstrate that CE is not always the best choice for learning top-k prediction. First, we explore trade-offs between top-1 and top-k (=2) accuracies on synthetic datasets, and find that CE fails to optimize top-k prediction when the data distribution is too complex for a given model to represent the optimal top-1 prediction. Second, we compare top-k accuracies on the CIFAR-100 dataset, targeting top-5 prediction in deep learning. While CE performs best in top-1 accuracy, in top-5 accuracy our loss performs better than CE except in one experimental setup. Moreover, our loss has been found to provide better top-k accuracies than CE for k larger than 10. As a result, a ResNet18 model trained with our loss reaches 99% accuracy with k=25 candidates, which is 8 fewer candidates than CE requires.
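
For readers unfamiliar with the metric, top-k accuracy counts a prediction as correct when the true class is among the k highest-scoring classes. The snippet below is a minimal sketch of that metric only, with made-up scores and labels. It does not implement the top-k transition loss described in the abstract.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k top-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the best k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Hypothetical per-class scores for four samples over three classes.
scores = [
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],
    [0.2, 0.3, 0.5],
    [0.1, 0.6, 0.3],
]
labels = [0, 0, 1, 2]

print(top_k_accuracy(scores, labels, k=1))
print(top_k_accuracy(scores, labels, k=2))
```

In this toy data the top-2 accuracy is much higher than the top-1 accuracy, which shows how the two metrics can diverge for the same model outputs.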

accuracy, artificial intelligence, machine learning, (19 more...)

2007.15359

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > Vietnam > Long An Province > Tân An (0.04)
Asia > Japan (0.04)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.60)

#artificialintelligence, Jul-29-2020, 13:05:59 GMT

Fraud prediction; a challenge for machine learning algorithms

Fraud is a billion-dollar business that expands rapidly year by year, and thousands of people fall victim to it. Fraud always includes a false statement, misrepresentation, or deceitful conduct. Common varieties of fraud offenses include identity theft, insurance fraud, credit/debit card fraud, and mail fraud. The PwC global economic crime survey of 2018 (PwC, 2018) found that about half of the 7,200 surveyed enterprises had already experienced fraud of some kind. This is an increase compared to the PwC survey conducted in 2016 (PwC, 2016), in which slightly more than a third of organizations surveyed had experienced economic crime.

algorithm, artificial intelligence, machine learning, (15 more...)

Industry:

Law Enforcement & Public Safety > Fraud (1.00)
Information Technology (1.00)
Banking & Finance (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)

The Independent - Tech, Jul-29-2020, 04:32:00 GMT

Face masks frustrating facial recognition technology, US agency says

A new study has found that the masks which protect people from spreading the coronavirus also have a second use: breaking facial recognition algorithms. Researchers from the National Institute of Standards and Technology have found that the best facial recognition algorithms had significantly higher error rates when trying to identify someone wearing a cloth covering. The researchers tested one-to-one matching algorithms, where a photo is compared to a different photo of the same person. This verification method is commonly used to unlock smartphones or check passports. The researchers drew digital masks onto the faces in a trove of border crossing photographs, and then compared those photos against another database of unmasked people seeking visas and other immigration benefits.

algorithm, artificial intelligence, machine learning, (9 more...)

Country:

North America > United States (1.00)
Oceania > Australia (0.07)
Asia > China > Hong Kong (0.05)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.71)