AITopics | Performance Analysis

Performance Analysis

Decision time: Sometimes accuracy is not your friend

#artificialintelligence, Jul-6-2018, 11:56:03 GMT

Machine learning is about machines making decisions and, as we have already discussed, we can produce multiple models for any given problem and measure their accuracy. It seems intuitively obvious that we would choose the most accurate model and, most of the time, we do. But there are times when we will deliberately choose one of the less accurate ones. The underlying reason is that our estimates of accuracy, whilst very useful, take no account of the cost of being right or being wrong. For example, we might be trying to identify which of our customers on a clothing website are women and which are men, so that our recommendation engine makes the appropriate clothing suggestions.

algorithm, artificial intelligence, machine learning, (9 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
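
The cost argument in the "Decision time" summary above can be illustrated with a small sketch. It is not taken from the article: the confusion counts, the two models, and the cost values are hypothetical numbers chosen only to show how a less accurate model can have a lower expected cost once false positives and false negatives are priced differently.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts from a binary classifier evaluated on a held-out set."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.fn + self.tn


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all predictions that were correct."""
    return (c.tp + c.tn) / c.total


def expected_cost(c: ConfusionCounts, cost_fp: float, cost_fn: float) -> float:
    """Average cost per prediction; correct predictions are assumed free."""
    return (c.fp * cost_fp + c.fn * cost_fn) / c.total


# Hypothetical evaluation results for two candidate models.
model_a = ConfusionCounts(tp=40, fp=5, fn=10, tn=45)   # more accurate
model_b = ConfusionCounts(tp=48, fp=18, fn=2, tn=32)   # less accurate

# Assumed costs: a missed positive is ten times worse than a false alarm.
COST_FP = 1.0
COST_FN = 10.0

if __name__ == "__main__":
    for name, counts in (("A", model_a), ("B", model_b)):
        print(name, accuracy(counts), expected_cost(counts, COST_FP, COST_FN))
```

With these assumed costs, model B is the better choice despite its lower accuracy, because it makes far fewer of the expensive errors. Changing the cost ratio can reverse the ranking, which is why the costs need to come from the application rather than from the model.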

Oracle-free Detection of Translation Issue for Neural Machine Translation

Zheng, Wujie, Wang, Wenyu, Liu, Dian, Zhang, Changrong, Zeng, Qinsong, Deng, Yuetang, Yang, Wei, Xie, Tao

arXiv.org Artificial Intelligence, Jul-6-2018

Neural Machine Translation (NMT) has been widely adopted over recent years due to its advantages in various translation tasks. However, NMT systems can be error-prone due to the intractability of natural languages and the design of neural networks, bringing issues to their translations. These issues could potentially lead to information loss, wrong semantics, and low readability in translations, compromising the usefulness of NMT and leading to potential non-trivial consequences. Although there are existing approaches, such as using the BLEU score, for quality assessment and issue detection in NMT, such approaches face two serious limitations. First, such solutions require oracle translations, i.e., reference translations, which are often unavailable, e.g., in production environments. Second, such approaches cannot pinpoint the issue types and locations within translations. To address such limitations, we propose a new approach aiming to precisely detect issues in translations without requiring oracle translations. Our approach focuses on the two most prominent issues in NMT translations by including two detection algorithms. Our experimental results show that our new approach could achieve high effectiveness on real-world datasets. Our successful experience in deploying the proposed algorithms in both the development and production environments of WeChat, a messenger app with over one billion monthly active users, helps eliminate numerous defects of our NMT model, monitor the effectiveness on real-world translation tasks, and collect in-house test cases, producing high industry impact.

machine learning, natural language, translation, (16 more...)

1807.0234

Country:

North America > United States > Illinois (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology > Services (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.30)

How Many Random Seeds? Statistical Power Analysis in Deep Reinforcement Learning Experiments

Colas, Cédric, Sigaud, Olivier, Oudeyer, Pierre-Yves

arXiv.org Machine Learning, Jul-5-2018

Consistently checking the statistical significance of experimental results is one of the mandatory methodological steps to address the so-called "reproducibility crisis" in deep reinforcement learning. In this tutorial paper, we explain how the number of random seeds relates to the probabilities of statistical errors. For both the t-test and the bootstrap confidence interval test, we recall theoretical guidelines to determine the number of random seeds one should use to provide a statistically significant comparison of the performance of two algorithms. Finally, we discuss the influence of deviations from the assumptions usually made by statistical tests. We show that they can lead to inaccurate evaluations of statistical errors and provide guidelines to counter these negative effects. We make our code available to perform the tests.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

1806.08295

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Europe > France > Nouvelle-Aquitaine > Gironde > Bordeaux (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.97)
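
The relationship between the number of random seeds and the probabilities of statistical errors, described in the "How Many Random Seeds?" abstract above, can be sketched with a standard sample-size formula. This is not the authors' code: it uses a normal approximation to the two-sample test with equal variances, and the function name `seeds_needed` and its parameters are hypothetical. A t-test on small samples would need somewhat more seeds than this estimate.

```python
from math import ceil
from statistics import NormalDist


def seeds_needed(effect: float, std: float,
                 alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate seeds per algorithm for a two-sided, two-sample comparison.

    effect: smallest difference in mean performance worth detecting
    std:    standard deviation of performance across seeds (assumed equal
            for both algorithms)
    alpha:  tolerated probability of a false positive (type-I error)
    power:  desired probability of detecting a true effect (1 - type-II error)
    """
    if effect <= 0 or std <= 0:
        raise ValueError("effect and std must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_beta = standard_normal.inv_cdf(power)
    # Normal-approximation sample size for a difference of two means.
    return ceil(2 * ((z_alpha + z_beta) * std / effect) ** 2)


if __name__ == "__main__":
    # Effect equal to one standard deviation, then half a standard deviation.
    print(seeds_needed(effect=1.0, std=1.0))
    print(seeds_needed(effect=0.5, std=1.0))
```

Under these assumptions, halving the effect size roughly quadruples the number of seeds required, which is the practical reason small differences between algorithms are hard to establish with a handful of runs.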

Ensemble learning with Conformal Predictors: Targeting credible predictions of conversion from Mild Cognitive Impairment to Alzheimer's Disease

Pereira, Telma, Cardoso, Sandra, Silva, Dina, Guerreiro, Manuela, de Mendonça, Alexandre, Madeira, Sara C.

arXiv.org Machine Learning, Jul-5-2018

Most machine learning classifiers make accurate predictions for new examples, yet give no indication of how trustworthy those predictions are. In the medical domain, this hampers their integration into decision support systems, which could be useful in clinical practice. We use a supervised learning approach that combines Ensemble learning with Conformal Predictors to predict conversion from Mild Cognitive Impairment to Alzheimer's Disease. Our goal is to enhance the classification performance (Ensemble learning) and complement each prediction with a measure of credibility (Conformal Predictors). Our results showed the superiority of the proposed approach over a similar ensemble framework with standard classifiers.

artificial intelligence, machine learning, prediction, (13 more...)

1807.01619

Country:

Europe > Portugal > Lisbon > Lisbon (0.15)
Europe > Portugal > Faro > Faro (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.49)

Counterfactual Evaluation of Machine Learning Models

#artificialintelligence, Jul-4-2018, 10:46:36 GMT

So I'm sure many of you know Stripe. It's a company that provides a platform for e-commerce. And one of the things that everyone encounters when conducting commerce online is, unsurprisingly, fraud. So before I get into the details of how we address fraud with machine learning, I want to talk a little bit about the fraud life cycle. So what typically happens in fraud is that you have an organized crime ring install malware on point-of-sale devices. For example, there was this famous breach at Target about five years ago. So you can actually go online, to the deep web, and buy credit card numbers that were taken from personal devices, ATMs and so forth. What's kind of surprising and funny is that these criminals who are selling credit card numbers to smaller-time criminals are quite customer-service oriented. So you can say, "I want 12 credit card numbers from Wells Fargo or Citibank. I want credit card numbers that were issued in zip codes 94102 to 94105 and so forth." Some of them are in fact so customer-service oriented that they guarantee that if you are unable to commit fraud with the cards you buy, they'll give you your money back. Let's say five years at Stripe was enough for me and I decided to leave and become a criminal, using all my knowledge.

artificial intelligence, machine learning, transaction, (15 more...)

Industry:

Banking & Finance (1.00)
Law Enforcement & Public Safety > Fraud (0.46)
Information Technology > Services (0.34)
Information Technology > Security & Privacy (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.95)

Learning under selective labels in the presence of expert consistency

De-Arteaga, Maria, Dubrawski, Artur, Chouldechova, Alexandra

arXiv.org Machine Learning, Jul-4-2018

We explore the problem of learning under selective labels in the context of algorithm-assisted decision making. Selective labels is a pervasive selection bias problem that arises when historical decision making blinds us to the true outcome for certain instances. Examples of this are common in many applications, ranging from predicting recidivism using pre-trial release data to diagnosing patients. In this paper we discuss why selective labels often cannot be effectively tackled by standard methods for adjusting for sample selection bias, even if there are no unobservables. We propose a data augmentation approach that can be used either to leverage expert consistency to mitigate the partial blindness that results from selective labels, or to empirically validate whether learning under such a framework may lead to unreliable models prone to systemic discrimination.

artificial intelligence, machine learning, selective label, (17 more...)

1807.00905

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > New Hampshire (0.05)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre: Research Report (0.50)

Industry: Law (0.94)

Technology:

Information Technology > Data Science (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Extracting Actionable Knowledge from Domestic Violence Discourses on Social Media

Subramani, Sudha, O'Connor, Manjula

arXiv.org Machine Learning, Jul-4-2018

Domestic Violence (DV) is considered a major social issue, and there is a strong relationship between DV and public health. Existing research studies have used social media to track and analyse real-world events such as emerging trends, natural disasters, user sentiment, political opinions, and health care. However, less attention has been given to social welfare issues like DV and their impact on public health. Recently, victims of DV have turned to social media platforms to express their feelings in posts and to seek social and emotional support, sympathetic encouragement, compassion, and empathy from the public. However, it is difficult to mine actionable knowledge from large conversational social media datasets because they are high-dimensional, short, noisy, high-volume, and high-velocity. Hence, this paper proposes a novel framework to model and discover the various themes related to DV from the public domain. The proposed framework could provide valuable information to public health researchers, national family health organizations, government, and the public, with data enrichment and consolidation to improve the social welfare of the community. It thus provides actionable knowledge by monitoring and analysing continuous and rich user-generated content.

data mining, information, machine learning, (19 more...)

1807.02391

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
Asia > Middle East > Iran (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(3 more...)

Genre: Research Report (0.64)

Industry:

Law (1.00)
Information Technology > Services (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
(4 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
(4 more...)

Breast Cancer Diagnosis via Classification Algorithms

Entezari, Reihaneh

arXiv.org Machine Learning, Jul-3-2018

In this paper, we analyze the Wisconsin Diagnostic Breast Cancer Data using Machine Learning classification techniques such as SVM, Bayesian Logistic Regression (Variational Approximation), and K-Nearest-Neighbors. We describe each model and compare their performance using different measures. We conclude that SVM has the best performance among the classifiers, competing closely with Bayesian Logistic Regression, which ranks as the second-best method for this dataset.

artificial intelligence, logistic regression, machine learning, (14 more...)

1807.01334

Country:

North America > United States > Wisconsin (0.25)
North America > Canada > Ontario > Toronto (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(6 more...)

Genre: Research Report > New Finding (0.59)

Industry: Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.71)

Big data, small lab – Physics World

#artificialintelligence, Jul-2-2018, 12:46:52 GMT

The Large Hadron Collider at CERN is one of the world's largest scientific instruments. It captures 5 trillion bits of data every second, and the Geneva-based lab employs a dedicated group of experts to manage the flow. In contrast, the instrument shown here – known as a time-stretch quantitative phase imaging microscope – fits on a bench top, and is managed by a team of one. However, it is also capable of capturing an immense amount of data: 0.8 trillion bits per second. These two examples illustrate just how ubiquitous "big data" has become in physics.

data mining, machine learning, machine-learning model, (18 more...)

Country: North America > United States > California > Los Angeles County > Los Angeles (0.15)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.75)
Information Technology > Data Science > Data Mining > Big Data (0.64)

A Unified Approach to Quantifying Algorithmic Unfairness: Measuring Individual & Group Unfairness via Inequality Indices

Speicher, Till, Heidari, Hoda, Grgic-Hlaca, Nina, Gummadi, Krishna P., Singla, Adish, Weller, Adrian, Zafar, Muhammad Bilal

arXiv.org Machine Learning, Jul-2-2018

Discrimination via algorithmic decision making has received considerable attention. Prior work largely focuses on defining conditions for fairness, but does not define satisfactory measures of algorithmic unfairness. In this paper, we focus on the following question: Given two unfair algorithms, how should we determine which of the two is more unfair? Our core idea is to use existing inequality indices from economics to measure how unequally the outcomes of an algorithm benefit different individuals or groups in a population. Our work offers a justified and general framework to compare and contrast the (un)fairness of algorithmic predictors. This unifying approach enables us to quantify unfairness both at the individual and the group level. Further, our work reveals overlooked tradeoffs between different fairness notions: using our proposed measures, the overall individual-level unfairness of an algorithm can be decomposed into a between-group and a within-group component. Earlier methods are typically designed to tackle only between-group unfairness, which may be justified for legal or other reasons. However, we demonstrate that minimizing exclusively the between-group component may, in fact, increase the within-group, and hence the overall unfairness. We characterize and illustrate the tradeoffs between our measures of (un)fairness and the prediction accuracy.

artificial intelligence, data mining, machine learning, (17 more...)

doi: 10.1145/3219819.3220046

1807.00787

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California (0.04)
(4 more...)

Genre: Research Report (0.82)

Industry:

Law (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.49)
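
The decomposition into between-group and within-group components mentioned in the "Unified Approach to Quantifying Algorithmic Unfairness" abstract above can be sketched with one family of inequality indices from economics. The abstract does not say which index or which notion of individual benefit the authors use, so everything specific here is an assumption: the generalized entropy index with alpha = 2, the function names, and the benefit values and group labels in the example are all hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def generalized_entropy(benefits, alpha=2):
    """Generalized entropy index of a list of positive-mean benefits.

    Valid for alpha other than 0 and 1; equals 0 when all benefits are equal.
    """
    n = len(benefits)
    mu = fmean(benefits)
    return sum((b / mu) ** alpha - 1 for b in benefits) / (n * alpha * (alpha - 1))


def decompose(benefits, groups, alpha=2):
    """Split overall inequality into (between_group, within_group) parts.

    Assumes every group has a positive mean benefit.
    """
    n = len(benefits)
    mu = fmean(benefits)
    by_group = defaultdict(list)
    for benefit, group in zip(benefits, groups):
        by_group[group].append(benefit)

    within = 0.0
    group_mean_vector = []
    for members in by_group.values():
        mu_g = fmean(members)
        weight = (len(members) / n) * (mu_g / mu) ** alpha
        within += weight * generalized_entropy(members, alpha)
        # Between-group part: pretend everyone receives their group's mean.
        group_mean_vector.extend([mu_g] * len(members))

    between = generalized_entropy(group_mean_vector, alpha)
    return between, within


# Hypothetical per-individual benefits and group membership.
benefits = [1, 1, 2, 0, 1, 1, 1, 2]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

if __name__ == "__main__":
    between, within = decompose(benefits, groups)
    print(between, within, generalized_entropy(benefits))
```

Because the two components sum to the overall index, an intervention that shrinks only the between-group term says nothing about the within-group term, which is the tradeoff the abstract points to.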
