AITopics | Performance Analysis

Performance Analysis

Training wide residual networks for deployment using a single bit for each weight

arXiv.org Machine Learning, Feb-23-2018

For fast and energy-efficient deployment of trained deep neural networks on resource-constrained embedded hardware, each learned weight parameter should ideally be represented and stored using a single bit. Error-rates usually increase when this requirement is imposed. Here, we report large improvements in error rates on multiple datasets, for deep convolutional neural networks deployed with 1-bit-per-weight. Using wide residual networks as our main baseline, our approach simplifies existing methods that binarize weights by applying the sign function in training; we apply scaling factors for each layer with constant unlearned values equal to the layer-specific standard deviations used for initialization. For CIFAR-10, CIFAR-100 and ImageNet, and models with 1-bit-per-weight requiring less than 10 MB of parameter memory, we achieve error rates of 3.9%, 18.5% and 26.0% / 8.5% (Top-1 / Top-5) respectively. We also considered MNIST, SVHN and ImageNet32, achieving 1-bit-per-weight test results of 0.27%, 1.9%, and 41.3% / 19.1% respectively. For CIFAR, our error rates halve previously reported values, and are within about 1% of our error-rates for the same network with full-precision weights. For networks that overfit, we also show significant improvements in error rate by not learning batch normalization scale and offset parameters. This applies to both full precision and 1-bit-per-weight networks. Using a warm-restart learning-rate schedule, we found that training for 1-bit-per-weight is just as fast as full-precision networks, with better accuracy than standard schedules, and achieved about 98%-99% of peak performance in just 62 training epochs for CIFAR-10/100. For full training code and trained models in MATLAB, Keras and PyTorch see https://github.com/McDonnell-Lab/1-bit-per-weight/ .
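
The weight treatment described in this abstract (the sign function plus a constant, unlearned per-layer scale) can be sketched as follows. This is a minimal illustration, not the authors' released code: the helper names `he_std` and `binarize_layer` are hypothetical, and He-style initialization (standard deviation sqrt(2 / fan_in)) is an assumption, since the abstract only says the scale equals the layer-specific standard deviation used for initialization.

```python
import math

def he_std(fan_in):
    # Assumed initialization std (He-style); the abstract does not name the scheme.
    return math.sqrt(2.0 / fan_in)

def binarize_layer(weights, fan_in):
    """Map each weight to +scale or -scale, so one bit per weight is enough."""
    scale = he_std(fan_in)  # constant for the layer, never learned
    # A weight of exactly 0 is mapped to +scale so every weight fits in one bit.
    return [scale if w >= 0 else -scale for w in weights]

full_precision = [0.31, -0.02, 0.0, -1.7]
deployed = binarize_layer(full_precision, fan_in=8)
print(deployed)  # [0.5, -0.5, 0.5, -0.5]
```

Only the signs need to be stored per weight; the single scale value is shared by the whole layer.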

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Machine Learning

1802.0853

Country:

Oceania > Australia (0.28)
North America > United States (0.28)

Genre: Research Report (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Training Confidence-calibrated Classifiers for Detecting Out-of-Distribution Samples

Lee, Kimin, Lee, Honglak, Lee, Kibok, Shin, Jinwoo

arXiv.org Machine Learning, Feb-23-2018

The problem of detecting whether a test sample comes from the in-distribution (i.e., the distribution a classifier was trained on) or from an out-of-distribution sufficiently different from it arises in many real-world machine learning applications. However, state-of-the-art deep neural networks are known to be highly overconfident in their predictions, i.e., they do not distinguish in-distribution from out-of-distribution samples. Recently, to handle this issue, several threshold-based detectors have been proposed for pre-trained neural classifiers. However, the performance of these prior works depends heavily on how the classifiers are trained, since they focus only on improving inference procedures. In this paper, we develop a novel training method for classifiers so that such inference algorithms can work better. In particular, we suggest adding two terms to the original loss (e.g., cross entropy). The first forces the classifier to be less confident on out-of-distribution samples, and the second (implicitly) generates the most effective training samples for the first. In essence, our method jointly trains both classification and generative neural networks for out-of-distribution detection. We demonstrate its effectiveness using deep convolutional neural networks on various popular image datasets.

artificial intelligence, classifier, machine learning, (18 more...)

arXiv.org Machine Learning

1711.09325

Country: North America > United States > Michigan (0.28)

Genre: Research Report (0.50)

Industry:

Education (0.48)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

High-Dimensional Vector Semantics

Andrecut, M.

arXiv.org Artificial Intelligence, Feb-23-2018

In many natural language processing tasks the words and the documents are represented using the "bag of words" model. In such a model, a document is represented by a high-dimensional vector, with the components corresponding to the frequency of a particular word in the document (for a detailed discussion see [1-3] and the references within). For example, assuming an English vocabulary of 25,000 words, each document will be represented by a 25,000-dimensional vector, where the component i is the frequency of the ith word in the document. The vector representation is particularly useful in text classification tasks, where the similarity of two documents can be simply estimated using the dot product between the vectors. If the vectors are normalized, then their dot product is equal to the cosine of the angle between the vectors, and therefore the more parallel the vectors are, the more similar the documents are.
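
The bag-of-words representation and the cosine similarity mentioned in this abstract can be illustrated with a small sketch. The five-word vocabulary, the whitespace tokenization, and the function names `bag_of_words` and `cosine_similarity` are assumptions chosen for illustration; they do not come from the paper.

```python
import math
from collections import Counter

def bag_of_words(text, vocabulary):
    """Component i is the frequency of the i-th vocabulary word in the text."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

def cosine_similarity(u, v):
    """Dot product of the two vectors after normalizing each to unit length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_u * norm_v)

vocabulary = ["cat", "dog", "sat", "mat", "ran"]  # toy vocabulary (assumption)
doc_a = bag_of_words("cat sat mat cat", vocabulary)
doc_b = bag_of_words("cat sat mat", vocabulary)
doc_c = bag_of_words("dog ran", vocabulary)

print(cosine_similarity(doc_a, doc_b))  # close to 1: nearly parallel vectors
print(cosine_similarity(doc_a, doc_c))  # 0.0: no shared words
```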

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

doi: 10.1142/S0129183118500158

1802.09914

Country:

Europe (0.68)
North America > Canada (0.46)
North America > United States (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.48)

Vote-boosting ensembles

Sabzevari, Maryam, Martínez-Muñoz, Gonzalo, Suárez, Alberto

arXiv.org Machine Learning, Feb-21-2018

Vote-boosting is a sequential ensemble learning method in which the individual classifiers are built on different weighted versions of the training data. To build a new classifier, the weight of each training instance is determined in terms of the degree of disagreement among the current ensemble predictions for that instance. For low class-label noise levels, especially when simple base learners are used, emphasis should be placed on instances for which the disagreement rate is high. When more flexible classifiers are used and as the noise level increases, the emphasis on these uncertain instances should be reduced. In fact, at sufficiently high levels of class-label noise, the focus should be on instances on which the ensemble classifiers agree. The optimal type of emphasis can be automatically determined using cross-validation. An extensive empirical analysis using the beta distribution as emphasis function illustrates that vote-boosting is an effective method to generate ensembles that are both accurate and robust.
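
As a rough sketch of how a beta-distribution emphasis function could turn ensemble disagreement into instance weights, consider the binary case below. The details are assumptions made for illustration and are not taken from the paper: `vote_fractions` holds the fraction of current ensemble members voting for one of two classes, the beta density is symmetric (a = b), and fractions are clipped by a hypothetical `eps` to avoid infinite densities at 0 and 1.

```python
import math

def beta_pdf(x, a, b):
    """Density of the Beta(a, b) distribution at x, for 0 < x < 1."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * x ** (a - 1) * (1 - x) ** (b - 1)

def instance_weights(vote_fractions, a, b, eps=1e-3):
    """Normalized training weights from per-instance vote fractions."""
    clipped = [min(max(p, eps), 1 - eps) for p in vote_fractions]
    raw = [beta_pdf(p, a, b) for p in clipped]
    total = sum(raw)
    return [r / total for r in raw]

# 0.5 = maximal disagreement, 1.0 = unanimous ensemble
votes = [0.5, 0.9, 1.0]
disagreement_focus = instance_weights(votes, a=2.0, b=2.0)  # peak at 0.5
agreement_focus = instance_weights(votes, a=0.5, b=0.5)     # peaks near 0 and 1
print(disagreement_focus)
print(agreement_focus)
```

With a = b > 1 the weights concentrate on instances where the ensemble is split; with a = b < 1 they concentrate on instances where it agrees, which matches the two regimes the abstract contrasts.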

artificial intelligence, classifier, inductive learning, (19 more...)

arXiv.org Machine Learning

1606.09458

Country:

North America > United States (0.46)
Europe > Spain (0.14)
Europe > Germany (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Materials > Chemicals > Industrial Gases > Liquified Gas (1.00)
Materials > Chemicals > Commodity Chemicals > Petrochemicals > LNG (1.00)
Energy > Oil & Gas > Midstream (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.67)

Adversarial classification: An adversarial risk analysis approach

Naveiro, Roi, Redondo, Alberto, Insua, David Ríos, Ruggeri, Fabrizio

arXiv.org Machine Learning, Feb-21-2018

Classification is one of the most widely used instances of supervised learning, with applications in numerous fields including spam detection, Fan et al. (2016); computer vision, Chen (2015); and genomics, Zhou et al. (2005). In recent years, the field has experienced enormous growth, becoming a major research area in statistics and machine learning, Efron and Hastie (2016). Most efforts in classification have focused on obtaining more accurate algorithms which, however, largely ignore a relevant issue in many applications: the presence of adversaries who actively manipulate the data to fool the classifier so as to attain a benefit. As an example, when a spammer makes the classifier think that a spam message is legitimate, he may profit by selling the information he gets from the victim. In such contexts, as classification algorithms improve, adversaries usually become smarter when making attacks.

artificial intelligence, machine learning, spam filtering, (18 more...)

arXiv.org Machine Learning

1802.07513

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy > Spam Filtering (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Cross-Modality Synthesis from CT to PET using FCN and GAN Networks for Improved Automated Lesion Detection

Ben-Cohen, Avi, Klang, Eyal, Raskin, Stephen P., Soffer, Shelly, Ben-Haim, Simona, Konen, Eli, Amitai, Michal Marianne, Greenspan, Hayit

arXiv.org Artificial Intelligence, Feb-21-2018

In this work we present a novel system for generation of virtual PET images using CT scans. We combine a fully convolutional network (FCN) with a conditional generative adversarial network (GAN) to generate simulated PET data from given input CT data. The synthesized PET can be used for false-positive reduction in lesion detection solutions. Clinically, such solutions may enable lesion detection and drug treatment evaluation in a CT-only environment, thus reducing the need for the more expensive and radioactive PET/CT scan. Our dataset includes 60 PET/CT scans from Sheba Medical Center. We used 23 scans for training and 37 for testing. Different schemes to achieve the synthesized output were qualitatively compared. Quantitative evaluation was conducted using existing lesion detection software, combining the synthesized PET as a false-positive reduction layer for the detection of malignant lesions in the liver. Current results look promising, showing a 28% reduction in the average number of false positives per case, from 2.9 to 2.1. The suggested solution is comprehensive and can be expanded to additional body organs and different modalities.
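
The reported 28% figure follows from the two per-case averages quoted in the abstract. A quick check of the arithmetic (the function name `relative_reduction` is just an illustrative helper):

```python
def relative_reduction(before, after):
    """Fraction by which `after` is lower than `before`."""
    return (before - after) / before

reduction = relative_reduction(2.9, 2.1)
print(f"{reduction:.1%}")  # 27.6%, which rounds to the reported 28%
```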

artificial intelligence, lesion, machine learning, (17 more...)

arXiv.org Artificial Intelligence

1802.07846

Country: Asia > Middle East > Israel (0.15)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Addiction Disorder (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

WWE Elimination Chamber 2018: Predictions, Matches For Final 'Raw' PPV Before WrestleMania 34

International Business Times, Feb-20-2018, 15:57:41 GMT

When WWE Elimination Chamber 2018 takes place Sunday night in Las Vegas, it'll be the final "Monday Night Raw" pay-per-view before WrestleMania 34. Seven of the best wrestlers in the entire company will compete to determine the main event of the year's biggest show, and Ronda Rousey will make an appearance to officially become a member of the WWE roster. Below are predictions for every match on the WWE Elimination Chamber card, though more matches could be added before the PPV begins.

Men's Elimination Chamber Match (Winner to face Brock Lesnar for the Universal Title at WrestleMania)

We've known for nearly a year that Roman Reigns would challenge Lesnar for the title in the WrestleMania 34 main event, and this is how the Shield member is going to get his opportunity. The real question is how will everyone else be eliminated?

artificial intelligence, machine learning, prediction, (12 more...)

International Business Times

Country: North America > United States > Nevada > Clark County > Las Vegas (0.26)

Industry: Leisure & Entertainment > Sports > Martial Arts (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.62)

Learning to Abstain via Curve Optimization

Alexandari, Amr, Shrikumar, Avanti, Kundaje, Anshul

arXiv.org Machine Learning, Feb-20-2018

In practical applications of machine learning, it is often desirable to identify and abstain on examples where a model's predictions are likely to be incorrect. We consider the problem of selecting a budget-constrained subset of test examples to abstain on, with the goal of maximizing performance on the remaining examples. We develop a novel approach to this problem by analytically optimizing the expected marginal improvement in a desired performance metric, such as the area under the ROC curve or Precision-Recall curve. We compare our approach to other abstention techniques for deep learning models based on posterior probability and uncertainty estimates obtained using test-time dropout. On various tasks in computer vision, natural language processing, and bioinformatics, we demonstrate the consistent effectiveness of our approach over other techniques. We also introduce novel diagnostics based on influence functions to understand the behavior of abstention methods in the presence of noisy training data, and leverage the insights to propose a new influence-based abstention method.
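
To make the budget-constrained setting concrete, here is a minimal sketch of the simplest kind of baseline the abstract mentions, abstention based on posterior probability. It is not the curve-optimization method the paper proposes. The toy data, the 0.5 decision threshold, the use of accuracy as the metric, and the function names are all assumptions.

```python
def abstain_lowest_confidence(probabilities, budget):
    """Indices of the `budget` examples whose probability is closest to 0.5."""
    confidence = [abs(p - 0.5) for p in probabilities]
    ranked = sorted(range(len(probabilities)), key=lambda i: confidence[i])
    return set(ranked[:budget])

def accuracy(probabilities, labels, skip=frozenset()):
    """Accuracy at threshold 0.5 over the examples not in `skip`."""
    kept = [i for i in range(len(labels)) if i not in skip]
    correct = sum((probabilities[i] >= 0.5) == bool(labels[i]) for i in kept)
    return correct / len(kept)

probs = [0.95, 0.55, 0.10, 0.48, 0.80, 0.30]  # toy predicted P(y = 1)
labels = [1, 0, 0, 1, 1, 0]                    # toy ground truth

skipped = abstain_lowest_confidence(probs, budget=2)
print(accuracy(probs, labels))           # accuracy on all six examples
print(accuracy(probs, labels, skipped))  # accuracy after abstaining on two
```

In this toy case the two least confident examples are exactly the two misclassified ones, so abstaining on them raises accuracy on the remainder; real data will rarely be this clean.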

abstention, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

1802.07024

Country: North America > United States (0.46)

Genre: Research Report > Promising Solution (0.48)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Post Selection Inference with Incomplete Maximum Mean Discrepancy Estimator

Yamada, Makoto, Wu, Denny, Tsai, Yao-Hung Hubert, Takeuchi, Ichiro, Salakhutdinov, Ruslan, Fukumizu, Kenji

arXiv.org Machine Learning, Feb-17-2018

Measuring divergence between two distributions is essential in machine learning and statistics and has various applications including binary classification, change point detection, and two-sample testing. Furthermore, in the era of big data, designing a divergence measure that is interpretable and can handle high-dimensional and complex data becomes extremely important. In the paper, we propose a post selection inference (PSI) framework for divergence measures, which can select a set of statistically significant features that discriminate two distributions. Specifically, we employ an additive variant of maximum mean discrepancy (MMD) for features and introduce a general hypothesis test for PSI. A novel MMD estimator using incomplete U-statistics, which has an asymptotically Normal distribution (under mild assumptions) and gives high detection power in PSI, is also proposed and analyzed theoretically. Through synthetic and real-world feature selection experiments, we show that the proposed framework can successfully detect statistically significant features. Last, we propose a sample selection framework for analyzing different members in the Generative Adversarial Networks (GANs) family.
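
For readers unfamiliar with MMD, the sketch below computes the standard unbiased (complete U-statistic) estimate of squared MMD for one-dimensional samples. It is not the incomplete U-statistic estimator this paper proposes, which sums over only a subset of pairs. The Gaussian kernel, the bandwidth of 1.0, and the toy samples are assumptions.

```python
import math

def gaussian_kernel(x, y, bandwidth=1.0):
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))

def mmd2_unbiased(xs, ys, bandwidth=1.0):
    """Unbiased estimate of squared MMD; it can be slightly negative."""
    m, n = len(xs), len(ys)
    # Within-sample terms skip i == j, which is what removes the bias.
    xx = sum(gaussian_kernel(xs[i], xs[j], bandwidth)
             for i in range(m) for j in range(m) if i != j) / (m * (m - 1))
    yy = sum(gaussian_kernel(ys[i], ys[j], bandwidth)
             for i in range(n) for j in range(n) if i != j) / (n * (n - 1))
    xy = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys) / (m * n)
    return xx + yy - 2.0 * xy

sample_p = [-0.5, 0.0, 0.3, 0.8]
near_p = [-0.4, 0.1, 0.2, 0.9]      # toy sample resembling sample_p
far_from_p = [4.6, 5.0, 5.2, 5.9]   # toy sample shifted far away

print(mmd2_unbiased(sample_p, near_p))      # near zero
print(mmd2_unbiased(sample_p, far_from_p))  # clearly positive
```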

artificial intelligence, estimator, machine learning, (13 more...)

arXiv.org Machine Learning

1802.06226

Country: North America > United States (0.28)

Genre: Research Report > Experimental Study (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.71)

Learning Adversarially Fair and Transferable Representations

Madras, David, Creager, Elliot, Pitassi, Toniann, Zemel, Richard

arXiv.org Machine Learning, Feb-17-2018

In this work, we advocate for representation learning as the key to mitigating unfair prediction outcomes downstream. We envision a scenario where learned representations may be handed off to other entities with unknown objectives. We propose and explore adversarial representation learning as a natural method of ensuring those entities will act fairly, and connect group fairness (demographic parity, equalized odds, and equal opportunity) to different adversarial objectives. Through worst-case theoretical guarantees and experimental validation, we show that the choice of this objective is crucial to fair prediction. Furthermore, we present the first in-depth experimental demonstration of fair transfer learning, by showing that our learned representations admit fair predictions on new tasks while maintaining utility, an essential goal of fair representation learning.
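
Of the group fairness notions named in this abstract, demographic parity is the simplest to state: positive predictions should occur at the same rate in each group. The sketch below measures the gap between two groups. It illustrates the metric only, not the adversarial training procedure; the binary group encoding, the toy data, and the function names are assumptions.

```python
def positive_rate(predictions, groups, group):
    """Share of positive predictions among members of one group."""
    members = [p for p, g in zip(predictions, groups) if g == group]
    return sum(members) / len(members)

def demographic_parity_gap(predictions, groups):
    """Absolute difference in positive-prediction rate between groups 0 and 1."""
    return abs(positive_rate(predictions, groups, 0)
               - positive_rate(predictions, groups, 1))

predictions = [1, 0, 1, 1, 0, 0, 1, 0]  # toy binary decisions
groups = [0, 0, 0, 0, 1, 1, 1, 1]       # toy sensitive attribute

gap = demographic_parity_gap(predictions, groups)
print(gap)  # 0.5: group 0 is approved at 0.75, group 1 at 0.25
```

A gap of 0 would mean the predictor satisfies demographic parity exactly on this data.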

artificial intelligence, machine learning, representation, (15 more...)

arXiv.org Machine Learning

1802.06309

Country: North America (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
