AITopics | Performance Analysis


Asymptotics of the Empirical Bootstrap Method Beyond Asymptotic Normality

arXiv.org Machine Learning, Nov-23-2020

One of the most commonly used methods for forming confidence intervals for statistical inference is the empirical bootstrap, which is especially expedient when the limiting distribution of the estimator is unknown. However, despite its ubiquitous role, its theoretical properties are still not well understood for non-asymptotically normal estimators. In this paper, under stability conditions, we establish the limiting distribution of the empirical bootstrap estimator, derive tight conditions for it to be asymptotically consistent, and quantify the speed of convergence. Moreover, we propose three alternative ways to use the bootstrap method to build confidence intervals with coverage guarantees. Finally, we illustrate the generality and tightness of our results by a series of examples, including uniform confidence bands, two-sample kernel tests, minmax stochastic programs and the empirical risk of stacked estimators.
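For orientation, the standard percentile form of the empirical bootstrap can be sketched as below. This is a minimal illustration for the sample mean, an asymptotically normal estimator where the method is well understood; it is not one of the three alternative constructions proposed in the paper. The function name `percentile_bootstrap_ci`, the synthetic data, and all parameter values are assumptions made for the example.

```python
import numpy as np

def percentile_bootstrap_ci(sample, statistic, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile confidence interval from the empirical bootstrap (illustrative helper)."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample)
    n = sample.shape[0]
    # Draw n indices with replacement for each bootstrap resample.
    idx = rng.integers(0, n, size=(n_resamples, n))
    # Recompute the statistic on every resample.
    replicates = np.array([statistic(sample[row]) for row in idx])
    # Use the empirical quantiles of the replicates as interval endpoints.
    lower, upper = np.quantile(replicates, [alpha / 2, 1 - alpha / 2])
    return float(lower), float(upper)

# Synthetic data, assumed purely for illustration.
rng = np.random.default_rng(42)
data = rng.normal(loc=1.0, scale=2.0, size=200)

ci_low, ci_high = percentile_bootstrap_ci(data, np.mean)
print(f"sample mean = {data.mean():.3f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```

Passing a different `statistic` (for example `np.max`) runs the same procedure on an estimator that is not asymptotically normal, which is the regime where, according to the abstract, consistency can no longer be taken for granted.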

bootstrap method, estimator, sequence, (15 more...)

arXiv.org Machine Learning

2011.11248

Country:

North America > United States (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Africa > South Sudan > Equatoria > Central Equatoria > Juba (0.04)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.71)


The Best Machine Learning Algorithm for Handwritten Digits Recognition

#artificialintelligence, Nov-22-2020, 21:25:43 GMT

Handwritten digit recognition is an interesting machine learning problem in which we identify handwritten digits using various classification algorithms. There are many algorithms for recognizing handwritten digits, including deep learning/CNNs, SVM, Gaussian Naive Bayes, KNN, decision trees, and random forests. In this article, we will apply a variety of machine learning algorithms from the scikit-learn library to our dataset to classify the digits into their categories. We will use scikit-learn's load_digits dataset, which is a collection of 8x8 images (64 features) of digits. The dataset contains a total of 1797 sample points.
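A minimal sketch of that comparison is shown below, using scikit-learn's `load_digits` as named in the summary. The choice of three classifiers, the 75/25 stratified split, and the hyperparameters are assumptions for illustration and may differ from what the article actually runs.

```python
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Each 8x8 image is flattened into 64 pixel-intensity features.
digits = load_digits()
X, y = digits.data, digits.target

# Hold out a quarter of the samples for evaluation (split ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# A small, assumed selection of the classifier families listed in the summary.
models = {
    "KNN": KNeighborsClassifier(n_neighbors=3),
    "SVM": SVC(),
    "GaussianNB": GaussianNB(),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: test accuracy = {scores[name]:.3f}")
```

A single train/test split gives a noisy ranking on a dataset this small, so the resulting ordering of the models should be read as indicative only.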

classification report, classifier, confusion matrix, (10 more...)


Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.79)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.73)


Unfolding the Maths behind Ridge and Lasso Regression!

#artificialintelligence, Nov-22-2020, 18:05:34 GMT

This article was published as a part of the Data Science Blogathon. Many times we have come across this statement – Lasso regression causes sparsity while Ridge regression doesn't! But I'm pretty sure that most of us might not have understood how exactly this works. Let's try to understand this using calculus. First, let's understand what sparsity is.
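The sparsity claim can be illustrated with the standard one-dimensional closed forms, sketched below. This is not necessarily the derivation the article follows: it assumes a single coefficient with unpenalized least-squares value `z`, and the objective scalings, function names, and numbers are all chosen for the example.

```python
import numpy as np

def ridge_1d(z, lam):
    # Minimizer of 0.5*(z - b)**2 + 0.5*lam*b**2: proportional shrinkage.
    return z / (1.0 + lam)

def lasso_1d(z, lam):
    # Minimizer of 0.5*(z - b)**2 + lam*|b|: soft-thresholding.
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

# Assumed unpenalized coefficient values, for illustration only.
z = np.array([-2.0, -0.3, 0.1, 0.8, 3.0])
lam = 0.5

ridge_coefs = ridge_1d(z, lam)
lasso_coefs = lasso_1d(z, lam)

print("ridge:", ridge_coefs)
print("lasso:", lasso_coefs)
print("nonzero ridge coefficients:", np.count_nonzero(ridge_coefs))
print("nonzero lasso coefficients:", np.count_nonzero(lasso_coefs))
```

Ridge divides every coefficient by the same factor, so a nonzero value never becomes exactly zero. Lasso subtracts `lam` from the magnitude and clips at zero, so any coefficient with `|z| <= lam` is set exactly to zero.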

regression, regularization, ridge regression, (14 more...)


Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.72)


Cancer image classification based on DenseNet model

Zhong, Ziliang, Zheng, Muhang, Mai, Huafeng, Zhao, Jianan, Liu, Xinyi

arXiv.org Machine Learning, Nov-22-2020

Computer-aided diagnosis establishes methods for robust assessment of medical image-based examination. Image processing introduced a promising strategy to facilitate disease classification and detection while diminishing unnecessary expenses. In this paper, we propose a novel metastatic cancer image classification model based on the DenseNet block, which can effectively identify metastatic cancer in small image patches taken from larger digital pathology scans. We evaluate the proposed approach on a slightly modified version of the PatchCamelyon (PCam) benchmark dataset provided by a Kaggle competition, which packs the clinically relevant task of metastasis detection into a straightforward binary image classification task. The experiments indicate that our model outperforms other classical methods such as ResNet34 and VGG19. Moreover, we conducted a data augmentation experiment and studied the relationship between the number of batches processed and the loss value during training and validation.

cancer, classification, image classification, (12 more...)

arXiv.org Machine Learning

doi: 10.1088/issn.1742-6596

2011.11186

Country:

Asia > China > Shanghai > Shanghai (0.05)
North America > United States > New York (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(3 more...)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Diagnostic Medicine (0.93)
Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (0.30)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.96)


Positive and Unlabeled Materials Machine Learning

#artificialintelligence, Nov-21-2020, 18:00:45 GMT

Many real-world problems involve datasets where only some of the data is labeled and the rest is unlabeled. In this post, we discuss our implementation of semi-supervised learning for predicting the synthesizability of theoretical materials. When we think about the materials that will enable next-generation technologies, it's probably not the case that there is one ultimate material waiting to be found that will solve all our problems. The problems we need to solve (producing and storing clean energy, mitigating climate change, desalinating water, etc.) are complex and varied. Even zooming in on the next generation of electronics, computers, and nanotechnology, there probably isn't a single perfect material to exploit in the same way that silicon has been used in all our familiar devices.

compound, synthesizability, unlabeled material machine learning, (12 more...)


Industry: Energy (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.35)


Things to Keep in Mind Before Applying for Next Data Science Job

#artificialintelligence, Nov-20-2020, 20:55:29 GMT

It is now a well-established fact that data science jobs are on an exponential rise. With companies trying to analyze data to gain valuable insights, understand trends and more, data science roles such as data scientists, data engineers, data analysts, analytics specialists, consultants, and insights analysts are in higher demand than ever. No wonder Harvard Business Review named it the sexiest job of the 21st century in October 2012. However, preparing for a data science position can be intimidating. It is often suggested that the key to cracking such an interview is technical preparation and technological aptitude.

applicant, data science job, interview, (3 more...)


Technology:

Information Technology > Data Science (1.00)
Information Technology > Communications > Social Media (0.78)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.77)


Seismic Facies Analysis: A Deep Domain Adaptation Approach

Nasim, M Quamer, Maiti, Tannistha, Shrivastava, Ayush, Singh, Tarry, Mei, Jie

arXiv.org Artificial Intelligence, Nov-20-2020

Deep neural networks (DNNs) can learn accurately from large quantities of labeled input data, but DNNs sometimes fail to generalize to test data sampled from different input distributions. Unsupervised Deep Domain Adaptation (DDA) proves useful when no input labels are available and distribution shifts are observed in the target domain (TD). Experiments are performed on seismic images of the F3 block 3D dataset from offshore Netherlands (source domain; SD) and Penobscot 3D survey data from Canada (target domain; TD). Three geological classes from SD and TD that have similar reflection patterns are considered. In the present study, an improved deep neural network architecture named EarthAdaptNet (EAN) is proposed to semantically segment the seismic images. We specifically use a transposed residual unit to replace the traditional dilated convolution in the decoder block. The EAN achieved a pixel-level accuracy >84% and an accuracy of ~70% for the minority classes, showing improved performance compared to existing architectures. In addition, we introduced the CORAL (Correlation Alignment) method to the EAN to create an unsupervised deep domain adaptation network (EAN-DDA) for the classification of seismic reflections from F3 and Penobscot. Maximum class accuracy achieved was ~99% for class 2 of Penobscot with >50% overall accuracy. Taken together, EAN-DDA has the potential to classify target domain seismic facies classes with high accuracy.

architecture, deep learning, upstream oil & gas, (23 more...)

arXiv.org Artificial Intelligence

2011.1051

Country:

Europe > Netherlands (0.36)
North America > United States (0.28)
North America > Canada > Quebec (0.14)
Asia > India > West Bengal (0.14)

Genre: Research Report (1.00)

Industry: Energy > Oil & Gas > Upstream (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)


Proper Model Selection through Cross Validation

#artificialintelligence, Nov-19-2020, 20:31:10 GMT

So, what is cross validation? Recall my post about model selection, where we saw that it may be necessary to split data into three portions: one for training, one for validation (to choose among models), and a last one to measure the true accuracy. This procedure is one viable way to choose the best among several models. Cross validation (CV) is not too different from this idea, but handles model training and validation in quite a smart way. For CV we use a larger combined training and validation data set, followed by a testing dataset.
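The procedure described above can be sketched as follows: a test portion is set aside first, k-fold CV on the combined training/validation portion picks a model, and the test portion is used once at the end. The candidate models (polynomials of different degree), the synthetic data, and the helper name `kfold_mse` are assumptions for illustration, not taken from the post.

```python
import numpy as np

def kfold_mse(x, y, degree, k=5, seed=0):
    """Mean validation MSE of a polynomial fit over k folds (illustrative helper)."""
    rng = np.random.default_rng(seed)
    # Shuffle the indices once and cut them into k roughly equal folds.
    folds = np.array_split(rng.permutation(len(x)), k)
    errors = []
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        # Fit on k-1 folds, score on the fold that was left out.
        coefs = np.polyfit(x[train_idx], y[train_idx], degree)
        pred = np.polyval(coefs, x[val_idx])
        errors.append(np.mean((pred - y[val_idx]) ** 2))
    return float(np.mean(errors))

# Synthetic quadratic data with noise, assumed for the example.
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=120)
y = 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(scale=0.1, size=120)

# Set the test portion aside; CV only ever sees the first 100 points.
x_cv, y_cv = x[:100], y[:100]
x_test, y_test = x[100:], y[100:]

# Model selection: choose the degree with the lowest cross-validated error.
cv_scores = {d: kfold_mse(x_cv, y_cv, d) for d in (1, 2, 5)}
best_degree = min(cv_scores, key=cv_scores.get)

# Refit on all CV data, then measure accuracy once on the untouched test portion.
final_coefs = np.polyfit(x_cv, y_cv, best_degree)
test_mse = float(np.mean((np.polyval(final_coefs, x_test) - y_test) ** 2))
print(cv_scores, best_degree, test_mse)
```

Every point in the CV portion is used for both training and validation across the folds, which is what lets CV work with one larger combined set instead of a fixed validation split.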

cross validation, proper model selection, validation, (5 more...)


Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.61)


Optimizing Approximate Leave-one-out Cross-validation to Tune Hyperparameters

Burn, Ryan

arXiv.org Machine Learning, Nov-19-2020

For a large class of regularized models, leave-one-out cross-validation can be efficiently estimated with an approximate leave-one-out formula (ALO). We consider the problem of adjusting hyperparameters so as to optimize ALO. We derive efficient formulas to compute the gradient and hessian of ALO and show how to apply a second-order optimizer to find hyperparameters. We demonstrate the usefulness of the proposed approach by finding hyperparameters for regularized logistic regression and ridge regression on various real-world data sets.
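To make the idea of estimating leave-one-out error from a single fit concrete, the sketch below uses the classical hat-matrix shortcut for ridge regression without an intercept and checks it against brute-force refitting. It does not implement the paper's ALO gradient or Hessian formulas or its second-order optimizer; a plain grid search stands in for the optimization step, and the data and function names are assumptions.

```python
import numpy as np

def ridge_loo_shortcut(X, y, lam):
    """Leave-one-out residuals for ridge from one fit, via the hat matrix."""
    p = X.shape[1]
    # H maps y to fitted values: H = X (X^T X + lam I)^{-1} X^T.
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
    # Each leave-one-out residual is the ordinary residual divided by (1 - H_ii).
    return (y - H @ y) / (1.0 - np.diag(H))

def ridge_loo_brute_force(X, y, lam):
    """Leave-one-out residuals obtained by refitting n times, for comparison."""
    n, p = X.shape
    residuals = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        A = X[keep].T @ X[keep] + lam * np.eye(p)
        beta = np.linalg.solve(A, X[keep].T @ y[keep])
        residuals[i] = y[i] - X[i] @ beta
    return residuals

# Synthetic regression problem, assumed for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 8))
y = X @ rng.normal(size=8) + rng.normal(scale=0.5, size=60)

fast = ridge_loo_shortcut(X, y, lam=1.0)
slow = ridge_loo_brute_force(X, y, lam=1.0)
print("max difference:", np.max(np.abs(fast - slow)))

# Grid search over the penalty, standing in for a gradient-based optimizer.
grid = [0.01, 0.1, 1.0, 10.0, 100.0]
loo_mse = {lam: float(np.mean(ridge_loo_shortcut(X, y, lam) ** 2)) for lam in grid}
best_lam = min(loo_mse, key=loo_mse.get)
print("selected penalty:", best_lam)
```

Because the shortcut costs about as much as one fit, the leave-one-out error becomes cheap enough to evaluate repeatedly while tuning the penalty.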

hyperparameter, logistic regression, regression, (13 more...)

arXiv.org Machine Learning

2011.10218

Country: North America > United States > Washington > King County > Seattle (0.14)

Genre: Research Report > New Finding (0.51)

Industry: Health & Medicine > Therapeutic Area (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (0.63)


30 Machine Learning Interview Questions With Answers

#artificialintelligence, Nov-17-2020, 06:41:06 GMT

Machine learning interview questions are an essential part of the data science interview and of your path to becoming a data scientist. I've divided this guide to machine learning interview questions and answers into categories so that you can more easily find the information you need. Supervised learning requires training on labelled data. For example, to do classification, which is a supervised learning task, you'll first need to label the data you'll use to train the model to classify data into your labelled groups. Unsupervised learning, in contrast, does not require explicitly labelled data.

algorithm, download detailed curriculum, machine learning, (13 more...)


Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
