Statistical Learning


How to Build an Online Machine Learning App With Python

#artificialintelligence

Machine learning is rapidly becoming as ubiquitous as data itself. Wherever data is abundant, machine learning is usually involved in some way. After all, what utility would data have if we could not use it to predict something about the future? Luckily, a plethora of toolkits and frameworks have made it rather simple to deploy ML in Python. In particular, scikit-learn (sklearn) has done a terrific job of making ML accessible to developers.


Papers to Read on Stochastic Gradient Descent

#artificialintelligence

Abstract: We study the Stochastic Gradient Descent (SGD) algorithm in nonparametric statistics, kernel regression in particular. The directional bias property of SGD, which is known in the linear regression setting, is generalized to kernel regression. More specifically, we prove that SGD with a moderate and annealing step size converges along the direction of the eigenvector that corresponds to the largest eigenvalue of the Gram matrix. These facts are referred to as the directional bias properties; they may explain why an SGD-computed estimator has a potentially smaller generalization error than a GD-computed estimator. The application of our theory is demonstrated by simulation studies and a case study based on the FashionMNIST dataset.
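For readers unfamiliar with SGD itself, the following is a minimal sketch of the basic update with an annealed (decaying) step size. It fits an ordinary linear least-squares model, not the kernel regression studied in the paper, and it does not demonstrate the directional bias result. The function name, the step-size schedule, and the toy data are all assumptions made for illustration.

```python
import random

def sgd_linear_regression(xs, ys, steps=10000, eta0=0.5, decay=1000.0, seed=0):
    """Fit y ~ w * x + b by SGD on the squared error (illustrative sketch)."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for t in range(steps):
        # Pick one training example at random: this is the "stochastic" part
        i = rng.randrange(len(xs))
        # Annealed step size: starts at eta0 and shrinks as t grows (assumed schedule)
        eta = eta0 / (1.0 + t / decay)
        # Gradient of 0.5 * (prediction - target)^2 with respect to w and b
        error = (w * xs[i] + b) - ys[i]
        w -= eta * error * xs[i]
        b -= eta * error
    return w, b

# Noise-free toy data generated from y = 2x + 1 (made up for this example)
xs = [i / 50 for i in range(50)]
ys = [2.0 * x + 1.0 for x in xs]

w, b = sgd_linear_regression(xs, ys)
print(f"w = {w:.4f}, b = {b:.4f}")
```

Full-batch gradient descent (GD) would instead average the gradient over every example at each step; the abstract's comparison is between estimators produced by these two procedures.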


Introduction to Machine Learning: K Nearest Neighbors (KNN) - PythonAlgos

#artificialintelligence

K Nearest Neighbors (KNN) is a standard machine learning algorithm used for classification. In KNN, we plot already labeled points and then define decision boundaries based on the value of the hyperparameter "K". A hyperparameter is simply a parameter that we control and can use for tuning. "K" is the number of nearest neighbors we take into account when determining the class of a new point. In this post we'll cover how to do KNN on two datasets: one contrived sample dataset and one more realistic dataset about wine from sklearn.
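The idea can be sketched in a few lines of plain Python. This is not the code from the linked post: the helper `knn_predict` and the small two-cluster dataset are hypothetical, and Euclidean distance with a simple majority vote is assumed.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest labeled points."""
    # Euclidean distance from the query to every labeled point
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest points and count how often each label appears
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Contrived dataset (made up for this example): two clusters in the plane
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
labels = ["a", "a", "a", "b", "b", "b"]

pred_low = knn_predict(points, labels, (2.0, 2.0), k=3)
pred_high = knn_predict(points, labels, (6.0, 7.0), k=3)
print(pred_low, pred_high)
```

Changing `k` changes how many neighbors vote, which is why it is treated as a tunable hyperparameter; an odd value avoids ties when there are two classes.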


CO2 emissions dataset in USA: a statistical analysis, using Python

#artificialintelligence

Disclaimer: This notebook has not been written by a climate scientist! Everything is analyzed exclusively from a data scientist's point of view. All the statistical analyses are meant to be used as tools for time series analysis of any kind. Let's start by stating the obvious: the job of a data scientist is to extract insights. The complexity of the tool that you are using is not really relevant.


12 Best Deep Learning Courses on Coursera

#artificialintelligence

This is another specialization program offered by Coursera, aimed at both computer science professionals and healthcare professionals. In it, you will learn how to identify problems faced by healthcare professionals that can be solved by machine learning. You will also learn the fundamentals of the U.S. healthcare system, the framework for successful and ethical medical data mining, the fundamentals of machine learning as it applies to medicine and healthcare, and much more. The specialization has 5 courses. Let's see the details of the courses.


Machine Learning Algorithms Cheat Sheet

#artificialintelligence

Machine learning is a subfield of artificial intelligence (AI) and computer science that focuses on using data and algorithms to mimic the way people learn, progressively improving in accuracy. It is one of the most interesting methods in computer science these days, and it is applied behind the scenes in products and services we use in everyday life. If you want to know which machine learning algorithms are used in different applications, or if you are a developer looking for a method to solve a particular problem, keep reading and use these steps as a guide. Machine learning can be divided into three types of learning: unsupervised learning, supervised learning, and semi-supervised learning. Unsupervised learning uses data that is not labeled, so the machine must work without guidance, relying on patterns, similarities, and differences. Supervised learning, on the other hand, involves a "teacher" who is in charge of training the machine by labeling the data to work with. The machine then receives examples that allow it to produce a correct outcome.


4 different approaches for Time Series Analysis

#artificialintelligence

The first three approaches exploit differencing to make the time series stationary. First, I import the dataset on tourist arrivals to Italy from 1990 to 2019 and convert it into a time series. Data are extracted from the European Statistics: Annual Data on Tourism Industries. I use the matplotlib library. Usually, when performing time series analysis, a time series is not split into training and test sets, because the whole series is needed to get a good forecast. However, in this tutorial, I split the time series into two parts, training and test, in order to evaluate the performance of the models.
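Differencing and a chronological train/test split can be sketched as follows. This is an illustration only: the helper names are hypothetical and the numbers are made up, not the Italian tourist-arrival data used in the tutorial.

```python
def difference(series, lag=1):
    """Return the lagged differences series[i] - series[i - lag]."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

def split_ordered(series, test_size):
    """Split a time series chronologically: the last `test_size` points form the test set."""
    split = len(series) - test_size
    return series[:split], series[split:]

# Made-up yearly values with an accelerating trend (illustrative only)
arrivals = [100, 104, 109, 115, 122, 130, 139, 149]

# Keep the time order: never shuffle a time series before splitting
train, test = split_ordered(arrivals, test_size=2)

# First differences remove a linear trend; here they still grow...
diff1 = difference(train)
# ...so a second round of differencing is applied, giving a constant series
diff2 = difference(diff1)

print(train, test)
print(diff1, diff2)
```

How many rounds of differencing are needed depends on the data; in practice a stationarity test or a plot of the differenced series is used to decide.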


A Straightforward HPV16 Lineage Classification Based on Machine Learning

#artificialintelligence

Human Papillomavirus (HPV) is the causal agent of 5% of cancers worldwide and the main cause of cervical cancer; it is also associated with a significant percentage of oropharyngeal and anogenital cancers. More than 60% of cervical cancers are caused by the HPV16 genotype, which has been classified into lineages (A, B, C, and D). Lineages are related to the progression of cervical cancer, and the current method to assess lineages is to build a Maximum Likelihood Tree (MLT), which is slow, cannot assess poorly sequenced samples, and requires manual annotation. In this study, we have developed a new model to assess HPV16 lineage using machine learning tools. A total of 645 HPV16 genomes were analyzed using a Genome-Wide Association Study (GWAS), which identified 56 lineage-specific Single Nucleotide Polymorphisms (SNPs). From the SNPs found, training-test models were constructed using different algorithms such as Random Forest (RF), Support Vector Machine (SVM), and K-nearest neighbor (KNN). A distinct set of HPV16 sequences (n = 1,028), whose lineage was previously determined by MLT, was used for validation. The RF-based model allowed a precise assignment of HPV16 lineage, showing an accuracy of 99.5% in the known lineage samples. Moreover, the RF model could assign a lineage to 273 samples that MLT could not determine. In terms of computing time, the RF-based model was almost 40 times faster than MLT. Having a fast and efficient method for assigning HPV16 lineages,...


XGBoost: its present-day powers and use cases

#artificialintelligence

Originally published on Towards AI.


What is quantum artificial intelligence? - Dataconomy

#artificialintelligence

Quantum artificial intelligence is here to pave the way for the next chapter in our pursuit of digital intelligence. Artificial intelligence is a transformative technology, and it needs quantum computing to achieve significant improvement. Although artificial intelligence can run on conventional computers, it is restricted by their computational capabilities. Quantum computing can enhance artificial intelligence's capacity to tackle more complex problems. Quantum artificial intelligence allows quantum computing to be used with machine learning algorithms.