Statistical Learning


Scaling assistive healthcare technology with 5G

#artificialintelligence

With recent advances in communication networks and machine learning (ML), healthcare is one of the key application domains that stands to benefit, with opportunities including remote global healthcare, hospital services in the cloud, and remote diagnosis or surgery, among others. One of those advances is network slicing, which makes it possible to provide high-bandwidth, low-latency, personalized healthcare services for individual users. This is important for patients using healthcare monitoring devices that capture various biological signals (biosignals), such as those from the heart (ECG), muscles (EMG), or brain (EEG), or activity from other parts of the body. In this blog, we discuss the challenges of building a scalable delivery platform for such connected healthcare services, and how technological advances can transform this landscape significantly for the benefit of both users and healthcare service providers. Our specific focus is on assistive technology devices, which are increasingly being used by many individuals.


15 Most Common Data Science Interview Questions

#artificialintelligence

Some interviewers ask hard questions while others ask relatively easy ones. As an interviewee, it is up to you to arrive prepared, and in a domain like machine learning, preparation can always fall short: you have to be ready for everything. While preparing, you may reach a point where you wonder what more you should read. Based on the roughly 15-17 data science interviews I have attended, here are 15 very commonly asked and important data science and machine learning questions that came up in almost all of them; I recommend you study these thoroughly.


Adding Explainability to Clustering - Analytics Vidhya

#artificialintelligence

Clustering is an unsupervised learning technique used to determine the intrinsic groups present in unlabelled data. For instance, a B2C business might be interested in finding segments in its customer base. Clustering is hence commonly used for use cases like customer segmentation, market segmentation, pattern recognition, and search result clustering. Some standard clustering techniques are K-means, DBSCAN, and hierarchical clustering, among other methods. Clusters created using techniques like K-means are often not easy to decipher, because it is difficult to determine why a particular row of data was assigned to a particular cluster.
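As a rough illustration of the idea (not a method from the article), one simple way to add explainability is to describe each cluster by the features whose cluster mean deviates most from the global mean. The `explain_clusters` helper and the toy customer data below are hypothetical:

```python
# Hypothetical sketch: explain K-means-style clusters by ranking, for each
# cluster, the features whose cluster mean deviates most from the global mean.
def explain_clusters(points, labels, feature_names):
    n_features = len(points[0])
    global_mean = [sum(p[j] for p in points) / len(points) for j in range(n_features)]
    explanations = {}
    for c in set(labels):
        members = [p for p, l in zip(points, labels) if l == c]
        cluster_mean = [sum(p[j] for p in members) / len(members)
                        for j in range(n_features)]
        # Sort features by absolute deviation from the global mean.
        deviations = sorted(
            ((feature_names[j], cluster_mean[j] - global_mean[j])
             for j in range(n_features)),
            key=lambda t: abs(t[1]), reverse=True,
        )
        explanations[c] = deviations
    return explanations

# Toy customer data: (monthly spend, store visits), two obvious segments.
points = [(1.0, 10.0), (1.2, 11.0), (5.0, 2.0), (5.5, 1.0)]
labels = [0, 0, 1, 1]
print(explain_clusters(points, labels, ["spend", "visits"]))
```

The output reads as "cluster 0 is mainly characterized by above-average visits", which is the kind of human-readable summary the article argues clusters usually lack.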


How to Explore a Dataset of Images with Graph Theory

#artificialintelligence

When you start working on a dataset that consists of pictures, you'll probably be asked questions such as: can you check whether the pictures are good? A quick-and-dirty solution would be to look at the data manually, one picture at a time, and try to sort them out, but that can be tedious work depending on how many pictures you get. For example, in manufacturing, you could get a sample with thousands of pictures from a production line consisting of batteries of different types and sizes. You'd have to go through all the pictures manually and arrange them by type, size, or even color. The more efficient option would be to go the computer vision route and find an algorithm that can automatically arrange and sort your images -- this is the goal of this article. But how can we automate what a person does, i.e., compare pictures two by two and sort them based on similarities?
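To sketch the graph-theory route under stated assumptions (toy pixel lists standing in for real images, and mean absolute difference as a made-up similarity measure), images can be linked when they are similar enough and then grouped via connected components:

```python
from collections import deque

# Illustrative sketch: link "images" (flat lists of pixel intensities) whose
# mean absolute difference is below a threshold, then group them with a
# breadth-first search over connected components.
def similarity_graph(images, threshold=0.1):
    n = len(images)
    adj = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            dist = sum(abs(a - b) for a, b in zip(images[i], images[j])) / len(images[i])
            if dist < threshold:
                adj[i].append(j)
                adj[j].append(i)
    return adj

def connected_components(adj):
    seen, groups = set(), []
    for start in adj:
        if start in seen:
            continue
        queue, group = deque([start]), []
        seen.add(start)
        while queue:
            node = queue.popleft()
            group.append(node)
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        groups.append(sorted(group))
    return groups

# Two near-identical dark "images" and one bright outlier.
images = [[0.1, 0.1, 0.1], [0.12, 0.1, 0.11], [0.9, 0.95, 0.9]]
print(connected_components(similarity_graph(images)))
```

Each connected component is one group of mutually similar pictures, which automates the pairwise comparison a person would do by hand.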


Day 15–60 days of Data Science and Machine Learning

#artificialintelligence

Hope you all had a great Halloween weekend [I dressed up as "Mother of Dragons" along with my cool "Game of Thrones" techie friends] ;) #winteriscoming. Let's get back and learn some more data science and machine learning. I hope you have already grasped the Python essentials, statistics, and maths from Day 1 -- Day 8 (links shared below), Pandas part 1 and part 2 on Day 9 and Day 10, NumPy on Day 11, Data Preprocessing part 1 on Day 12, Data Preprocessing part 2 on Day 13, and Hands-on Regression part 1 on Day 14. In this post, Day 15, we will cover how to implement Regression part 2. Linear regression is essentially a linear approach to modeling the relationship between a scalar dependent variable y and one or more explanatory (independent) variables: it minimizes the least-squares error of the prediction y ≈ x^T · w for each object, where w is the model's weight vector.
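The least-squares fit can be sketched for the single-feature case with an intercept; `fit_simple_ols` is an illustrative helper, not code from the course:

```python
# Minimal sketch of ordinary least squares for one feature plus an intercept:
# minimize sum((y - (w0 + w1 * x))^2) using the closed-form solution.
def fit_simple_ols(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Slope: covariance(x, y) / variance(x).
    w1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
         sum((x - mean_x) ** 2 for x in xs)
    # Intercept: forces the fitted line through the mean point.
    w0 = mean_y - w1 * mean_x
    return w0, w1

# Data generated from y = 2x + 1, so the fit should recover those weights.
w0, w1 = fit_simple_ols([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
print(w0, w1)
```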


Deep Learning with Label Differential Privacy - Channel969

#artificialintelligence

Over the past several years, there has been an increased focus on developing differentially private (DP) machine learning (ML) algorithms. DP has been the basis of several practical deployments in industry -- and has even been employed by the U.S. Census -- because it enables an understanding of the privacy guarantees of systems and algorithms. The underlying assumption of DP is that changing a single user's contribution to an algorithm should not significantly change its output distribution. In the standard supervised learning setting, a model is trained to predict the label for each input, given a training set of example pairs {(input_1, label_1), …, (input_n, label_n)}. In the case of deep learning, previous work introduced a DP training framework, DP-SGD, which was integrated into TensorFlow and PyTorch.
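As a schematic sketch of the DP-SGD idea (not the actual TensorFlow or PyTorch integration, and with hypothetical parameter values): clip each per-example gradient to an L2 norm bound, sum, and add Gaussian noise scaled by the clip norm before averaging:

```python
import math
import random

# Schematic DP-SGD-style gradient step: per-example clipping bounds each
# user's contribution, and Gaussian noise masks any single example's effect.
def dp_sgd_gradient(per_example_grads, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    rng = rng or random.Random(0)
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        # Scale the gradient down if its L2 norm exceeds clip_norm.
        norm = math.sqrt(sum(g * g for g in grad))
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for j in range(dim):
            total[j] += grad[j] * scale
    # Noise standard deviation is proportional to the clipping bound.
    sigma = noise_multiplier * clip_norm
    return [(t + rng.gauss(0.0, sigma)) / len(per_example_grads) for t in total]

# One large gradient (gets clipped) and one small gradient (left alone).
grads = [[3.0, 4.0], [0.3, 0.4]]
print(dp_sgd_gradient(grads))
```

Clipping is what makes the DP assumption concrete: no single example can move the summed gradient by more than `clip_norm` before noise is added.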


Development and internal validation of a machine-learning-developed model for predicting 1-year mortality after fragility hip fracture - BMC Geriatrics

#artificialintelligence

Fragility hip fracture increases morbidity and mortality in older adult patients, especially within the first year. Identification of patients at high risk of death facilitates modification of associated perioperative factors that can reduce mortality. Various machine learning algorithms have been developed and are widely used in healthcare research, particularly for mortality prediction. This study aimed to develop and internally validate 7 machine learning models to predict 1-year mortality after fragility hip fracture. This retrospective study included patients with fragility hip fractures from a single center (Siriraj Hospital, Bangkok, Thailand) from July 2016 to October 2018. A total of 492 patients were enrolled. They were randomly categorized into a training group (344 cases, 70%) or a testing group (148 cases, 30%). Various machine learning techniques were used: the Gradient Boosting Classifier (GB), Random Forest Classifier (RF), Artificial Neural Network Classifier (ANN), Logistic Regression Classifier (LR), Naive Bayes Classifier (NB), Support Vector Machine Classifier (SVM), and K-Nearest Neighbors Classifier (KNN). All models were internally validated by evaluating their performance and the area under the receiver operating characteristic curve (AUC). For the testing dataset, the accuracies were GB model = 0.93, RF model = 0.95, ANN model = 0.94, LR model = 0.91, NB model = 0.89, SVM model = 0.90, and KNN model = 0.90. All models achieved high AUCs, ranging between 0.81 and 0.99. The RF model also provided a negative predictive value of 0.96, a positive predictive value of 0.93, a specificity of 0.99, and a sensitivity of 0.68. Our machine learning approach facilitated the successful development of an accurate model to predict 1-year mortality after fragility hip fracture.
Several machine learning algorithms (e.g., Gradient Boosting and Random Forest) had the potential to provide high predictive performance based on the clinical parameters of each patient. The web application is available at www.hipprediction.com. External validation in a larger group of patients or in different hospital settings is warranted to evaluate the clinical utility of this tool. Thai Clinical Trials Registry (22 February 2021; reg. no. TCTR20210222003).
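As a stand-alone illustration of the AUC metric the study reports (not the study's own code), AUC can be computed directly as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case:

```python
# AUC via its rank interpretation: the fraction of (positive, negative)
# pairs where the positive case scores higher; ties count as half a win.
def roc_auc(y_true, y_score):
    pairs, wins = 0, 0.0
    for yt_i, ys_i in zip(y_true, y_score):
        if yt_i != 1:
            continue
        for yt_j, ys_j in zip(y_true, y_score):
            if yt_j != 0:
                continue
            pairs += 1
            if ys_i > ys_j:
                wins += 1.0
            elif ys_i == ys_j:
                wins += 0.5
    return wins / pairs

# Toy labels and predicted mortality probabilities.
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.2]))
```

This pairwise view explains why AUC can be high even when sensitivity at a fixed threshold is modest, as with the RF model's 0.68 sensitivity here.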


Know About Ensemble Methods in Machine Learning - Analytics Vidhya

#artificialintelligence

This article was published as a part of the Data Science Blogathon. The bias is the difference between the model's average prediction and the ground-truth value, whereas the variance is the model's sensitivity to tiny perturbations in the training set. Excessive bias might cause an algorithm to miss important relationships between the features and the intended outputs (underfitting). An algorithm with high variance models the random noise in the training data (overfitting). The bias-variance tradeoff is the property that lowering the bias in the estimated parameters tends to increase the variance of the parameter estimates across samples, and vice versa.
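A toy sketch of the variance side of this tradeoff (hypothetical data, with the sample mean of a bootstrap resample standing in for a "model"): averaging many such estimates, as bagging ensembles do, shrinks the variance of the final prediction:

```python
import random
import statistics

# A "model" here is just the mean of a bootstrap resample of the data.
def bootstrap_mean(data, rng):
    sample = [rng.choice(data) for _ in data]
    return sum(sample) / len(sample)

# A bagged ensemble averages many bootstrap "models".
def bagged_mean(data, rng, n_models=25):
    return sum(bootstrap_mean(data, rng) for _ in range(n_models)) / n_models

rng = random.Random(42)
data = [rng.gauss(0.0, 1.0) for _ in range(50)]

# Repeat each estimator many times and compare the spread of its outputs.
single = [bootstrap_mean(data, rng) for _ in range(200)]
bagged = [bagged_mean(data, rng) for _ in range(200)]
print(statistics.pvariance(single) > statistics.pvariance(bagged))
```

The bagged estimator's outputs vary far less run to run, which is exactly why ensembling is the standard remedy for high-variance (overfitting) models.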


Best Papers to Read on the Mean Shift Algorithm

#artificialintelligence

Abstract: Two important nonparametric approaches to clustering emerged in the 1970s: clustering by level sets or the cluster tree, as proposed by Hartigan, and clustering by gradient lines or gradient flow, as proposed by Fukunaga and Hostetler. In a recent paper, we argue the thesis that these two approaches are fundamentally the same by showing that the gradient flow provides a way to move along the cluster tree. In making a stronger case, we are confronted with the fact that the cluster tree does not define a partition of the entire support of the underlying density, while the gradient flow does. Abstract: Mean shift is a simple iterative procedure that gradually shifts data points towards the mode, which denotes the highest density of data points in the region. Mean shift algorithms have been effectively used for data denoising, mode seeking, and finding the number of clusters in a dataset in an automated fashion.
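A minimal 1-D mean shift sketch with a flat window (illustrative, not from either paper): each point is repeatedly replaced by the mean of the data points within a bandwidth of it, walking it uphill to a local density mode:

```python
# One mean shift trajectory: shift x toward the mean of its neighborhood
# until it stops moving, i.e. until it reaches a local density mode.
def mean_shift_point(x, data, bandwidth=1.0, tol=1e-6, max_iter=100):
    for _ in range(max_iter):
        window = [d for d in data if abs(d - x) <= bandwidth]
        if not window:
            return x  # no neighbors: nowhere to shift
        new_x = sum(window) / len(window)
        if abs(new_x - x) < tol:
            break
        x = new_x
    return x

# Two density clumps: starting points converge to their nearest mode.
data = [1.0, 1.1, 1.2, 5.0, 5.1]
print(mean_shift_point(0.9, data))
print(mean_shift_point(5.2, data))
```

Points started near either clump end at that clump's mode, which is how mean shift finds the number of clusters automatically: distinct convergence points correspond to distinct clusters.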


Research Papers based on Lasso Regression part2(Machine Learning)

#artificialintelligence

Abstract: The application of the lasso is espoused in high-dimensional settings where only a small number of the regression coefficients are believed to be nonzero. Moreover, statistical properties of high-dimensional lasso estimators are often proved under the assumption that the correlation between the predictors is bounded. In this vein, coordinatewise methods, the most common means of computing the lasso solution, work well in the presence of low to moderate multicollinearity. Motivated by these limitations, we propose the novel "Deterministic Bayesian Lasso" algorithm for computing the lasso solution. This algorithm is developed by considering a limiting version of the Bayesian lasso.
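The coordinatewise approach the abstract mentions can be sketched as coordinate descent with soft thresholding on the lasso objective (1/2)·||y − Xw||² + λ·||w||₁; the tiny orthogonal design below is a made-up example:

```python
# Soft thresholding: the one-dimensional solution of the lasso subproblem.
def soft_threshold(rho, lam):
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

# Cycle through coordinates, solving each one exactly while holding the
# others fixed -- the standard coordinatewise lasso solver.
def lasso_coordinate_descent(X, y, lam, n_iter=50):
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the partial residual (excluding j).
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * w[k] for k in range(p) if k != j))
                for i in range(n)
            )
            z = sum(X[i][j] ** 2 for i in range(n))
            w[j] = soft_threshold(rho, lam) / z
    return w

# Orthogonal toy design; y depends only on the first feature (y = 2 * x1),
# so the lasso should shrink w[0] slightly below 2 and keep w[1] at zero.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.0, 2.0, 0.0]
print(lasso_coordinate_descent(X, y, lam=0.1))
```

With correlated (multicollinear) predictors the same sweeps converge much more slowly, which is the limitation motivating the abstract's alternative algorithm.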