 Accuracy


What is Predictive Model Performance Evaluation - DIMENSIONLESS TECHNOLOGIES PVT. LTD.

#artificialintelligence

Evaluation metrics are tied to machine learning tasks. Classification, regression, ranking, clustering, topic modelling, and other tasks each have their own metrics. Some metrics, such as precision and recall, are useful for multiple tasks. Classification, regression, and ranking are examples of supervised learning, which makes up the majority of machine learning applications. In this blog, we'll be focusing on the metrics for supervised learning models.
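As a small illustration of the two metrics named above, the sketch below computes precision and recall for a binary classifier from scratch. The function name `precision_recall` and the convention of returning 0.0 when a denominator is empty are assumptions made for this example, not something the post specifies.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    # True positives: predicted positive and actually positive.
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    # False positives: predicted positive but actually negative.
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    # False negatives: predicted negative but actually positive.
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    # Assumed convention: report 0.0 when the denominator is empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    labels = [1, 0, 1, 1, 0, 0]
    predictions = [1, 1, 1, 0, 0, 0]
    print(precision_recall(labels, predictions))
```

Precision asks how many of the predicted positives were correct, while recall asks how many of the actual positives were found, which is why the same pair of numbers carries over to ranking and retrieval tasks.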


Continuous Integration of Machine Learning Models with ease.ml/ci: Towards a Rigorous Yet Practical Treatment

arXiv.org Machine Learning

Continuous integration is an indispensable step of modern software engineering practices to systematically manage the life cycles of system development. Developing a machine learning model is no different: it is an engineering process with a life cycle, including design, implementation, tuning, testing, and deployment. However, most, if not all, existing continuous integration engines do not support machine learning as a first-class citizen. In this paper, we present ease.ml/ci, to the best of our knowledge the first continuous integration system for machine learning. The challenge of building ease.ml/ci is to provide rigorous guarantees, e.g., single accuracy point error tolerance with 0.999 reliability, with a practical amount of labeling effort, e.g., 2K labels per test. We design a domain-specific language that allows users to specify integration conditions with reliability constraints, and develop simple novel optimizations that can lower the number of labels required by up to two orders of magnitude for test conditions popularly used in real production systems.
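To give a feel for why the labeling effort is the hard part, the sketch below computes the number of test labels a plain two-sided Hoeffding bound would require for a given error tolerance and reliability. This is a textbook baseline chosen for illustration, not the estimator or the optimizations used by ease.ml/ci, and the function name `hoeffding_labels` is hypothetical.

```python
import math


def hoeffding_labels(epsilon, delta):
    """Labels needed so that |empirical accuracy - true accuracy| <= epsilon
    holds with probability at least 1 - delta, by the two-sided Hoeffding
    bound: n >= ln(2 / delta) / (2 * epsilon**2).

    Assumption: i.i.d. test samples and a single, non-adaptive evaluation.
    """
    if not (0 < epsilon < 1 and 0 < delta < 1):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


if __name__ == "__main__":
    # One accuracy point of tolerance (0.01) at 0.999 reliability (delta = 0.001).
    print(hoeffding_labels(0.01, 0.001))
```

Under these assumptions the baseline asks for tens of thousands of labels for one accuracy point at 0.999 reliability, which makes the 2K-labels-per-test figure quoted in the abstract a useful point of comparison.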


On the usage of the probability integral transform to reduce the complexity of multi-way fuzzy decision trees in Big Data classification problems

arXiv.org Machine Learning

We present a new distributed fuzzy partitioning method to reduce the complexity of multi-way fuzzy decision trees in Big Data classification problems. The proposed algorithm builds a fixed number of fuzzy sets for all variables and adjusts their shape and position to the real distribution of training data. A two-step process is applied: 1) transformation of the original distribution into a standard uniform distribution by means of the probability integral transform (since the original distribution is generally unknown, the cumulative distribution function is approximated by computing the q-quantiles of the training set); 2) construction of a Ruspini strong fuzzy partition in the transformed attribute space using a fixed number of equally distributed triangular membership functions. Despite the aforementioned transformation, the definition of every fuzzy set in the original space can be recovered by applying the inverse cumulative distribution function (also known as quantile function). The experimental results reveal that the proposed methodology allows the state-of-the-art multi-way fuzzy decision tree (FMDT) induction algorithm to maintain classification accuracy with up to 6 million fewer leaves.
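The two steps described above can be sketched on a single attribute as follows. This is a simplified, single-machine illustration: it approximates the cumulative distribution function with the full empirical CDF of the training values rather than with q-quantiles, and the function names `empirical_cdf` and `triangular_memberships` are hypothetical.

```python
from bisect import bisect_right


def empirical_cdf(train_values):
    """Step 1: build an approximate CDF from training data.

    Applying the returned function to a value maps it into [0, 1], which is
    the probability integral transform under the empirical distribution.
    """
    sorted_vals = sorted(train_values)
    n = len(sorted_vals)
    if n == 0:
        raise ValueError("train_values must not be empty")

    def cdf(x):
        # Fraction of training values that are <= x.
        return bisect_right(sorted_vals, x) / n

    return cdf


def triangular_memberships(u, num_sets):
    """Step 2: membership of u in [0, 1] to `num_sets` evenly spaced
    triangular fuzzy sets. Adjacent triangles cross at 0.5, so the degrees
    sum to 1 for every u in [0, 1] (a strong, Ruspini-style partition)."""
    if num_sets < 2:
        raise ValueError("need at least two fuzzy sets")
    step = 1.0 / (num_sets - 1)  # distance between neighbouring peaks
    return [max(0.0, 1.0 - abs(u - k * step) / step) for k in range(num_sets)]


if __name__ == "__main__":
    cdf = empirical_cdf([1.0, 2.0, 3.0, 4.0])
    u = cdf(2.0)  # transformed value in [0, 1]
    print(u, triangular_memberships(u, 3))
```

Because the partition is built in the transformed space, the fuzzy sets are evenly spaced there but follow the data density in the original space, which is the effect the abstract describes.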


Machine Learning Series Day 3 (Naive Bayes) - Becoming Human: Artificial Intelligence Magazine

#artificialintelligence

Intuitively, the idea behind Naive Bayes is how you probably approach life. As in all my articles, I believe that a simple and intuitive understanding of a model should come first, before diving into the mathematics and practical jargon. Let's say you're responsible for Thanksgiving dinner. You have cooked Thanksgiving dinner for the last ten years. Within those ten years, you have prepared three desserts: pumpkin pie, chocolate cheesecake, and white macadamia cookies.


Cross validation in sparse linear regression with piecewise continuous nonconvex penalties and its acceleration

arXiv.org Machine Learning

We investigate the signal reconstruction performance of sparse linear regression in the presence of noise when piecewise continuous nonconvex penalties are used. Among such penalties, we focus on the smoothly clipped absolute deviation (SCAD) penalty. The contributions of this study are three-fold. First, we present a theoretical analysis of a typical reconstruction performance, using the replica method, under the assumption that each component of the design matrix is given as an independent and identically distributed (i.i.d.) Gaussian variable. This clarifies the superiority of the SCAD estimator compared with $\ell_1$ in a wide parameter range, although the nonconvex nature of the penalty tends to lead to solution multiplicity in certain regions. This multiplicity is shown to be connected to replica symmetry breaking in the spin-glass theory, and associated phase diagrams are given. We also show that the global minimum of the mean square error between the estimator and the true signal is located in the replica symmetric phase. Second, we develop an approximate formula that efficiently computes the cross-validation error without actually conducting the cross-validation, and which is also applicable to non-i.i.d. design matrices. It is shown that this formula is only applicable to the unique solution region and tends to be unstable in the multiple solution region. We implement instability detection procedures, which allow the approximate formula to stand alone and consequently enable us to draw phase diagrams for any specific dataset. Third, we propose an annealing procedure, called nonconvexity annealing, to obtain the solution path efficiently. Numerical simulations on simulated datasets are conducted to verify the consistency of the theoretical results and the efficiency of the approximate formula and nonconvexity annealing.
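For readers unfamiliar with the SCAD penalty, the sketch below evaluates its standard piecewise definition (as introduced by Fan and Li) for a single coefficient. The abstract does not state which parameterization the paper uses, so the default `a=3.7` and the function name `scad_penalty` are assumptions made for illustration.

```python
def scad_penalty(t, lam, a=3.7):
    """SCAD penalty for one coefficient t, with lam > 0 and a > 2.

    Three pieces:
      |t| <= lam          : lam * |t|                      (same as l1)
      lam < |t| <= a*lam  : (2*a*lam*|t| - t^2 - lam^2) / (2*(a - 1))
      |t| > a*lam         : lam^2 * (a + 1) / 2            (constant)
    """
    if lam <= 0 or a <= 2:
        raise ValueError("require lam > 0 and a > 2")
    x = abs(t)
    if x <= lam:
        return lam * x
    if x <= a * lam:
        return (2.0 * a * lam * x - x * x - lam * lam) / (2.0 * (a - 1.0))
    return lam * lam * (a + 1.0) / 2.0


if __name__ == "__main__":
    for value in (0.5, 2.0, 10.0):
        print(value, scad_penalty(value, lam=1.0))
```

Small coefficients are penalized exactly as under $\ell_1$, while large coefficients incur a constant penalty and are therefore not shrunk further; this flattening is what makes the penalty nonconvex.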


AI, Live Video And Your Smartphone Camera

#artificialintelligence

Badri is the Senior Vice President, Technology at Vonage - Video Engineering. As I speak with business leaders from around the world, I'm continually surprised by two important realities that seem to go unnoticed and that are poised to transform the way companies engage with their customers. First, while artificial intelligence (AI) remains a buzzword, many people are still unaware of how advanced algorithms have become. We're not talking about a collaborative filtering algorithm that predicts which Netflix shows you'll want to watch next. Today's algorithms are able to mimic human decision-making on tasks as complex as composing music and predicting what topics are of interest to your Congressional representatives.


Machine Learning Model for Early Sepsis Risk Stratification - Infectious Disease Advisor

#artificialintelligence

A new sepsis screening tool developed using machine learning was timelier and more discriminating than several benchmark screening tools, according to data published in the Annals of Emergency Medicine. The new tool, the Risk of Sepsis (RoS) score, was compared with benchmark sepsis-screening tools such as the systemic inflammatory response syndrome, sequential organ failure assessment, quick sequential organ failure assessment, modified early warning score, and national early warning score. Investigators used retrospective electronic health record data from adult patients from 49 urban community hospital emergency departments over a 22-month period to derive and test the model. A total of 2,759,529 records were obtained using the Rhee et al. standard for clinical surveillance criteria as the definition of sepsis and the primary target for developing the model. The selection process consisted of 3 stages: (1) review of existing models for sepsis screening, (2) consultation with local subject matter experts, and (3) a supervised machine learning method called gradient boosting.


Saec: Similarity-Aware Embedding Compression in Recommendation Systems

arXiv.org Machine Learning

Production recommendation systems rely on embedding methods to represent various features. An impeding challenge in practice is that the large embedding matrix incurs a substantial memory footprint in serving as the number of features grows over time. We propose a similarity-aware embedding matrix compression method called Saec to address this challenge. Saec clusters similar features within a field to reduce the embedding matrix size. Saec also adopts a fast clustering optimization based on feature frequency to drastically improve clustering time. We implement and evaluate Saec on Numerous, the production distributed machine learning system in Tencent, with 10 days' worth of feature data from the QQ mobile browser. Testbed experiments show that Saec reduces the number of embedding vectors by two orders of magnitude, compresses the embedding size by ~27x, and delivers the same AUC and log loss performance.


Effect Inference from Two-Group Data with Sampling Bias

arXiv.org Machine Learning

In many applications, different populations are compared using data that are sampled in a biased manner. Under sampling biases, standard methods that estimate the difference between the population means yield unreliable inferences. Here we develop an inference method that is resilient to sampling biases and is able to control the false positive errors under moderate bias levels in contrast to the standard approach. We demonstrate the method using synthetic and real biomarker data.


Continual Prediction from EHR Data for Inpatient Acute Kidney Injury

arXiv.org Machine Learning

Acute kidney injury (AKI) commonly occurs in hospitalized patients and can lead to serious medical complications. In order to optimally predict AKI before it develops at any time during a hospital stay, we present a novel framework in which AKI is continually predicted automatically from EHR data over the entire hospital stay instead of at only one particular time. The continual model predicts AKI every time a patient's AKI-relevant variable changes in the EHR. Thus the model is not only independent of a particular time for making predictions, but it can also leverage the latest values of all the AKI-relevant patient variables for making predictions. Using data from 44,691 hospital stays of duration longer than 24 hours, we evaluated our continual prediction model and compared it with the traditional one-time prediction models. Excluding hospital stays in which AKI occurred within 24 hours from admission, the one-time prediction model predicting at 24 hours from admission obtained an area under the ROC curve (AUC) of 0.653, while the continual prediction model obtained an AUC of 0.724. The one-time prediction model that predicts at 24 hours obviously cannot predict AKI incidences that occur within 24 hours of admission, which, when included in the evaluation, reduced its AUC to 0.57. In comparison, the continual prediction model had an AUC of 0.709. The continual prediction model also did better than all other one-time prediction models predicting at other fixed times. By being able to take into account the latest values of AKI-relevant patient variables and by not being limited to a particular time of prediction, the continual prediction model outperformed one-time prediction models in predicting AKI.
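Since the comparison above rests entirely on AUC, a minimal sketch of how that number can be computed may help. It uses the pairwise (Mann-Whitney) interpretation: AUC is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `roc_auc` and the toy data are illustrative and do not come from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: iterable of 0/1 outcomes (1 = event occurred).
    scores: predicted risk for each case, higher meaning more likely.
    Runs in O(P * N) time, which is fine for small illustrative inputs.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0  # positive ranked above negative
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported move from 0.653 to 0.724 means the continual model orders patients by risk correctly more often.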