When developing predictive models and algorithms, whether linear regression or ARIMA models, it is important to quantify how well the model fits future observations. One of the simplest ways to measure how correct a model is uses the error between the predicted value and the actual value. From there, several metrics take this difference and extract further meaning from it. Quantifying the accuracy of an algorithm is an important step in justifying its use in a product. We will use the accuracy function from the R programming language as our basis.
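As a rough illustration of how several metrics can be derived from the same prediction errors, here is a minimal Python sketch. The function name `forecast_accuracy`, the choice of metrics (ME, RMSE, MAE, MAPE), and the sign convention (actual minus predicted) are assumptions made for this example; it is not a port of the R function.

```python
import math


def forecast_accuracy(actual, predicted):
    """Summarize prediction errors with a few common metrics.

    Hypothetical helper for illustration; errors are defined here as
    actual minus predicted, which is an assumption of this sketch.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    return {
        # Mean error: positive and negative errors can cancel out.
        "ME": sum(errors) / n,
        # Root mean squared error: penalizes large errors more heavily.
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        # Mean absolute error: average magnitude of the errors.
        "MAE": sum(abs(e) for e in errors) / n,
        # Mean absolute percentage error: undefined if any actual value is 0.
        "MAPE": 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
    }


metrics = forecast_accuracy([3, 5, 7], [2, 5, 9])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

All four numbers come from the same list of differences; they differ only in how the differences are aggregated.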

Machine learning is a pioneering subset of artificial intelligence in which machines learn on their own from the available dataset. To optimize any machine learning model, a suitable loss function must be selected. A loss function characterizes how well the model performs over the training dataset. It expresses the discrepancy between the predictions of the model being trained and the actual problem instances. If the predicted results deviate greatly from the actual results, the loss function takes a very high value.
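To make the idea concrete, the sketch below evaluates a mean squared error loss for a one-parameter model `y = w * x` on a tiny training set. The model, the data, and the name `mse_loss` are all assumptions chosen for illustration; mean squared error is only one of many possible loss functions.

```python
def mse_loss(w, xs, ys):
    """Mean squared error of the hypothetical model y = w * x on (xs, ys)."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy training set generated by y = 2 * x (an assumption for this example).
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]

loss_exact = mse_loss(2.0, xs, ys)  # predictions match the targets
loss_near = mse_loss(2.5, xs, ys)   # predictions are slightly off
loss_far = mse_loss(5.0, xs, ys)    # predictions are far off

print(loss_exact, loss_near, loss_far)
```

The loss grows as the parameter moves away from the value that generated the data, which is the behavior an optimizer relies on when it searches for the best parameter.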

After you make predictions, you need to know whether they are any good. There are standard measures that summarize how good a set of predictions is. Knowing how good a set of predictions is allows you to estimate how good a given machine learning model of your problem is. In this tutorial, you will discover how to implement four standard prediction evaluation metrics from scratch in Python, including how to implement and interpret a confusion matrix and how to implement mean absolute error for regression.
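A confusion matrix can be built from scratch in a few lines. The sketch below is one possible version, not the tutorial's own code: the function names, the row and column layout (rows are actual classes, columns are predicted classes), and the sample labels are assumptions.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Return (labels, matrix) with rows = actual class, columns = predicted class."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    labels = sorted(set(actual) | set(predicted))
    counts = Counter(zip(actual, predicted))
    matrix = [[counts[(a, p)] for p in labels] for a in labels]
    return labels, matrix


def accuracy_from_matrix(matrix):
    """Fraction of predictions on the diagonal, i.e. predicted == actual."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


actual = [0, 0, 0, 1, 1, 1, 1]
predicted = [0, 1, 0, 1, 1, 0, 1]

labels, matrix = confusion_matrix(actual, predicted)
accuracy = accuracy_from_matrix(matrix)

for label, row in zip(labels, matrix):
    print(label, row)
print("accuracy:", accuracy)
```

Reading the matrix: each diagonal cell counts correct predictions for one class, and each off-diagonal cell shows which class was mistaken for which.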

Zhang, Yuchen, Wainwright, Martin J., Duchi, John C.

We study two communication-efficient algorithms for distributed statistical optimization on large-scale data. The first algorithm is an averaging method that distributes the $N$ data samples evenly to $m$ machines, performs separate minimization on each subset, and then averages the estimates. We provide a sharp analysis of this average mixture algorithm, showing that under a reasonable set of conditions, the combined parameter achieves mean-squared error that decays as $\mathcal{O}(N^{-1} + (N/m)^{-2})$. Whenever $m \le \sqrt{N}$, this guarantee matches the best possible rate achievable by a centralized algorithm having access to all $N$ samples. The second algorithm is a novel method, based on an appropriate form of the bootstrap.
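The averaging step of the first algorithm can be sketched on a toy problem. The example below is an illustration under stated assumptions, not the authors' implementation: it uses a one-dimensional least-squares fit through the origin as the local minimization, simulates the $m$ machines with list slices, and generates synthetic data with a known slope. All names (`fit_slope`, `average_mixture`, `TRUE_SLOPE`) are hypothetical.

```python
import random


def fit_slope(xs, ys):
    """Local minimizer: least-squares slope through the origin."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def average_mixture(xs, ys, m):
    """Split the data evenly into m shards, fit each shard, average the fits.

    Samples left over when len(xs) is not divisible by m are dropped.
    """
    n = len(xs) // m
    estimates = [
        fit_slope(xs[i * n:(i + 1) * n], ys[i * n:(i + 1) * n])
        for i in range(m)
    ]
    return sum(estimates) / m


# Synthetic data (an assumption): y = TRUE_SLOPE * x + Gaussian noise.
rng = random.Random(0)
TRUE_SLOPE = 3.0
N, M = 10_000, 10  # M is well below sqrt(N) = 100
xs = [rng.uniform(-1, 1) for _ in range(N)]
ys = [TRUE_SLOPE * x + rng.gauss(0, 1) for x in xs]

avg_estimate = average_mixture(xs, ys, M)   # distributed, averaged estimate
central_estimate = fit_slope(xs, ys)        # centralized fit on all N samples

print("averaged:", avg_estimate)
print("centralized:", central_estimate)
```

In this setting each machine only needs to send a single number back, which is what makes the scheme communication-efficient; the bootstrap-based second algorithm is not covered by this sketch.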