9 Completely Free Statistics Courses for Data Science


This is a completely free statistics course. In it, you will learn how to estimate the parameters of a population from sample statistics, and you will cover hypothesis testing and confidence intervals, t-tests and ANOVA, correlation and regression, and the chi-squared test. The course is taught by industry professionals, and you will learn by working through exercises.

Utilizing variational autoencoders in the Bayesian inverse problem of photoacoustic tomography


Photoacoustic tomography (PAT) is a hybrid biomedical imaging modality based on the photoacoustic effect [6, 44, 32]. In PAT, the imaged target is illuminated with a short pulse of light. Absorption of light creates localized areas of thermal expansion, resulting in localized pressure increases within the imaged target. This pressure distribution, called the initial pressure, relaxes as broadband ultrasound waves that are measured on the boundary of the imaged target. In the inverse problem of PAT, the initial pressure distribution is estimated from a set of measured ultrasound data.

The Application of Machine Learning Techniques for Predicting Match Results in Team Sport: A Review

Journal of Artificial Intelligence Research

Predicting the results of matches in sport is a challenging and interesting task. In this paper, we review a selection of studies from 1996 to 2019 that used machine learning for predicting match results in team sport. Considering both invasion sports and striking/fielding sports, we discuss commonly applied machine learning algorithms, as well as common approaches related to data and evaluation. Our study considers accuracies that have been achieved across different sports, and explores whether evidence exists to support the notion that outcomes of some sports may be inherently more difficult to predict. We also uncover common themes of future research directions and propose recommendations for future researchers. Although there remains a lack of benchmark datasets (apart from in soccer), and the differences between sports, datasets and features make between-study comparisons difficult, as we discuss, it is possible to evaluate accuracy performance in other ways. Artificial Neural Networks were commonly applied in early studies; however, our findings suggest that a range of models should instead be compared. Selecting and engineering an appropriate feature set appears to be more important than having a large number of instances. For feature selection, we see potential for greater inter-disciplinary collaboration between sport performance analysis, a sub-discipline of sport science, and machine learning.

Bayesian Estimation of Nelson-Siegel model using rjags R package


To leave a comment for the author, please follow the link and comment on their blog: K & L Fintech Modeling.

Mathematics for Deep Learning (Part 7)


So far, we have talked about MLP, CNN, and RNN architectures. These are discriminative models, that is, models that can make predictions. Discriminative models essentially learn to estimate a conditional probability distribution p(y | x); that is, given a value x, they try to predict the outcome y based on what they have learned about that conditional distribution. Generative models are neural network architectures that learn the probability distribution of the data and learn how to generate data that seems to come from that distribution. Creating synthetic data is one use of generative models, but it is not the only one.
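To make the idea of "learning a distribution and generating from it" concrete, here is a minimal sketch. It is not a neural network: it assumes the simplest possible generative model, a one-dimensional Gaussian, and the data values and function names are made up for illustration.

```python
import random
import statistics

def fit_gaussian(data):
    """Estimate the mean and standard deviation of a 1-D Gaussian from data."""
    return statistics.mean(data), statistics.stdev(data)

def sample(mu, sigma, n, seed=0):
    """Draw n synthetic points from the fitted Gaussian (seeded for repeatability)."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) for _ in range(n)]

# Hypothetical observations, invented for this example.
observed = [4.8, 5.1, 5.0, 4.9, 5.2]

mu, sigma = fit_gaussian(observed)    # "learn" the data distribution
synthetic = sample(mu, sigma, 1000)   # generate new data that resembles it
```

A deep generative model replaces the two Gaussian parameters with a neural network, but the workflow is the same: fit a distribution to the data, then sample from it.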

How is Maximum Likelihood Estimation used in machine learning?


Maximum Likelihood Estimation (MLE) is a probability-based approach to determining the values of a model's parameters. Parameters can be thought of as the blueprint of the model, because the algorithm's behavior depends on them. MLE is a widely used technique in machine learning, time series, panel data and discrete data. The aim of MLE is to find the parameter values that maximize the likelihood of the observed data. Following are the topics to be covered.
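As a minimal sketch of the idea, assume the data are independent 0/1 outcomes (a Bernoulli model) with unknown success probability p. The observations and function names below are made up for illustration. The log-likelihood is maximized over a grid of candidate values and compared with the known closed-form answer, the sample mean.

```python
import math

def bernoulli_log_likelihood(p, data):
    """Log-likelihood of i.i.d. 0/1 observations under success probability p."""
    return sum(math.log(p) if x == 1 else math.log(1.0 - p) for x in data)

def mle_grid(data, steps=999):
    """Return the p on an evenly spaced grid in (0, 1) with the highest log-likelihood."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    return max(grid, key=lambda p: bernoulli_log_likelihood(p, data))

# Hypothetical observations: 7 successes in 10 trials.
data = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]

p_hat_grid = mle_grid(data)            # numerical maximizer
p_hat_closed = sum(data) / len(data)   # closed-form Bernoulli MLE (sample mean)
```

Both estimates agree here. In models without a closed-form solution, the grid search is typically replaced by a numerical optimizer applied to the same log-likelihood.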

One Minute Overview of Bayesian Belief Networks


Main Idea: A Bayesian Belief Network (BBN) represents a set of variables and their conditional dependencies via a Directed Acyclic Graph (DAG). The DAG lets us state the structure of, and the relationships between, the variables explicitly. Everyday use cases: BBNs have many use cases, from helping to diagnose diseases to making real-time predictions of a race outcome or advising marketing decisions.
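The following sketch shows how a DAG turns into a computation. It assumes a hypothetical three-node network (Rain -> WetGrass <- Sprinkler) with made-up probabilities. The joint distribution factorizes along the graph as P(R) * P(S) * P(W | R, S), and a query such as P(Rain | WetGrass) is answered by summing the joint over the unobserved variable.

```python
from itertools import product

# Hypothetical probability tables, invented for this example.
P_RAIN = {True: 0.2, False: 0.8}
P_SPRINKLER = {True: 0.1, False: 0.9}
# P(WetGrass = True | Rain, Sprinkler), keyed by (rain, sprinkler).
P_WET_GIVEN = {
    (True, True): 0.99,
    (True, False): 0.9,
    (False, True): 0.8,
    (False, False): 0.0,
}

def joint(rain, sprinkler, wet):
    """Joint probability from the DAG factorization P(R) * P(S) * P(W | R, S)."""
    p_wet = P_WET_GIVEN[(rain, sprinkler)]
    return P_RAIN[rain] * P_SPRINKLER[sprinkler] * (p_wet if wet else 1.0 - p_wet)

def posterior_rain_given_wet():
    """P(Rain = True | WetGrass = True), by enumerating the hidden Sprinkler variable."""
    numerator = sum(joint(True, s, True) for s in (True, False))
    evidence = sum(joint(r, s, True) for r, s in product((True, False), repeat=2))
    return numerator / evidence

p_rain_given_wet = posterior_rain_given_wet()
```

Enumeration like this only scales to small networks; it is used here because it makes the role of the graph structure explicit.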

Survey and Evaluation of Causal Discovery Methods for Time Series

Journal of Artificial Intelligence Research

We introduce in this survey the major concepts, models, and algorithms proposed so far to infer causal relations from observational time series, a task usually referred to as causal discovery in time series. To do so, after a description of the underlying concepts and modelling assumptions, we present different methods according to the family of approaches they belong to: Granger causality, constraint-based approaches, noise-based approaches, score-based approaches, logic-based approaches, topology-based approaches, and difference-based approaches. We then evaluate several representative methods to illustrate the behaviour of different families of approaches. This illustration is conducted on both artificial and real datasets, with different characteristics. The main conclusion one can draw from this survey is that causal discovery in time series is an active research field in which new methods (in every family of approaches) are regularly proposed, and that no family or method stands out in all situations. Indeed, they all rely on assumptions that may or may not be appropriate for a particular dataset.

Bayesian Statistics Overview and your first Bayesian Linear Regression Model


Frequentist and Bayesian are two different approaches to statistics. The frequentist approach is the more classical one and, as the name suggests, relies on the long-run frequency of events (data points) to calculate the variable of interest. The Bayesian approach, on the other hand, can also work without a large number of events (in fact, it can work even with one data point!). The cardinal difference between the two is that the frequentist approach gives you a point estimate, whereas the Bayesian approach gives you a distribution. Having a point estimate means: "we are certain that this is the output for this variable of interest". Having a distribution, by contrast, can be interpreted as: "we have some belief that the mean of the distribution is a good estimate for this variable of interest, but there is uncertainty too, in the form of the standard deviation".
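The contrast between a point estimate and a distribution can be sketched with a simpler model than linear regression. The example below assumes coin-flip data with a Beta(1, 1) prior (a Beta-Binomial model, chosen here because its posterior has a closed form). The counts and function names are made up for illustration.

```python
import math

def beta_posterior(successes, failures, prior_a=1.0, prior_b=1.0):
    """Posterior Beta(a, b) parameters after observing the given counts."""
    return prior_a + successes, prior_b + failures

def beta_mean_sd(a, b):
    """Mean and standard deviation of a Beta(a, b) distribution."""
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(variance)

# Hypothetical data: 7 successes and 3 failures.
successes, failures = 7, 3

# Frequentist-style point estimate: the observed success rate.
point_estimate = successes / (successes + failures)

# Bayesian result: a full posterior distribution, summarized by mean and spread.
a, b = beta_posterior(successes, failures)
post_mean, post_sd = beta_mean_sd(a, b)

# The update is still defined with a single data point (one success).
a1, b1 = beta_posterior(1, 0)
one_point_mean, one_point_sd = beta_mean_sd(a1, b1)
```

With one observation the posterior is wide, and it narrows as more data arrive, which is the uncertainty that a single point estimate does not express.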