Sometimes, you see a diagram and it gives you an'aha ha' moment I saw it on Frederick kratzert's blog Using the input variables x and y, The forwardpass (left half of the figure) calculates output z as a function of x and y i.e. f(x,y) The right side of the figures shows the backwardpass. Receiving dL/dz (the derivative of the total loss with respect to the output z), we can calculate the individual gradients of x and y on the loss function by applying the chain rule, as shown in the figure. This post is a part of my forthcoming book on Mathematical foundations of Data Science. The goal of the neural network is to minimise the loss function for the whole network of neurons. Hence, the problem of solving equations represented by the neural network also becomes a problem of minimising the loss function for the entire network.

A little more than a year ago, a design and technology studio in Boston called Nervous System made an entirely new kind of dress. The garment, a flowing black number with lacy cutouts, was assembled from hundreds of plastic triangles that moved like fabric. Making plastic shift and sway like cotton was crazy enough, but the real mind bender was that the dress emerged from a 3-D printer as a ready-to-wear piece you simply slipped on and buttoned up. The studio, run by Jessica Rosenkrantz and Jesse Louis-Rosenberg, has since made seven more dresses that subtly vary in shape and density of the pattern. "But they all kind of looked a little bit the same," Rosenkrantz says.

Using high-level frameworks like Keras, TensorFlow or PyTorch allows us to build very complex models quickly. However, it is worth taking the time to look inside and understand underlying concepts. Not so long ago I published an article, explaining -- in a simple way -- how neural nets work. However, it was highly theoretical post, dedicated primarily to math, which is the source of NN superpower. From the beginning I was planning to follow-up this topic in a more practical way.

Graph based semi-supervised learning (GSSL) has intuitive representation and can be improved by exploiting the matrix calculation. However, it has to perform iterative optimization to achieve a preset objective, which usually leads to low efficiency. Another inconvenience lying in GSSL is that when new data come, the graph construction and the optimization have to be conducted all over again. We propose a sound assumption, arguing that: the neighboring data points are not in peer-to-peer relation, but in a partial-ordered relation induced by the local density and distance between the data; and the label of a center can be regarded as the contribution of its followers. Starting from the assumption, we develop a highly efficient non-iterative label propagation algorithm based on a novel data structure named as optimal leading forest (LaPOLeaF). The major weaknesses of the traditional GSSL are addressed by this study. We further scale LaPOLeaF to accommodate big data by utilizing block distance matrix technique, parallel computing, and Locality-Sensitive Hashing (LSH). Experiments on large datasets have shown the promising results of the proposed methods.

The distinctive driving force of constraint programming to solve combinatorial problems has been a privileged access to problem structure through the high-level models it uses. From that exposed structure in the form of so-called global constraints, powerful inference algorithms have shared information between constraints by propagating it through shared variables' domains, traditionally by removing unsupported values. Beliefs about individual variable-value assignments are exchanged between contraints and iteratively adjusted. It generalizes standard support propagation and aims to converge to the true marginal distributions of the solutions over individual variables. The necessary architectural changes to a constraint programming solver are described and an empirical study of the proposal is conducted on its implementation.