In a Boltzmann machine, why isn't there a simple expression for the optimal edge weights in terms of correlations between variables?
If we learn the weights by gradient ascent on the log-likelihood, each step requires an expensive expectation estimate using MCMC (or some cheaper approximation). Conceptually, the edge weights represent the "interaction strength" between variables, i.e. $w_{ij}$ represents how much $x_i$ and $x_j$ "want" to be equal. Just from the form of the model we can see that when $w_{ij}$ is large and positive, $x_i$ and $x_j$ have a high probability of being equal, and when it's negative they have a higher probability of having opposite signs. What is the relationship between the empirical correlation between each pair $x_i$, $x_j$ and the optimal edge weight $w_{ij}$? It would make sense that variables that are highly positively correlated have large positive edge weights, and variables that are negatively correlated have negative edge weights.
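To make the question concrete, here is a minimal sketch on a toy model small enough to enumerate exactly, so no MCMC is needed. Everything in it is an assumption for illustration: variables take values in $\{-1, +1\}$, there are no bias terms, the model is $P(x) \propto \exp(\sum_{i<j} w_{ij} x_i x_j)$, and the names `model_correlations` and `fit_weights` are made up. The "empirical" correlations are taken to be the exact correlations of a known three-variable chain, i.e. the infinite-data limit.

```python
import itertools
import math


def model_correlations(w, n):
    """Exact E[x_i * x_j] under P(x) proportional to exp(sum_{i<j} w_ij x_i x_j).

    Assumes x_i in {-1, +1} and no bias terms. Enumerates all 2**n states,
    so it is only feasible for very small n.
    """
    pairs = list(itertools.combinations(range(n), 2))
    z = 0.0
    acc = {p: 0.0 for p in pairs}
    for x in itertools.product((-1, 1), repeat=n):
        weight = math.exp(sum(w[(i, j)] * x[i] * x[j] for i, j in pairs))
        z += weight
        for i, j in pairs:
            acc[(i, j)] += weight * x[i] * x[j]
    return {p: acc[p] / z for p in pairs}


def fit_weights(target_corr, n, lr=0.5, steps=2000):
    """Gradient ascent on the log-likelihood.

    The gradient with respect to w_ij is (target correlation) minus
    (model correlation); here the model term is computed exactly
    instead of being estimated by MCMC.
    """
    w = {p: 0.0 for p in target_corr}
    for _ in range(steps):
        model = model_correlations(w, n)
        for p in w:
            w[p] += lr * (target_corr[p] - model[p])
    return w


# Hypothetical ground truth: a chain 0 - 1 - 2 with no direct 0-2 edge.
true_w = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 0.0}
corr = model_correlations(true_w, 3)   # stands in for empirical correlations
fitted = fit_weights(corr, 3)

for p in sorted(corr):
    print(p, "correlation:", round(corr[p], 3), "fitted weight:", round(fitted[p], 3))
```

In this chain, $x_0$ and $x_2$ come out clearly positively correlated even though the weight between them is zero: the correlation is produced indirectly through $x_1$. That suggests each optimal $w_{ij}$ depends on the whole set of correlations rather than on the $(i, j)$ correlation alone, which is at least consistent with there being no simple per-edge formula.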
Mar-22-2020, 03:19:42 GMT