Pattern Recognition and Machine Learning (Bishop) - How is this log-evidence function maximized with respect to $\alpha$?


It is not obvious that the additional $\alpha$ dependence of $E (\textbf{m}_N)$ that you point out, the one that enters through $\textbf{m}_N$ itself, has a vanishing derivative, but it does. I too was puzzled when I saw no mention of it in the text, or in the posted solution to exercise 3.20, which asks you to derive the result; that solution is therefore rather incomplete. The same thing happens when maximizing the evidence with respect to $\beta$.
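
Here is a sketch of why that contribution drops out, assuming the notation of Bishop's Section 3.5 (the symbols $\boldsymbol{\Phi}$, $\textbf{t}$, $\textbf{A}$ and the function $E(\textbf{w})$ are taken from the book, not from the question). Write the regularized error as a function of a free weight vector $\textbf{w}$:

$$E(\textbf{w}) = \frac{\beta}{2}\,\lVert \textbf{t} - \boldsymbol{\Phi}\textbf{w} \rVert^2 + \frac{\alpha}{2}\,\textbf{w}^{\mathsf T}\textbf{w}.$$

The posterior mean $\textbf{m}_N = \beta\,\textbf{A}^{-1}\boldsymbol{\Phi}^{\mathsf T}\textbf{t}$, with $\textbf{A} = \alpha\textbf{I} + \beta\,\boldsymbol{\Phi}^{\mathsf T}\boldsymbol{\Phi}$, is exactly the minimizer of $E(\textbf{w})$, so

$$\nabla_{\textbf{w}} E(\textbf{w})\Big|_{\textbf{w}=\textbf{m}_N} = -\beta\,\boldsymbol{\Phi}^{\mathsf T}(\textbf{t} - \boldsymbol{\Phi}\textbf{m}_N) + \alpha\,\textbf{m}_N = \textbf{A}\,\textbf{m}_N - \beta\,\boldsymbol{\Phi}^{\mathsf T}\textbf{t} = \textbf{0}.$$

Applying the chain rule to the total derivative then gives

$$\frac{d}{d\alpha} E(\textbf{m}_N) = \frac{\partial E}{\partial \alpha}\Big|_{\textbf{w}=\textbf{m}_N} + \left(\nabla_{\textbf{w}} E\Big|_{\textbf{w}=\textbf{m}_N}\right)^{\mathsf T} \frac{d\textbf{m}_N}{d\alpha} = \frac{1}{2}\,\textbf{m}_N^{\mathsf T}\textbf{m}_N + 0,$$

which is the term that appears in the book. The same argument with $\beta$ in place of $\alpha$ gives $\frac{d}{d\beta} E(\textbf{m}_N) = \frac{1}{2}\,\lVert \textbf{t} - \boldsymbol{\Phi}\textbf{m}_N \rVert^2$, so the implicit dependence through $\textbf{m}_N$ can be ignored there as well.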
