Reviews: Kalman Filter, Sensor Fusion, and Constrained Regression: Equivalences and Insights

Neural Information Processing Systems 

Rebuttal acknowledged; thank you for the additional clarifications. Indeed, given a flat prior on x_{t+1} (i.e., a Gaussian with "infinite" variance), we have two independent sources of information:
- the influence of the past (the prediction term), and
- the influence of the current measurement (the filtering term),
both of which have Gaussian likelihoods. The posterior density of x_{t+1} is therefore proportional to a product of three Gaussian-shaped terms. The two different ways in which these terms can be folded into each other (using standard Gaussian conjugacy rules) lead to Thm 1. I believe that the linear-algebraic formulation the authors use simply hides the fact that we are multiplying Gaussian PDFs in different orders.
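To make the point concrete, here is a minimal numerical sketch of that order-independence. It is not the authors' formulation; the numbers are illustrative, and the "flat" prior is approximated by a very large variance. The product of Gaussian densities is computed with the standard precision-weighted conjugacy rule, and the two folding orders yield the same posterior mean and variance.

```python
def multiply_gaussians(m1, v1, m2, v2):
    """Renormalized product of N(m1, v1) and N(m2, v2):
    the standard Gaussian conjugacy (precision-weighted) rule."""
    precision = 1.0 / v1 + 1.0 / v2
    v = 1.0 / precision
    m = v * (m1 / v1 + m2 / v2)
    return m, v

# Three Gaussian-shaped terms (illustrative values, not from the paper):
prior = (0.0, 1e12)       # flat prior: "infinite" variance
prediction = (2.0, 1.0)   # influence of the past (prediction term)
measurement = (5.0, 2.0)  # influence of the current measurement (filtering term)

# Folding order 1: (prior * prediction) * measurement
m, v = multiply_gaussians(*prior, *prediction)
m_a, v_a = multiply_gaussians(m, v, *measurement)

# Folding order 2: prior * (prediction * measurement)
m, v = multiply_gaussians(*prediction, *measurement)
m_b, v_b = multiply_gaussians(*prior, m, v)

# Both orders produce the same posterior, since multiplication
# of Gaussian PDFs is associative.
print(m_a, v_a)
print(m_b, v_b)
```

The same associativity is what underlies the two algebraic forms in Thm 1: whichever pair of terms is folded first, the resulting Gaussian is identical.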