Checklist

Neural Information Processing Systems 

The model outputs a normal distribution for the observations, conditional on the hidden state $h(t)$. Since only some features are observed at a time, we mask out the missing values when calculating $L_{\mathrm{pre}}$. We denote our predicted distribution by $p_{\mathrm{pre}}$, and the predicted distribution after updating the state by $p_{\mathrm{post}}$.
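The masking step can be illustrated with a minimal sketch. It assumes that the loss is the negative log-likelihood of a Gaussian with independent features, and that the model outputs a mean and a log-variance per feature. Neither the parameterization nor the names `masked_gaussian_nll`, `logvar`, and `mask` come from the text above; they are hypothetical and chosen only for illustration.

```python
import math


def masked_gaussian_nll(y, mean, logvar, mask):
    """Sum of Gaussian negative log-likelihoods over observed features only.

    y, mean, logvar and mask are equal-length sequences. mask[i] is 1 if
    feature i was observed at this time and 0 otherwise. The log-variance
    parameterization is an assumption made for this sketch.
    """
    total = 0.0
    for y_i, mu_i, lv_i, m_i in zip(y, mean, logvar, mask):
        if not m_i:
            continue  # missing feature: contributes nothing to the loss
        var_i = math.exp(lv_i)
        total += 0.5 * (math.log(2.0 * math.pi) + lv_i + (y_i - mu_i) ** 2 / var_i)
    return total


# Feature 1 is missing, so its placeholder value (0.0) is ignored.
y = [1.0, 0.0, -0.5]
mean = [0.8, 3.0, -0.5]
logvar = [0.0, 0.0, 0.0]
mask = [1, 0, 1]
loss_pre = masked_gaussian_nll(y, mean, logvar, mask)
```

Because masked entries are skipped entirely, whatever placeholder is stored for a missing value has no effect on the loss or, in a differentiable implementation, on its gradient.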