Variational Inference for Continuous Time Switching Dynamical Systems: Supplementary Material

Neural Information Processing Systems 

If it is clear from the context, we will mostly use the favorable uncluttered notation. Using Leibniz' theorem, we have …

A.2.1 Calculation of the Filtering Distribution

The filtering distribution is defined as α(y, z, t) := p(y, z, t | x …

Consider the case where there is no observation in the interval [t, t + h], h > 0. We compute α(y, z, t + h) = p(y, z, t + h | x …

A.2.2 Calculation of the Backward Distribution

The backward distribution is defined as β(y, z, t) := p(x …

We find the dynamics of the smoothing distribution by calculating its time derivative. Using the terms in Eq. (28) we have …

The second term of Eq. (31) does not depend on the …

Accordingly, both sides of Eq. (35) have …

… Eq. … Appendix A.3.4) and provide the gradient with respect to the dispersion …

A comprehensive overview of the ground-truth and learned parameters is given in Table 2.

Note that we utilize this procedure for all experiments.