Almost every pose estimation algorithm suffers from jitter during inference. Jitter is the high-frequency oscillation of a predicted keypoint around a point, which makes the keypoint's trajectory a noisy signal. One cause is that inference is performed independently on each frame of the input video, and consecutive frames differ in occlusion and can contain a range of complex poses. Another cause is inconsistency in the training annotations, which results in uncertainty in the estimated pose.
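
To make the idea of jitter concrete, here is a minimal sketch that simulates one keypoint coordinate over a video and measures how much it oscillates from frame to frame. Everything in it is an illustrative assumption rather than part of any particular pose estimator: the function names `simulate_keypoint` and `jitter_score` are hypothetical, the "true" motion is a slow sine wave, frame-level inference is modeled as independent Gaussian noise added to each frame, and the mean absolute second difference is just one simple way to quantify high-frequency oscillation.

```python
import math
import random


def simulate_keypoint(num_frames, noise_std, seed=0):
    """Return (true, predicted) x-coordinates of one keypoint over a video.

    Assumption: the true motion is a slow sine wave, and each prediction
    adds independent Gaussian noise, mimicking per-frame inference.
    """
    rng = random.Random(seed)
    true = [
        100.0 + 20.0 * math.sin(2.0 * math.pi * t / num_frames)
        for t in range(num_frames)
    ]
    predicted = [x + rng.gauss(0.0, noise_std) for x in true]
    return true, predicted


def jitter_score(track):
    """Mean absolute second difference of a 1-D keypoint track.

    Smooth motion changes velocity slowly, so its second difference is
    small; high-frequency oscillation makes it large.
    """
    if len(track) < 3:
        return 0.0
    accel = [
        track[t + 1] - 2.0 * track[t] + track[t - 1]
        for t in range(1, len(track) - 1)
    ]
    return sum(abs(a) for a in accel) / len(accel)


true_track, noisy_track = simulate_keypoint(num_frames=200, noise_std=2.0)
print("jitter of true motion:     ", jitter_score(true_track))
print("jitter of per-frame output:", jitter_score(noisy_track))
```

Under these assumptions the per-frame predictions score far higher than the smooth underlying motion, even though each individual prediction is only a couple of pixels off. The same score could be computed on real model output by passing in the per-frame coordinates of a single keypoint.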