EM-Net: Gaze Estimation with Expectation Maximization Algorithm
Cheng, Zhang; Wang, Yanxia; Xia, Guoyu
arXiv.org Artificial Intelligence
[...] has gradually improved, but existing methods often rely on large datasets or large models to improve performance, which leads to high demands on computational resources. To address this issue, this paper proposes EM-Net, a lightweight gaze estimation model based on deep learning and a traditional machine learning algorithm, the Expectation Maximization algorithm. First, the proposed Global Attention Mechanism (GAM) is added to extract features related to gaze estimation, improving the model's ability to capture global dependencies and thus its performance. Second, by learning hierarchical feature representations through the EM module, the model has strong generalization ability, which reduces the need for sample [...]

Appearance-based gaze estimation methods use full-face or eye images to infer gaze direction. They can be categorized into two types based on the input data: eye image-based [10, 11, 12] and full-face image-based [13, 14, 15, 16]. Eye image-based methods use either monocular or binocular images as input; since gaze direction is closely related to head pose, these methods usually need to concatenate the head pose information with the extracted eye features to compute the final gaze direction. Full-face image-based methods directly use face images as input, which contain more features related to gaze estimation and can effectively improve model performance. Therefore, this approach has gradually become a research hotspot.
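The abstract names the Expectation Maximization algorithm, but this excerpt does not describe how the EM module in EM-Net is built. As background only, the sketch below shows textbook EM on a toy problem: fitting a two-component Gaussian mixture to 1-D data. It is not the paper's EM module. The function names `normal_pdf` and `em_gmm_1d`, the initialisation, and the synthetic data are all illustrative assumptions.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, n_iter=50, var_floor=1e-6):
    """Fit a two-component 1-D Gaussian mixture with textbook EM.

    Returns (weights, means, variances, log_likelihoods), where
    log_likelihoods[i] is computed during the E-step of iteration i.
    """
    # Assumed initialisation: means at the data extremes, shared variance.
    mean_all = sum(data) / len(data)
    var_all = sum((x - mean_all) ** 2 for x in data) / len(data)
    weights = [0.5, 0.5]
    means = [min(data), max(data)]
    variances = [max(var_all, var_floor)] * 2
    log_likelihoods = []

    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        log_lik = 0.0
        for x in data:
            joint = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(joint)
            log_lik += math.log(total)
            resp.append([j / total for j in joint])
        log_likelihoods.append(log_lik)

        # M-step: re-estimate parameters from the soft assignments.
        for k in range(2):
            n_k = sum(r[k] for r in resp)
            weights[k] = n_k / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / n_k
            variances[k] = max(
                sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / n_k,
                var_floor,
            )

    return weights, means, variances, log_likelihoods


# Synthetic demo data (an assumption for illustration): two well-separated clusters.
random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(200)]
data += [random.gauss(10.0, 1.0) for _ in range(200)]
weights, means, variances, log_likelihoods = em_gmm_1d(data)
print("estimated means:", means)
```

The two steps alternate: the E-step computes soft assignments under the current parameters, and the M-step re-estimates the parameters from those assignments. Each iteration cannot decrease the data log-likelihood, which is why the recorded `log_likelihoods` are a convenient convergence check.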
Dec-10-2024