Disentangling Voice and Content with Self-Supervision for Speaker Recognition (Appendix), Kong Aik Lee

Feb-11-2025, 06:28:33 GMT–Neural Information Processing Systems

In this section, we introduce a simplified method for implementing the proposed Gaussian inference. Similar to [9], we assume that the covariance (and precision) matrices are diagonal, and we choose to estimate the log-precision directly, which turns out to be more convenient for the following derivation. Because the gain factor A is a diagonal matrix and z and ϕ are vectors, the expensive matrix multiplications and numerically problematic matrix inversions reduce to element-wise multiplications between diagonal elements and vectors. This is the same as the point-wise multiplication already used in neural networks, and is therefore easy to implement with existing toolkits. The same method can also be applied to layer 1 and layer 3 of the proposed RecXi.
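The simplification can be illustrated with a minimal sketch in plain Python. It is not the paper's actual update rule: the function names are hypothetical, and the fusion step is the generic precision-weighted product of two diagonal Gaussians, assumed here only to show how a diagonal gain and a log-precision parameterisation avoid dense matrix products and inversions.

```python
import math

def diag_matvec_full(diag, vec):
    """Reference version: build the dense diagonal matrix, then do a full mat-vec product."""
    n = len(diag)
    full = [[diag[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    return [sum(full[i][j] * vec[j] for j in range(n)) for i in range(n)]

def diag_matvec_elementwise(diag, vec):
    """Same result using only the diagonal entries: one multiplication per element."""
    return [d * v for d, v in zip(diag, vec)]

def fuse_diagonal_gaussians(mean_a, log_prec_a, mean_b, log_prec_b):
    """Product of two diagonal Gaussians parameterised by log-precision.

    Assumption: a generic precision-weighted fusion, used for illustration only.
    No matrix inversion is needed; each dimension is handled independently.
    """
    fused_mean, fused_log_prec = [], []
    for ma, la, mb, lb in zip(mean_a, log_prec_a, mean_b, log_prec_b):
        prec_a, prec_b = math.exp(la), math.exp(lb)  # exp keeps precisions positive
        prec = prec_a + prec_b                       # precisions add under the product
        gain = prec_b / prec                         # one diagonal gain entry, in (0, 1)
        fused_mean.append(ma + gain * (mb - ma))     # element-wise mean update
        fused_log_prec.append(math.log(prec))
    return fused_mean, fused_log_prec

if __name__ == "__main__":
    a = [2.0, 0.5, -1.0]   # diagonal of a hypothetical gain matrix
    z = [1.0, 4.0, 3.0]    # a hypothetical vector
    print(diag_matvec_full(a, z))
    print(diag_matvec_elementwise(a, z))
    print(fuse_diagonal_gaussians([0.0], [0.0], [2.0], [0.0]))
```

Estimating the log-precision rather than the precision means the network output is unconstrained, while the exponential guarantees a positive precision in every dimension.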

artificial intelligence, machine learning, pattern recognition

Country:
- Asia > China (0.29)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning
    - Neural Networks (0.89)
    - Pattern Recognition > Speech Recognition (0.42)
  - Speech (1.00)