Near-Optimal Smoothing of Structured Conditional Probability Matrices

Nov-21-2025, 15:02:23 GMT–Neural Information Processing Systems

Utilizing the structure of a probabilistic model can significantly increase its learning speed. Motivated by several recent applications, in particular bigram models in language processing, we consider learning low-rank conditional probability matrices under expected KL-risk. This choice makes smoothing, that is, the careful handling of low-probability elements, paramount. We derive an iterative algorithm that extends classical non-negative matrix factorization to naturally incorporate additive smoothing, and we prove that it converges to the stationary points of a penalized empirical risk. We then derive sample-complexity bounds for the global minimizer of the penalized risk and show that these bounds are within a small factor of the optimal sample complexity.
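To illustrate why KL-risk makes smoothing paramount, here is a minimal sketch of plain additive smoothing on a conditional probability matrix estimated from counts. It is not the paper's algorithm: it has no low-rank factorization step. The function names, the toy counts, the reference distributions, and the value beta = 0.5 are all assumptions chosen for illustration.

```python
import math

def additive_smoothing(counts, beta=0.5):
    """Add beta to every cell, then normalize each row to sum to 1.

    Assumes every row has a positive total when beta == 0.
    """
    smoothed = []
    for row in counts:
        total = sum(row) + beta * len(row)
        smoothed.append([(c + beta) / total for c in row])
    return smoothed

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; infinite if q is 0 where p > 0."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # 0 * log(0 / q) is taken as 0
        if qi == 0.0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total

# Hypothetical bigram counts: rows are contexts, columns are next symbols.
counts = [[3, 1, 0],
          [0, 2, 2]]
# Hypothetical true conditional distributions, one per row.
true_rows = [[0.6, 0.3, 0.1],
             [0.1, 0.45, 0.45]]

unsmoothed = additive_smoothing(counts, beta=0.0)  # plain empirical frequencies
smoothed = additive_smoothing(counts, beta=0.5)

kl_unsmoothed = [kl_divergence(p, q) for p, q in zip(true_rows, unsmoothed)]
kl_smoothed = [kl_divergence(p, q) for p, q in zip(true_rows, smoothed)]

print("KL without smoothing:", kl_unsmoothed)
print("KL with add-0.5 smoothing:", kl_smoothed)
```

In this toy case each row has an unseen symbol with nonzero true probability, so the unsmoothed estimate incurs infinite KL divergence, while the smoothed estimate stays finite.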

name change, near-optimal smoothing, structured conditional probability matrix

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning (0.84)
  - Representation & Reasoning > Uncertainty (0.66)