Unsupervised Spectral Learning of FSTs

Mar-13-2024, 22:21:48 GMT–Neural Information Processing Systems

Finite-State Transducers (FSTs) are a standard tool for modeling paired input-output sequences and are used in numerous applications, ranging from computational biology to natural language processing. Recently, Balle et al. [4] presented a spectral algorithm for learning FSTs from samples of aligned input-output sequences. In this paper we address the more realistic, yet more challenging, setting in which the alignments are unknown to the learning algorithm. We frame FST learning as finding a low-rank Hankel matrix satisfying constraints derived from observable statistics. Under this formulation, we provide identifiability results for FST distributions. Then, following previous work on rank minimization, we propose a regularized convex relaxation of this objective, which minimizes a nuclear norm penalty subject to linear constraints and can be solved efficiently.
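The low-rank Hankel structure that the abstract relies on can be illustrated with a small sketch. This is not the paper's algorithm: it uses a plain weighted automaton over single strings rather than an FST over unaligned input-output pairs, and the two-state automaton, its weights, and every name below (`ALPHA`, `BETA`, `OPS`, `evaluate`, `hankel`, `rank`) are hypothetical choices made only for illustration. It shows the standard fact that a Hankel matrix with entries H[p][s] = f(p + s) has rank at most the number of states of an automaton computing f.

```python
from fractions import Fraction
from itertools import product

# Hypothetical 2-state weighted automaton over the alphabet {a, b}.
# All weights are illustrative assumptions, not taken from the paper.
ALPHA = [Fraction(1), Fraction(0)]        # initial weights
BETA = [Fraction(1, 4), Fraction(1, 2)]   # final weights
OPS = {
    "a": [[Fraction(1, 4), Fraction(1, 4)], [Fraction(0), Fraction(1, 4)]],
    "b": [[Fraction(1, 4), Fraction(0)], [Fraction(1, 4), Fraction(0)]],
}


def evaluate(string):
    """Compute f(x) = alpha^T A_{x1} ... A_{xn} beta with exact rationals."""
    vec = ALPHA[:]
    for symbol in string:
        mat = OPS[symbol]
        vec = [sum(vec[i] * mat[i][j] for i in range(2)) for j in range(2)]
    return sum(v * b for v, b in zip(vec, BETA))


def strings_up_to(max_len, alphabet="ab"):
    """All strings over the alphabet with length 0..max_len."""
    return ["".join(t) for n in range(max_len + 1)
            for t in product(alphabet, repeat=n)]


def hankel(prefixes, suffixes):
    """Hankel block: the entry for (p, s) depends only on the string p + s."""
    return [[evaluate(p + s) for s in suffixes] for p in prefixes]


def rank(matrix):
    """Exact rank via Gaussian elimination over the rationals."""
    rows = [row[:] for row in matrix]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


BASIS = strings_up_to(2)          # 7 prefixes and 7 suffixes
H = hankel(BASIS, BASIS)          # 7 x 7 Hankel block
HANKEL_RANK = rank(H)             # bounded by the number of states (2)
```

Here the 7 x 7 block factors as a 7 x 2 matrix of forward vectors times a 2 x 7 matrix of backward vectors, so `HANKEL_RANK` cannot exceed 2. In the unsupervised setting described above, the entries of such a matrix are not all observed directly, which is why the abstract turns to rank minimization and its nuclear norm relaxation.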




Country:
- Asia > South Korea (0.04)
- North America > United States
  - New York > New York County > New York City (0.04)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (1.00)
  - Machine Learning > Statistical Learning (0.47)