End-to-end Spoken Language Understanding with Tree-constrained Pointer Generator
Guangzhi Sun, Chao Zhang, Philip C. Woodland
arXiv.org Artificial Intelligence
End-to-end spoken language understanding (SLU) suffers from the long-tail word problem. This paper exploits contextual biasing, a technique for improving the speech recognition of rare words, in end-to-end SLU systems. Specifically, it studies the tree-constrained pointer generator (TCPGen), a powerful and efficient biasing model component, which uses a slot shortlist with corresponding entities to extract biasing lists. In addition, to bias the slot distribution of the SLU model output, a slot probability biasing (SPB) mechanism is proposed that calculates a slot distribution from TCPGen. Experiments on the SLURP dataset showed consistent SLU-F1 improvements using TCPGen and SPB, especially on unseen entities. On a new split that holds out 5 slot types for testing, TCPGen with SPB achieved zero-shot learning with an SLU-F1 score over 50%, whereas the baselines cannot handle the held-out slot types at all. In addition to slot filling, intent classification accuracy also improved.
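The two ideas named in the abstract, extracting a biasing list from a slot shortlist and constraining candidate tokens with a tree, can be illustrated with a minimal sketch. This is not the authors' implementation: the slot names, the entities, the helper names (`extract_biasing_list`, `build_prefix_tree`, `valid_next_tokens`), and the use of single characters in place of wordpiece tokens are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical slot-to-entity mapping; names and entities are illustrative only.
SLOT_ENTITIES: Dict[str, List[str]] = {
    "artist_name": ["adele", "abba"],
    "place_name": ["graz", "calgary"],
    "food_type": ["sushi"],
}

END = "<end>"  # marks the end of a complete biasing word in the tree


def extract_biasing_list(slot_shortlist: Iterable[str]) -> List[str]:
    """Collect the entities of every shortlisted slot into one biasing list."""
    words: Set[str] = set()
    for slot in slot_shortlist:
        words.update(SLOT_ENTITIES.get(slot, []))
    return sorted(words)


def build_prefix_tree(words: Iterable[str]) -> dict:
    """Build a nested-dict prefix tree; characters stand in for wordpieces."""
    root: dict = {}
    for word in words:
        node = root
        for token in word:
            node = node.setdefault(token, {})
        node[END] = {}
    return root


def valid_next_tokens(tree: dict, prefix: str) -> Set[str]:
    """Return the tokens that may follow `prefix` according to the tree."""
    node = tree
    for token in prefix:
        if token not in node:
            return set()  # prefix has left the tree: no biasing candidates
        node = node[token]
    return set(node.keys())


biasing_list = extract_biasing_list(["artist_name", "place_name"])
tree = build_prefix_tree(biasing_list)
print(biasing_list)
print(valid_next_tokens(tree, "a"))
```

In this sketch, only tokens reachable from the current prefix remain candidates at each decoding step, so the set of biasing candidates stays small even when the biasing list is large. How those candidates are scored and combined with the model output is not covered here.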
Mar-14-2023
- Country:
- North America
- United States > Rhode Island (0.04)
- Canada
- Ontario > Toronto (0.04)
- Alberta > Census Division No. 6
- Calgary Metropolitan Region > Calgary (0.04)
- Europe
- Greece (0.04)
- United Kingdom > England
- Cambridgeshire > Cambridge (0.14)
- Austria > Styria
- Graz (0.05)
- Asia
- Genre:
- Research Report (0.64)
- Technology:
- Information Technology > Artificial Intelligence
- Speech > Speech Recognition (1.00)
- Natural Language (1.00)
- Machine Learning
- Neural Networks > Deep Learning (0.47)
- Performance Analysis > Accuracy (0.34)