Unsupervised Word Segmentation from Speech with Attention

Godard, Pierre, Zanon-Boito, Marcely, Ondel, Lucas, Berard, Alexandre, Yvon, François, Villavicencio, Aline, Besacier, Laurent

Jun-18-2018–arXiv.org Artificial Intelligence

We present a first attempt to perform attentional word segmentation directly from the speech signal, with the ultimate goal of automatically identifying lexical units in a low-resource, unwritten language (UL). Our methodology assumes that recordings in the UL are paired with translations in a well-resourced language. It uses Acoustic Unit Discovery (AUD) to convert speech into a sequence of pseudo-phones, which is then segmented using the soft-alignments produced by a neural machine translation model. We evaluate on Mboshi, a Bantu language that is a genuine UL; comparisons with monolingual and bilingual baselines illustrate the potential of attentional word segmentation for language documentation.
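As a rough illustration of the segmentation step, the sketch below assumes that each pseudo-phone is assigned to the translation word with the highest attention weight, and that a word boundary is placed wherever that assignment changes between consecutive pseudo-phones. The function name `segment_by_attention`, the toy attention matrix, and the hard argmax rule are assumptions made for this example; the abstract does not specify the exact procedure.

```python
from typing import List, Sequence


def segment_by_attention(
    units: Sequence[str],
    attention: Sequence[Sequence[float]],
) -> List[List[str]]:
    """Group pseudo-phones into word-like segments (illustrative only).

    attention[i][j] is the soft-alignment weight between pseudo-phone i
    and translation word j. A boundary is placed whenever the
    best-aligned translation word changes.
    """
    if len(units) != len(attention):
        raise ValueError("one attention row is required per pseudo-phone")

    segments: List[List[str]] = []
    previous_word = None
    for unit, row in zip(units, attention):
        # Hard-assign this pseudo-phone to its most attended translation word.
        aligned_word = max(range(len(row)), key=row.__getitem__)
        if aligned_word != previous_word:
            segments.append([])  # alignment changed: start a new segment
            previous_word = aligned_word
        segments[-1].append(unit)
    return segments


# Toy input: five hypothetical pseudo-phones, two translation words.
toy_units = ["a1", "a2", "a3", "a4", "a5"]
toy_attention = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.3, 0.7],
    [0.2, 0.8],
    [0.6, 0.4],
]
toy_segments = segment_by_attention(toy_units, toy_attention)
```

With this toy matrix the first two units align to the first translation word, the next two to the second, and the last one back to the first, so three segments are produced. Non-contiguous alignments like this one are split rather than merged, which is one of several possible design choices.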



Genre:
- Research Report (0.50)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Machine Translation (1.00)
  - Representation & Reasoning > Uncertainty
    - Bayesian Inference (0.46)
  - Machine Learning > Learning Graphical Models
    - Directed Networks > Bayesian Learning (0.46)
