A Latent Variable Recurrent Neural Network for Discourse Relation Language Models

Ji, Yangfeng, Haffari, Gholamreza, Eisenstein, Jacob

Apr-5-2016–arXiv.org Machine Learning

This paper presents a novel latent variable recurrent neural network architecture for jointly modeling sequences of words and (possibly latent) discourse relations between adjacent sentences. A recurrent neural network generates individual words, thus reaping the benefits of discriminatively-trained vector representations. The discourse relations are represented with a latent variable, which can be predicted or marginalized, depending on the task. The resulting model can therefore employ a training objective that includes not only discourse relation classification, but also word prediction. As a result, it outperforms state-of-the-art alternatives for two tasks: implicit discourse relation classification in the Penn Discourse Treebank, and dialog act classification in the Switchboard corpus. Furthermore, by marginalizing over latent discourse relations at test time, we obtain a discourse-informed language model, which improves over a strong LSTM baseline.
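The two uses of the latent variable mentioned above, prediction and marginalization, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the relation labels, and all numbers are hypothetical, and the sketch assumes a simplified setting in which a model has already produced, for each candidate relation z, a log prior log p(z) and a log-likelihood log p(y | z) of the sentence's words y. Conditioning on the preceding sentence is omitted for brevity.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def marginal_log_likelihood(log_prior, log_lik):
    """Language-model use: log p(y) = log sum_z p(z) * p(y | z)."""
    return log_sum_exp([log_prior[z] + log_lik[z] for z in log_prior])


def predict_relation(log_prior, log_lik):
    """Classification use: argmax_z p(z) * p(y | z)."""
    return max(log_prior, key=lambda z: log_prior[z] + log_lik[z])


# Toy values (assumptions for illustration only, not results from the paper).
log_prior = {
    "Comparison": math.log(0.25),
    "Contingency": math.log(0.25),
    "Expansion": math.log(0.5),
}
log_lik = {"Comparison": -12.0, "Contingency": -10.0, "Expansion": -11.0}

best_relation = predict_relation(log_prior, log_lik)
sentence_log_prob = marginal_log_likelihood(log_prior, log_lik)
```

In this sketch the same joint scores serve both tasks: taking the maximum over z yields a relation label, while summing over z yields a sentence probability that does not depend on any single relation choice.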

artificial intelligence, discourse relation, machine learning

Country:
- Europe (0.93)
- North America > United States (0.68)
- Asia (0.68)

Genre:
- Research Report (0.64)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Neural Networks > Deep Learning (1.00)
  - Learning Graphical Models > Undirected Networks
    - Markov Models (0.94)
