Unity in Diversity: Learning Distributed Heterogeneous Sentence Representation for Extractive Summarization

Singh, Abhishek Kumar (IIIT Hyderabad) | Gupta, Manish (IIIT Hyderabad & Microsoft) | Varma, Vasudeva (IIIT Hyderabad)

Feb-8-2018–AAAI Conferences

Automated multi-document extractive text summarization is a widely studied research problem in the field of natural language understanding. Such extractive mechanisms compute, in some form, the worthiness of a sentence for inclusion in the summary. While conventional approaches rely on human-crafted, document-independent features to generate a summary, we develop a novel data-driven summarization system called HNet, which exploits the various semantic and compositional aspects latent in a sentence to capture document-independent features. The network learns sentence representations such that salient sentences are closer to one another in the vector space than they are to non-salient sentences. This semantic and compositional feature vector is then concatenated with the document-dependent features for sentence ranking. Experiments on the DUC benchmark datasets (DUC-2001, DUC-2002 and DUC-2004) indicate that our model achieves a significant performance gain of around 1.5-2 ROUGE points over the state-of-the-art baselines.
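
The concatenate-then-rank step described in the abstract can be illustrated with a minimal sketch. This is not the HNet implementation: the function names, the toy feature values, and the linear scoring rule with hand-picked weights are all assumptions made for illustration, and the abstract does not specify which document-dependent features or which ranker the authors use.

```python
from typing import List, Sequence


def concat_features(sentence_vec: Sequence[float],
                    doc_features: Sequence[float]) -> List[float]:
    """Join a document-independent sentence vector with document-dependent features."""
    return list(sentence_vec) + list(doc_features)


def rank_sentences(sentence_vecs: Sequence[Sequence[float]],
                   doc_features: Sequence[Sequence[float]],
                   weights: Sequence[float],
                   top_k: int) -> List[int]:
    """Return the indices of the top_k sentences by score, highest first.

    Assumption: a plain linear scorer (dot product with `weights`) stands in
    for whatever ranking model the paper actually uses.
    """
    scored = []
    for idx, (vec, feats) in enumerate(zip(sentence_vecs, doc_features)):
        combined = concat_features(vec, feats)
        score = sum(w * x for w, x in zip(weights, combined))
        scored.append((score, idx))
    # Sort by descending score; ties are broken by original sentence order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:top_k]]


# Hypothetical toy data: 2-d learned sentence vectors plus one
# document-dependent feature per sentence (for example, a position score).
sentence_vecs = [[0.9, 0.1], [0.2, 0.3], [0.7, 0.6]]
doc_features = [[1.0], [0.5], [0.0]]
weights = [1.0, 1.0, 0.5]

summary_indices = rank_sentences(sentence_vecs, doc_features, weights, top_k=2)
print(summary_indices)
```

In an extractive setting, the returned indices would select the sentences that make up the summary; a trained model would replace the fixed weights used here.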

artificial intelligence, machine learning, natural language

Country:
- North America > Canada > Quebec (0.14)

Genre:
- Research Report (0.94)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Text Processing (1.00)
  - Machine Learning
    - Statistical Learning (1.00)
    - Neural Networks > Deep Learning (1.00)
