Leveraging Contextual Information for Effective Entity Salience Detection

Bhowmik, Rajarshi, Ponza, Marco, Tendle, Atharva, Gupta, Anant, Jiang, Rebecca, Lu, Xingyu, Zhao, Qian, Preotiuc-Pietro, Daniel

Sep-14-2023–arXiv.org Artificial Intelligence

In text documents such as news articles, the content and key events usually revolve around a subset of the entities mentioned in the document. These entities, often called salient entities, give the reader useful cues about what the document is about. Identifying the salience of entities has been found helpful in several downstream applications, such as search, ranking, and entity-centric summarization. Prior work on salient entity detection has mainly focused on machine learning models that require heavy feature engineering. We show that fine-tuning medium-sized language models with a cross-encoder style architecture yields substantial performance gains over feature engineering approaches. To this end, we conduct a comprehensive benchmarking on four publicly available datasets using models representative of the medium-sized pre-trained language model family. Additionally, we show that zero-shot prompting of instruction-tuned language models yields inferior results, indicating the task's uniqueness and complexity.

Genre:
- Research Report (0.64)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning (1.00)
  - Natural Language
    - Information Retrieval (0.68)
    - Text Processing (0.47)
