Semantic Novelty Detection and Characterization in Factual Text Involving Named Entities
Nianzu Ma, Sahisnu Mazumder, Alexander Politowicz, Bing Liu, Eric Robertson, Scott Grigsby
arXiv.org Artificial Intelligence
Much of the existing work on text novelty detection has been done at the topic level, i.e., identifying whether the topic of a document or a sentence is novel. Little work has been done at the fine-grained semantic level (or contextual level). For example, given that we know Elon Musk is the CEO of a technology company, the sentence "Elon Musk acted in the sitcom The Big Bang Theory" is novel and surprising because a CEO would not normally be an actor. Existing topic-based novelty detection methods work poorly on this problem because they do not perform semantic reasoning involving relations between named entities in the text and their background knowledge. This paper proposes an effective model, PAT-SND, that solves the problem and can also characterize the novelty. An annotated dataset is also created. Evaluation shows that PAT-SND outperforms 10 baselines by large margins.
Oct-31-2022
- Country:
  - South America > Argentina
    - Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
  - North America
    - United States
      - Texas (0.04)
      - Arizona (0.04)
      - Louisiana (0.04)
      - Minnesota > Hennepin County
        - Minneapolis (0.14)
      - Hawaii > Honolulu County
        - Honolulu (0.04)
      - Oklahoma > Payne County
        - Cushing (0.04)
      - Illinois > Cook County
        - Chicago (0.04)
      - New Mexico > Santa Fe County
        - Santa Fe (0.04)
      - New Hampshire > Hillsborough County
        - Nashua (0.04)
      - California
        - San Francisco County > San Francisco (0.28)
        - San Diego County > San Diego (0.04)
        - Los Angeles County > Long Beach (0.04)
      - New York > New York County
        - New York City (0.04)
    - Canada
      - Quebec > Montreal (0.04)
      - British Columbia > Metro Vancouver Regional District
        - Vancouver (0.14)
      - Alberta > Census Division No. 15
        - Improvement District No. 9 > Banff (0.04)
  - Europe
    - Ukraine (0.14)
    - France (0.14)
    - Spain (0.04)
    - Russia (0.04)
    - Latvia (0.04)
    - Eastern Europe (0.04)
    - Netherlands > North Holland
      - Amsterdam (0.04)
    - Italy > Tuscany
      - Florence (0.04)
    - Denmark > Capital Region
      - Copenhagen (0.14)
    - Greece > Attica
      - Athens (0.04)
    - Sweden
      - Uppsala County > Uppsala (0.04)
      - Stockholm > Stockholm (0.04)
    - Ireland > Leinster
      - County Dublin > Dublin (0.04)
    - United Kingdom
      - Northern Ireland (0.04)
      - England
        - Oxfordshire > Oxford (0.04)
        - Cambridgeshire > Cambridge (0.04)
    - Belgium > Brussels-Capital Region
      - Brussels (0.04)
  - Asia
  - Africa
    - Southern Africa (0.04)
    - South Africa > Gauteng
      - Pretoria (0.04)
- Genre:
  - Personal > Honors (0.93)
  - Research Report (0.82)
- Industry:
  - Information Technology (0.68)
  - Education (0.67)
  - Media
    - Television (1.00)
    - Music (1.00)
    - Film (1.00)
  - Leisure & Entertainment > Sports
    - Motorsports (1.00)
    - Baseball (1.00)
  - Government
    - Military (1.00)
    - Voting & Elections (0.93)
    - Regional Government > North America Government
      - United States Government (1.00)
- Technology: