The X Types -- Mapping the Semantics of the Twitter Sphere
Drukerman, Ogen Schlachet, Minkov, Einat
arXiv.org Artificial Intelligence
Social networks form a valuable source of world knowledge, where influential entities correspond to popular accounts. Unlike factual knowledge bases (KBs), which maintain a semantic ontology, structured semantic information is not available on social media. In this work, we consider a social KB of roughly 200K popular Twitter accounts, which denote entities of interest. We elicit semantic information about those entities. In particular, we associate them with a fine-grained set of 136 semantic types, e.g., we determine whether a given entity account belongs to a politician or a musical artist. In the absence of explicit type information in Twitter, we obtain semantic labels for a subset of the accounts via alignment with the KBs of DBpedia and Wikidata. Given the labeled dataset, we fine-tune a transformer-based text encoder to generate semantic embeddings of the entities based on the contents of their accounts. We then exploit this evidence alongside network-based embeddings to predict the entities' semantic types. In our experiments, we show high type prediction performance on the labeled dataset. Consequently, we apply our type classification model to all of the entity accounts in the social KB. Our analysis of the results offers insights about the global semantics of the Twitter sphere. We discuss downstream applications that should benefit from semantic type information and the semantic embeddings of social entities generated in this work. In particular, we demonstrate enhanced performance on the key task of entity similarity assessment using this information.
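As a rough illustration of how content-based and network-based embeddings might be used together for entity similarity, the sketch below concatenates the two L2-normalized vectors and compares entities by cosine similarity. The abstract does not state how the two signals are combined, so the concatenation, the function names (`fuse`, `cosine`), and the toy vectors are all assumptions made for illustration, not the authors' method or data.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def fuse(text_emb, net_emb):
    """Hypothetical fusion: concatenate the normalized text and network embeddings.

    Normalizing each part first keeps either signal from dominating
    purely because of its scale.
    """
    return l2_normalize(text_emb) + l2_normalize(net_emb)

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Toy vectors (made up): a 3-d "text" embedding and a 2-d "network" embedding.
politician_a = fuse([0.9, 0.1, 0.0], [0.8, 0.2])
politician_b = fuse([0.8, 0.2, 0.1], [0.7, 0.3])
musician = fuse([0.1, 0.9, 0.3], [0.1, 0.9])

sim_same = cosine(politician_a, politician_b)
sim_diff = cosine(politician_a, musician)
print(sim_same > sim_diff)
```

In this toy setup the two accounts with similar vectors score higher than the dissimilar pair. A real system would obtain the vectors from the fine-tuned text encoder and a network embedding model.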
Sep-22-2024
- Country:
  - Asia > Middle East
    - Israel > Haifa District > Haifa (0.04)
  - North America > United States
    - California
      - Los Angeles County > Los Angeles (0.04)
      - Santa Clara County > Palo Alto (0.04)
    - New York > New York County
      - New York City (0.04)
- Genre:
  - Research Report > New Finding (1.00)
- Industry:
  - Aerospace & Defense (0.93)
  - Education (0.93)
  - Government > Regional Government
  - Information Technology > Services (1.00)
  - Leisure & Entertainment > Sports
    - Soccer (0.46)
  - Media
  - Transportation (0.93)
- Technology: