wikipedia concept
Classification of news spreading barriers
Sittar, Abdul, Mladenic, Dunja, Grobelnik, Marko
News media is one of the most effective mechanisms for spreading information internationally, and many events from different areas are internationally relevant. However, coverage of some news events is limited to a specific geographical region because of information-spreading barriers, which can be political, geographical, economic, cultural, or linguistic. In this paper, we propose an approach to barrier classification in which we infer the semantics of news articles through Wikipedia concepts. To that end, we collected news articles and annotated them for different kinds of barriers using the metadata of news publishers. We then use the Wikipedia concepts, along with the body text of the news articles, as features to infer the news-spreading barriers. We compare our approach to classical text classification, deep learning, and transformer-based methods. The results show that the proposed approach, which uses semantic knowledge based on Wikipedia concepts, classifies news-spreading barriers better than these standard methods.
- Asia > Middle East > Israel (0.04)
- Africa > Nigeria (0.04)
- Europe > Germany (0.04)
- Media > News (1.00)
- Government > Regional Government > North America Government > United States Government (0.68)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.67)
- Health & Medicine > Therapeutic Area > Immunology (0.67)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
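The feature construction described in the abstract above (Wikipedia concepts used alongside the body text) could be sketched roughly as follows. This is a minimal illustration only: the function name build_features, the "word=" and "concept=" prefixes, and the sample article are assumptions made for this example, and the abstract does not say how the authors encode features or which classifier consumes them.

```python
from collections import Counter


def build_features(body_text, wiki_concepts):
    """Merge body-text tokens and Wikipedia concepts into one sparse feature map.

    Hypothetical helper: the prefixes keep a word and a concept with the
    same spelling from colliding in the feature space.
    """
    features = Counter()
    # Bag-of-words counts from the article body (whitespace tokenization).
    for token in body_text.lower().split():
        features["word=" + token] += 1
    # One count per Wikipedia concept linked to the article.
    for concept in wiki_concepts:
        features["concept=" + concept] += 1
    return features


# Made-up article and concept annotations, for illustration only.
article = "Vaccine rollout begins as vaccine supplies arrive"
concepts = ["Vaccine", "Public health"]
features = build_features(article, concepts)
```

A feature map of this kind can be passed to any standard text classifier once it has been vectorized; the barrier labels themselves would come from the publisher metadata mentioned in the abstract.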
Towards Proactive Information Retrieval in Noisy Text with Wikipedia Concepts
Ahmed, Tabish, Bulathwela, Sahan
Extracting useful information from the user history to clearly understand informational needs is a crucial feature of a proactive information retrieval system. For understanding information and relevance, Wikipedia can provide the background knowledge that an intelligent system needs. This work explores how exploiting the context of a query using Wikipedia concepts can improve proactive information retrieval on noisy text. We formulate two models that use entity linking to associate Wikipedia topics with the relevance model. Our experiments on a podcast segment retrieval task demonstrate that Wikipedia concepts carry a clear relevance signal and that a ranking model can improve precision by incorporating them. We also find that Wikifying the background context of a query can help disambiguate the meaning of the query, further helping proactive information retrieval.
- North America > United States > Maryland (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- North America > Dominican Republic (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
12 new Project Debater AI technologies available as cloud APIs
Argumentation and debating are fundamental capabilities of human intelligence. Until recently, they have been totally out of reach of AI. In February 2019, after six years of work by natural language processing and machine learning researchers and engineers, an IBM AI dubbed Project Debater became the first AI system able to debate humans over complex topics. And while it may not have 'won' the sparring against debate champion Harish Natarajan in San Francisco that year, Project Debater demonstrated how AI could help people build persuasive arguments and make well-informed decisions. The AI became the third in the series of IBM Research AI's grand challenges, following Deep Blue and Watson.
Recovering Concept Prerequisite Relations from University Course Dependencies
Liang, Chen (Pennsylvania State University) | Ye, Jianbo (Pennsylvania State University) | Wu, Zhaohui (Microsoft Corporation) | Pursel, Bart (Pennsylvania State University) | Giles, C. Lee (Pennsylvania State University)
Prerequisite relations among concepts play an important role in many educational applications such as intelligent tutoring systems and curriculum planning. With the increasing amount of educational data available, automatic discovery of concept prerequisite relations has become both an emerging research opportunity and an open challenge. Here, we investigate how to recover concept prerequisite relations from course dependencies and propose an optimization-based framework to address the problem. We create the first real dataset for empirically studying this problem, which consists of the listings of computer science courses from 11 U.S. universities and their concept pairs with prerequisite labels. Experimental results on a synthetic dataset and the real course dataset both show that our method outperforms existing baselines.
- Education > Educational Technology > Educational Software > Computer Based Training (0.86)
- Education > Curriculum (0.85)
- Education > Educational Setting (0.83)
Sense-Aware Semantic Analysis: A Multi-Prototype Word Representation Model Using Wikipedia
Wu, Zhaohui (The Pennsylvania State University) | Giles, C. Lee (The Pennsylvania State University)
Human languages are naturally ambiguous, which makes it difficult to automatically understand the semantics of text. Most vector space models (VSM) treat all occurrences of a word as the same and build a single vector to represent the meaning of a word, which fails to capture any ambiguity. We present sense-aware semantic analysis (SaSA), a multi-prototype VSM for word representation based on Wikipedia, which can account for homonymy and polysemy. The "sense-specific" prototypes of a word are produced by clustering Wikipedia pages based on both local and global contexts of the word in Wikipedia. Experimental evaluations on semantic relatedness, for both isolated words and words in sentential contexts, and on word sense induction demonstrate its effectiveness.
- Europe > United Kingdom > England > Greater London > London (0.04)
- North America > United States > Pennsylvania > Centre County > University Park (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Transportation > Ground (0.46)
- Banking & Finance > Trading (0.46)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)
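The multi-prototype idea in the SaSA abstract above (one vector per sense rather than one per word) can be illustrated with a small sketch. It assumes the clustering step has already happened: the two clusters for "bank", the sense labels, and the helper names bag, cosine, prototype, and pick_sense are all invented for this example. Bag-of-words centroids and cosine similarity stand in for whatever representation and clustering the authors actually use.

```python
import math
from collections import Counter


def bag(text):
    """Bag-of-words vector for a context (whitespace tokenization)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def prototype(contexts):
    """Sum the context vectors of one cluster into a single sense prototype."""
    total = Counter()
    for context in contexts:
        total.update(bag(context))
    return total


# Assumed output of a clustering step: contexts of "bank" grouped by sense.
clusters = {
    "bank/finance": ["the bank approved the loan", "deposit money at the bank"],
    "bank/river": ["fishing on the river bank", "the muddy bank of the river"],
}
prototypes = {sense: prototype(ctxs) for sense, ctxs in clusters.items()}


def pick_sense(context):
    """Return the sense whose prototype is most similar to the given context."""
    vec = bag(context)
    return max(prototypes, key=lambda sense: cosine(vec, prototypes[sense]))
```

With a single prototype, both uses of "bank" would share one vector; keeping one prototype per cluster lets the surrounding words select the sense.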
Wikipedia-Based Distributional Semantics for Entity Relatedness
Aggarwal, Nitish (National University of Ireland, Galway) | Buitelaar, Paul (National University of Ireland, Galway)
Wikipedia provides an enormous amount of background knowledge for reasoning about the semantic relatedness between two entities. We propose Wikipedia-based Distributional Semantics for Entity Relatedness (DiSER), which represents the semantics of an entity by its distribution in the high-dimensional concept space derived from Wikipedia. DiSER measures the semantic relatedness between two entities by quantifying the distance between the corresponding high-dimensional vectors. DiSER builds the model from annotated entities only, and therefore improves over existing approaches, which do not distinguish between an entity and its surface form. We evaluate the approach on a benchmark that contains the relative entity relatedness scores for 420 entity pairs. Our approach improves accuracy by 12% over state-of-the-art methods for computing entity relatedness. We also evaluate DiSER on the Entity Disambiguation task using a dataset of 50 sentences with highly ambiguous entity mentions, where it improves precision by 10% over the best-performing methods. To provide a resource for finding all the entities related to a given entity, we construct a graph in which the nodes represent Wikipedia entities and the edges carry the relatedness scores. Wikipedia contains more than 4.1 million entities, which required efficient computation of the relatedness scores between the corresponding 17 trillion entity pairs.
- Asia > Middle East > Jordan (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Leisure & Entertainment (0.94)
- Information Technology (0.94)
- Media (0.69)
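Two points in the DiSER abstract above lend themselves to a quick sketch: relatedness as a comparison of concept-space vectors, and the scale of the pair computation. The abstract only says that DiSER quantifies "the distance" between vectors, so the cosine measure here is an assumption, and the toy vectors, weights, and variable names are invented for illustration. On the arithmetic, 4.1 million entities give about 16.8 trillion ordered pairs, which is consistent with the roughly 17 trillion quoted; counting each unordered pair once would halve that figure.

```python
import math


def cosine_relatedness(u, v):
    """Cosine similarity between two sparse concept vectors stored as dicts."""
    dot = sum(weight * v.get(concept, 0.0) for concept, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy entity vectors over Wikipedia concepts (weights are made up).
apple_inc = {"IPhone": 3.0, "Steve Jobs": 2.0, "Silicon Valley": 1.0}
steve_jobs = {"IPhone": 1.0, "Steve Jobs": 3.0, "Pixar": 2.0}
apple_fruit = {"Orchard": 3.0, "Fruit": 2.0}

related = cosine_relatedness(apple_inc, steve_jobs)      # shares two concepts
unrelated = cosine_relatedness(apple_inc, apple_fruit)   # shares none

# Scale of the all-pairs computation for 4.1 million entities.
n_entities = 4_100_000
ordered_pairs = n_entities ** 2                          # (a, b) and (b, a) both counted
unordered_pairs = n_entities * (n_entities - 1) // 2     # each pair counted once
```

The two "Apple" vectors also show why the abstract stresses entities over surface forms: the same string maps to vectors with no overlap at all.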
A Wikipedia Based Semantic Graph Model for Topic Tracking in Blogosphere
Tang, Jintao (National University of Defense Technology) | Wang, Ting (National University of Defense Technology) | Lu, Qin (Hong Kong Polytechnic University) | Wang, Ji (National Laboratory for Parallel and Distributed Processing) | Li, Wenjie (Hong Kong Polytechnic University)
There are two key issues for information diffusion in the blogosphere: (1) blog posts are usually short, noisy, and contain multiple themes, and (2) information diffusion through the blogosphere is primarily driven by the “word-of-mouth” effect, which makes topics evolve very fast. This paper presents a novel topic tracking approach that deals with these issues by modeling a topic as a semantic graph in which the semantic relatedness between terms is learned from Wikipedia. For a given topic or post, the named entities, Wikipedia concepts, and semantic relatedness are extracted to generate the graph model. Noise is filtered out through a graph clustering algorithm. To handle topic evolution, the topic model is enriched by using Wikipedia as background knowledge. Furthermore, graph edit distance is used to measure the similarity between a topic and its posts. The proposed method is tested using real-world blog data. Experimental results show the advantage of the proposed method for tracking topics in short, noisy text.
- Asia > China > Hong Kong (0.04)
- Africa > Southern Africa (0.04)
- Oceania > Australia (0.04)
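The graph edit distance mentioned in the topic tracking abstract above can be illustrated in a deliberately simplified setting. The sketch assumes every node carries a unique label (a term or Wikipedia concept), that edges are unweighted and undirected, and that each node or edge insertion or deletion costs 1 with no relabeling. Under those assumptions the edit distance reduces to the size of the symmetric differences of the node and edge sets. General graph edit distance is much harder to compute, and the paper's actual cost model is not given in the abstract; the graphs and function names below are made up.

```python
def edge(a, b):
    """Undirected edge between two uniquely labeled nodes."""
    return frozenset((a, b))


def simple_graph_edit_distance(nodes_a, edges_a, nodes_b, edges_b):
    """Unit-cost edit distance for graphs with uniquely labeled nodes.

    Every node or edge present in exactly one of the two graphs needs
    one insertion or deletion, so the distance is the total size of the
    two symmetric differences.
    """
    return len(nodes_a ^ nodes_b) + len(edges_a ^ edges_b)


# Toy topic graph and blog post graph.
topic_nodes = {"Olympics", "London", "Athletics"}
topic_edges = {edge("Olympics", "London"), edge("Olympics", "Athletics")}

post_nodes = {"Olympics", "London", "Tickets"}
post_edges = {edge("Olympics", "London"), edge("London", "Tickets")}

distance = simple_graph_edit_distance(topic_nodes, topic_edges, post_nodes, post_edges)
```

A smaller distance means the post's graph is closer to the topic's graph, so a tracker built on this idea would assign a post to the topic with the lowest distance, or reject it above some threshold.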