We describe an approach for identifying fine-grained entity types in heterogeneous data graphs that is effective for unstructured data or when the underlying ontologies or semantic schemas are unknown. Identifying fine-grained entity types, rather than a few high-level types, supports coreference resolution in heterogeneous graphs by reducing the number of possible coreference relations that must be considered. In such cases, we use supervised machine learning to map entity attributes and relations to a known set of attributes and relations from appropriate background knowledge bases, and then use the mapped features to predict instance entity types. We evaluated this approach in experiments on data from DBpedia, Freebase, and Arnetminer, using DBpedia as the background knowledge base.
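As a rough illustration of predicting an entity type from mapped attribute names, the sketch below trains a small multinomial naive Bayes classifier with add-one smoothing. The abstract does not say which classifier the authors used, so naive Bayes is a stand-in; the attribute names, type labels, and training instances are invented for the example and are not taken from DBpedia or from the paper.

```python
import math
from collections import Counter, defaultdict

# Invented training instances (an assumption for illustration): each entity is
# a set of attribute names already mapped to a background vocabulary, plus a type.
TRAIN = [
    ({"birthDate", "birthPlace", "team"}, "Athlete"),
    ({"birthDate", "team", "position"}, "Athlete"),
    ({"birthDate", "almaMater", "field"}, "Scientist"),
    ({"birthPlace", "field", "doctoralAdvisor"}, "Scientist"),
]

def train(examples):
    """Count how often each type and each (type, attribute) pair occurs."""
    type_counts = Counter()
    attr_counts = defaultdict(Counter)
    vocab = set()
    for attrs, label in examples:
        type_counts[label] += 1
        for attr in attrs:
            attr_counts[label][attr] += 1
            vocab.add(attr)
    return type_counts, attr_counts, vocab

def predict(model, attrs):
    """Return the type with the highest smoothed log-probability."""
    type_counts, attr_counts, vocab = model
    total = sum(type_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in type_counts.items():
        score = math.log(count / total)
        denom = sum(attr_counts[label].values()) + len(vocab)
        for attr in attrs:
            if attr in vocab:  # attributes never seen in training are ignored
                score += math.log((attr_counts[label][attr] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(TRAIN)
predicted = predict(model, {"birthDate", "field"})
```

A real system would presumably also use relations and attribute values as features, not just attribute names.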
Named Entity Recognition is a process in which an algorithm takes a string of text (a sentence or paragraph) as input and identifies the relevant nouns (people, places, and organizations) mentioned in that string. In our previous blog post, we gave you a glimpse of how our Named Entity Recognition API works under the hood. In this post, we list some scenarios and use cases of Named Entity Recognition technology.
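To make the input and output of Named Entity Recognition concrete, here is a minimal dictionary-lookup sketch. It is not the API mentioned above. The gazetteer, the label names, and the `find_entities` function are hypothetical, and production systems typically use trained statistical models instead of a fixed list.

```python
import re

# Hypothetical gazetteer mapping known names to entity labels.
GAZETTEER = {
    "Ada Lovelace": "PERSON",
    "London": "PLACE",
    "Royal Society": "ORGANIZATION",
}

def find_entities(text):
    """Return (mention, label, start_offset) tuples in order of appearance."""
    found = []
    for name, label in GAZETTEER.items():
        # Word boundaries stop a name from matching inside a longer word.
        pattern = r"\b" + re.escape(name) + r"\b"
        for match in re.finditer(pattern, text):
            found.append((match.group(0), label, match.start()))
    return sorted(found, key=lambda item: item[2])

entities = find_entities(
    "Ada Lovelace presented her notes to the Royal Society in London."
)
```

Each returned tuple pairs a mention with its label and character offset, which is the shape of result most downstream use cases consume.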
A personal knowledge graph comprising people as nodes, their personal data as node attributes, and their relationships as edges has a number of applications in de-identification, master data management, and fraud prevention. While artificial neural networks have led to significant improvements in different tasks in cold-start knowledge graph population, the overall F1 of the system remains quite low. This problem is more acute in personal knowledge graph population, which presents additional challenges with regard to data protection, fairness, and privacy. In this work, we present a system that uses rule-based annotators both to augment training data for neural models and to perform slot filling, which increases the diversity of the populated knowledge graph. We also propose a representative set sampling method for using the populated knowledge graph data in downstream applications. We introduce new resources and discuss our results.
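The idea of a rule-based annotator that produces labeled spans, which could then serve as extra training data, can be sketched as follows. The two regular expressions, the label names, and the `annotate` function are assumptions chosen for illustration; the abstract does not describe the actual rules used by the system.

```python
import re

# Hypothetical rules: each label maps to a pattern for one kind of personal data.
RULES = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}

def annotate(text):
    """Return labeled character spans found by the rules, ordered by position."""
    spans = []
    for label, pattern in RULES.items():
        for match in pattern.finditer(text):
            spans.append({
                "start": match.start(),
                "end": match.end(),
                "label": label,
                "text": match.group(0),
            })
    return sorted(spans, key=lambda span: span["start"])

example = annotate("Reach Jo at jo@example.org or 555-010-1234.")
```

Spans in this form can be converted to token-level tags for a neural tagger or written directly into attribute slots of a person node.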
The LinkedIn knowledge graph is a large knowledge base built upon "entities" on LinkedIn, such as members, jobs, titles, skills, companies, geographical locations, schools, etc. These entities and the relationships among them form the ontology of the professional world and are used by LinkedIn to enhance its recommender systems, search, monetization and consumer products, and business and consumer analytics. Creating a large knowledge base is a big challenge. Websites like Wikipedia and Freebase primarily rely on direct contributions from human volunteers. Other related work, such as Google's Knowledge Vault and Microsoft's Satori, focuses on automatically extracting facts from the Web, leveraging the redundancy inherent in big data to construct knowledge bases.
In this work, a methodology is developed to detect sentient actors in spoken stories. Meta-tags are then saved to XML files associated with the audio files. A recursive approach is used to find actor candidates and features, which are then classified using machine learning approaches. Results of the study indicate that the methodology performed well on a narrative-based corpus of children’s stories. Using Support Vector Machines for classification, an F-measure of 86% was achieved for both named and unnamed entities. Additionally, feature analysis indicated that speech features were very useful when detecting unnamed actors.
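For readers unfamiliar with the metric, the balanced F-measure (F1) is the harmonic mean of precision and recall. The sketch below computes it from raw counts. The counts in the usage line are invented for illustration and are not results from the study.

```python
def f_measure(true_pos, false_pos, false_neg):
    """Balanced F-measure (F1) from true positive, false positive, false negative counts."""
    if true_pos == 0:
        return 0.0  # no correct detections; also avoids division by zero
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)

# Invented counts: 8 actors detected correctly, 2 spurious, 2 missed.
score = f_measure(8, 2, 2)
```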