Ontologies
Materializing Knowledge Bases via Trigger Graphs
Tsamoura, Efthymia, Carral, David, Malizia, Enrico, Urbani, Jacopo
The chase is a well-established family of algorithms used to materialize Knowledge Bases (KBs), like Knowledge Graphs (KGs), to tackle important tasks like query answering under dependencies or data cleaning. A general problem of chase algorithms is that they might perform redundant computations. To counter this problem, we introduce the notion of Trigger Graphs (TGs), which guide the execution of the rules avoiding redundant computations. We present the results of an extensive theoretical and empirical study that seeks to answer when and how TGs can be computed and what are the benefits of TGs when applied over real-world KBs. Our results include introducing algorithms that compute (minimal) TGs. We implemented our approach in a new engine, and our experiments show that it can be significantly more efficient than the chase enabling us to materialize KBs with 17B facts in less than 40 min on commodity machines.
Taxonomic survey of Hindi Language NLP systems
Desai, Nikita P., Prof., null, Dabhi, Vipul K.
The field of Natural language processing can be formally defined as - "A theoretically motivated range of computational techniques for analyzing and representing naturally occurring texts at one or more levels of linguistic analysis for the purpose of achieving human-like language processing for a range of tasks or applications"[69]. The naturally occurring text can be in written or spoken form.A wide array of domains contribute to NLP development like linguistics, computer science and psychology.The linguistics field helps to understand the formal structure of language while computer science domain helps to find efficient internal representations and data structures.The study of "Psychology" can be useful to understand the methodology used by humans for dealing with languages. NLP can be considered to be having two distinct focus namely (1)Natural Language Generation(NLG) and (2)Natural Language Understanding(NLU). The NLG deals with planning to use the representation of language to decide what should be generated at each point in interaction, while NLU needs to analyze language and decide which is best way to represent it meaningfully.We, in this survey paper, concentrate on area of NLU for written text.Hence the NLP henceforth might be considered as NLU and vice versa. Motivation for designing Indian NLP systems Hindi and English are the official languages in central government of India(GOI). Indian community faces a "Digital Divide" due to dominance of English as mode of communication in higher education, judiciary, corporate sector and Public administration at Central level whereas the government in states work in their respective regional languages [67].The expansion of Internet has inter-connected the socioeconomic environment of the world and redefined the concept of global culture.As per a report in 2017 by the companies kpmg and Google
Enterprise domain ontology learning from web-based corpus
Vasilateanu, Andrei, Goga, Nicolae, Tanase, Elena-Alice, Marin, Iuliana
Enterprise knowledge is a key asset in the competing and fast-changing corporate landscape. The ability to learn, store and distribute implicit and explicit knowledge can be the difference between success and failure. While enterprise knowledge management is a well-defined research domain, current implementations lack orientation towards small and medium enterprise. We propose a semantic search engine for relevant documents in an enterprise, based on automatic generated domain ontologies. In this paper we focus on the component for ontology learning and population.
Efficiency of Query Evaluation Under Guarded TGDs: The Unbounded Arity Case
The paper analyzes the parameterized complexity of evaluating Ontology Mediated Queries (OMQs) based on Guarded TGDs (GTGDs) and Unions of Conjunctive Queries (UCQs), in the setting where relational symbols might have unbounded arity and where the parameter is the size of the OMQ. It establishes exact criteria for fixed-parameter tractability (fpt) evaluation of recursively enumerable classes of such OMQs (under the widely held Exponential Time Hypothesis). One of the main technical tools introduced in the paper is an fpt-reduction from deciding parameterized uniform CSPs to parameterized OMQ evaluation. A fundamental feature of the reduction is preservation of measures which are known to be essential for classifying classes of parameterized uniform CSPs: submodular width (according to the well known result of Marx for unbounded-arity schemas) and treewidth (according to the well known result of Grohe for bounded-arity schemas). As such, the reduction can be employed to obtain hardness results for evaluation of classes of parameterized OMQs both in the unbounded and in the bounded arity case. Previously, in the case of bounded arity schemas, this has been tackled using a technique requiring full introspection into the construction employed by Grohe.
Privacy Information Classification: A Hybrid Approach
Wu, Jiaqi, Li, Weihua, Bai, Quan, Ito, Takayuki, Moustafa, Ahmed
A large amount of information has been published to online social networks every day. Individual privacy-related information is also possibly disclosed unconsciously by the end-users. Identifying privacy-related data and protecting the online social network users from privacy leakage turn out to be significant. Under such a motivation, this study aims to propose and develop a hybrid privacy classification approach to detect and classify privacy information from OSNs. The proposed hybrid approach employs both deep learning models and ontology-based models for privacy-related information extraction. Extensive experiments are conducted to validate the proposed hybrid approach, and the empirical results demonstrate its superiority in assisting online social network users against privacy leakage.
Enquire One's Parent and Child Before Decision: Fully Exploit Hierarchical Structure for Self-Supervised Taxonomy Expansion
Wang, Suyuchen, Zhao, Ruihui, Chen, Xi, Zheng, Yefeng, Liu, Bang
Taxonomy is a hierarchically structured knowledge graph that plays a crucial role in machine intelligence. The taxonomy expansion task aims to find a position for a new term in an existing taxonomy to capture the emerging knowledge in the world and keep the taxonomy dynamically updated. Previous taxonomy expansion solutions neglect valuable information brought by the hierarchical structure and evaluate the correctness of merely an added edge, which downgrade the problem to node-pair scoring or mini-path classification. In this paper, we propose the Hierarchy Expansion Framework (HEF), which fully exploits the hierarchical structure's properties to maximize the coherence of expanded taxonomy. HEF makes use of taxonomy's hierarchical structure in multiple aspects: i) HEF utilizes subtrees containing most relevant nodes as self-supervision data for a complete comparison of parental and sibling relations; ii) HEF adopts a coherence modeling module to evaluate the coherence of a taxonomy's subtree by integrating hypernymy relation detection and several tree-exclusive features; iii) HEF introduces the Fitting Score for position selection, which explicitly evaluates both path and level selections and takes full advantage of parental relations to interchange information for disambiguation and self-correction. Extensive experiments show that by better exploiting the hierarchical structure and optimizing taxonomy's coherence, HEF vastly surpasses the prior state-of-the-art on three benchmark datasets by an average improvement of 46.7% in accuracy and 32.3% in mean reciprocal rank.
Transforming India's Agricultural Sector using Ontology-based Tantra Framework
Food production is a critical activity in which every nation would like to be self-sufficient. India is one of the largest producers of food grains in the world. In India, nearly 70 percent of rural households still depend on agriculture for their livelihood. Keeping farmers happy is particularly important in India as farmers form a large vote bank which politicians dare not disappoint. At the same time, Governments need to balance the interest of farmers with consumers, intermediaries and society at large. The whole agriculture sector is highly information-intensive. Even with enormous collection of data and statistics from different arms of Government, there continue to be information gaps. In this paper we look at how Tantra Social Information Management Framework can help analyze the agricultural sector and transform the same using a holistic approach. Advantage of Tantra Framework approach is that it looks at societal information as a whole without limiting it to only the sector at hand. Tantra Framework makes use of concepts from Zachman Framework to manage aspects of social information through different perspectives and concepts from Unified Foundational Ontology (UFO) to represent interrelationships between aspects. Further, Tantra Framework interoperates with models such as Balanced Scorecard, Theory of Change and Theory of Separations. Finally, we model Indian Agricultural Sector as a business ecosystem and look at approaches to steer transformation from within.
Symbiotic System Design for Safe and Resilient Autonomous Robotics in Offshore Wind Farms
Mitchell, Daniel, Zaki, Osama, Blanche, Jamie, Roe, Joshua, Kong, Leo, Harper, Samuel, Robu, Valentin, Lim, Theodore, Flynn, David
To reduce Operation and Maintenance (O&M) costs on offshore wind farms, wherein 80% of the O&M cost relates to deploying personnel, the offshore wind sector looks to robotics and Artificial Intelligence (AI) for solutions. Barriers to Beyond Visual Line of Sight (BVLOS) robotics include operational safety compliance and resilience, inhibiting the commercialization of autonomous services offshore. To address safety and resilience challenges we propose a symbiotic system; reflecting the lifecycle learning and co-evolution with knowledge sharing for mutual gain of robotic platforms and remote human operators. Our methodology enables the run-time verification of safety, reliability and resilience during autonomous missions. We synchronize digital models of the robot, environment and infrastructure and integrate front-end analytics and bidirectional communication for autonomous adaptive mission planning and situation reporting to a remote operator. A reliability ontology for the deployed robot, based on our holistic hierarchical-relational model, supports computationally efficient platform data analysis. We analyze the mission status and diagnostics of critical sub-systems within the robot to provide automatic updates to our run-time reliability ontology, enabling faults to be translated into failure modes for decision making during the mission. We demonstrate an asset inspection mission within a confined space and employ millimeter-wave sensing to enhance situational awareness to detect the presence of obscured personnel to mitigate risk. Our results demonstrate a symbiotic system provides an enhanced resilience capability to BVLOS missions. A symbiotic system addresses the operational challenges and reprioritization of autonomous mission objectives. This advances the technology required to achieve fully trustworthy autonomous systems.
Indoor Group Activity Recognition using Multi-Layered HMMs
Discovery and recognition of Group Activities (GA) based on imagery data processing have significant applications in persistent surveillance systems, which play an important role in some Internet services. The process is involved with analysis of sequential imagery data with spatiotemporal associations. Discretion of video imagery requires a proper inference system capable of discriminating and differentiating cohesive observations and interlinking them to known ontologies. We propose an Ontology based GAR with a proper inference model that is capable of identifying and classifying a sequence of events in group activities. A multi-layered Hidden Markov Model (HMM) is proposed to recognize different levels of abstract GA. The multi-layered HMM consists of N layers of HMMs where each layer comprises of M number of HMMs running in parallel. The number of layers depends on the order of information to be extracted. At each layer, by matching and correlating attributes of detected group events, the model attempts to associate sensory observations to known ontology perceptions. This paper demonstrates and compares performance of three different implementation of HMM, namely, concatenated N-HMM, cascaded C-HMM and hybrid H-HMM for building effective multi-layered HMM.