Collaborating Authors

 Shell, Dylan A.


Helping Novices Avoid the Hazards of Data: Leveraging Ontologies to Improve Model Generalization Automatically with Online Data Sources

AI Magazine

The infrastructure and tools necessary for large-scale data analytics, formerly the exclusive purview of experts, are increasingly available. Whereas a knowledgeable data-miner or domain expert can rightly be expected to exercise caution when required (for example, around fallacious conclusions supposedly supported by the data), the nonexpert may benefit from some judicious assistance. This article describes an end-to-end learning framework that allows a novice to create models from data easily by helping structure the model building process and capturing extended aspects of domain knowledge. By treating the whole modeling process interactively and exploiting high-level knowledge in the form of an ontology, the framework is able to aid the user in a number of ways, including helping to avoid pitfalls such as data dredging. Prudence must be exercised to avoid these hazards, as certain conclusions may only be supported if, for example, there is extra knowledge that gives reason to trust a narrower set of hypotheses. This article adopts the solution of using higher-level knowledge to allow this sort of domain knowledge to be used automatically, selecting relevant input attributes and thence constraining the hypothesis space. We describe how the framework automatically exploits structured knowledge in an ontology to identify relevant concepts, and how a data extraction component can make use of online data sources to find measurements of those concepts so that their relevance can be evaluated. To validate our approach, models of four different problem domains were built using our implementation of the framework. Prediction error on unseen examples for these models shows that our framework, making use of the ontology, helps to improve model generalization.


Leveraging Ontologies to Improve Model Generalization Automatically with Online Data Sources

AAAI Conferences

This paper describes an end-to-end learning framework that allows a novice to create a model from data easily by helping structure the model building process and capturing extended aspects of domain knowledge. By treating the whole modeling process interactively and exploiting high-level knowledge in the form of an ontology, the framework is able to aid the user in a number of ways, including helping to avoid pitfalls such as data dredging. Prudence must be exercised to avoid these hazards: certain conclusions may be supported by extra knowledge if, for example, there are reasons to trust a particular narrower set of hypotheses. This paper adopts the solution of using higher-level knowledge to allow this sort of domain knowledge to be inferred automatically, thereby selecting only relevant input attributes and thence constraining the hypothesis space. We describe how the framework automatically exploits structured knowledge in an ontology to identify relevant concepts, and how a data extraction component can make use of online data sources to find measurements of those concepts so that their relevance can be evaluated. To validate our approach, models of four different problem domains were built using our implementation of the framework. Prediction error on unseen examples for these models shows that our framework, making use of the ontology, helps to improve model generalization.
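As a rough illustration of the idea of using an ontology to pick relevant input attributes, the sketch below walks a toy concept graph outward from a target concept and keeps only candidate attributes that lie within a few links of it. Everything here is an assumption made for illustration: the abstract does not state the framework's actual relevance criterion, and the `related_concepts` helper, the `max_hops` cutoff, the directed-edge representation, and the toy ontology are all hypothetical.

```python
from collections import deque


def related_concepts(ontology, target, max_hops=2):
    """Breadth-first search over a concept graph.

    Returns a dict mapping each concept reachable from `target` in at most
    `max_hops` links to its hop distance. Edges are treated as directed
    (an assumption made to keep the sketch short).
    """
    distance = {target: 0}
    queue = deque([target])
    while queue:
        concept = queue.popleft()
        if distance[concept] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in ontology.get(concept, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[concept] + 1
                queue.append(neighbour)
    del distance[target]  # the target itself is not an input attribute
    return distance


# Hypothetical toy ontology: concept -> directly linked concepts.
TOY_ONTOLOGY = {
    "crop_yield": ["rainfall", "soil_quality"],
    "rainfall": ["climate"],
    "soil_quality": ["fertilizer_use"],
    "climate": ["latitude"],
}

# Hypothetical attributes for which measurements happen to be available.
candidates = ["rainfall", "latitude", "stock_index", "fertilizer_use"]

relevant = related_concepts(TOY_ONTOLOGY, "crop_yield", max_hops=2)
selected = [name for name in candidates if name in relevant]
```

Restricting the model inputs to `selected` narrows the hypothesis space before any learning takes place, which is the general effect the abstract attributes to the ontology.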


Being There, Being the RRT: Space-Filling and Searching in Place with Minimalist Robots

AAAI Conferences

Inspired by the Rapidly Exploring Random Tree (RRT) data structure and algorithm for path planning in high-dimensional, continuous spaces, we consider an approach for spanning a space with a group of simple robots. We employ a minimalist approach in which infrared (IR) and contact sensors form the primary means of communication; the agents physically embody the elements of the tree through their position, and other agents can either follow the tree to useful locations or expand the tree by becoming part of it. Although robots are constrained in some of the operations they may perform in space, we argue that our approach remains consistent with the original data structure. We demonstrate that one may perform a planning query from a point to the tree origin directly via message passing, where passing involves direct physical motion or simple IR messages. Based on the work done by Werger and Matarić, our implementation proves that it is possible to form and maintain an RRT using simple position-unaware robots. The work is important because it demonstrates that decentralized path planning can be performed by simple agents using purely reactive behaviors, while at the same time it poses significant challenges in keeping the shape of the tree intact.
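For readers unfamiliar with the underlying data structure, the following is a minimal sketch of a conventional, centralized RRT in an obstacle-free unit square, together with a query that walks parent links back to the tree origin. It is not the robots' decentralized implementation: the function names, the `step` size, and the use of global coordinates are assumptions made for illustration, whereas the robots described above are position-unaware and pass the query along physically or by IR.

```python
import math
import random


def build_rrt(origin, n_nodes, step=0.1, seed=0):
    """Grow a tree from `origin`; returns (points, parent) as parallel lists."""
    rng = random.Random(seed)
    points = [origin]
    parent = [None]  # the origin is the root and has no parent
    while len(points) < n_nodes:
        sample = (rng.random(), rng.random())
        # Find the existing node nearest to the random sample.
        i = min(range(len(points)), key=lambda k: math.dist(points[k], sample))
        d = math.dist(points[i], sample)
        if d == 0.0:
            continue  # sample coincides with a node; draw again
        # Extend from the nearest node toward the sample by at most `step`.
        t = min(1.0, step / d)
        new = (points[i][0] + t * (sample[0] - points[i][0]),
               points[i][1] + t * (sample[1] - points[i][1]))
        points.append(new)
        parent.append(i)
    return points, parent


def path_to_origin(points, parent, query):
    """Join `query` to its nearest tree node, then follow parents to the root."""
    i = min(range(len(points)), key=lambda k: math.dist(points[k], query))
    path = [query]
    while i is not None:
        path.append(points[i])
        i = parent[i]
    return path


points, parent = build_rrt(origin=(0.5, 0.5), n_nodes=50)
path = path_to_origin(points, parent, query=(0.9, 0.9))
```

In the embodied version, each robot would stand in for one entry of `points`, and following `parent` corresponds to handing the query from one robot to the next until it reaches the origin.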