Object-Oriented Architecture
Object-Oriented Bayesian Networks
Bayesian networks provide a modeling language and associated inference algorithm for stochastic domains. They have been successfully applied in a variety of medium-scale applications. However, when faced with a large complex domain, the task of modeling using Bayesian networks begins to resemble the task of programming using logical circuits. In this paper, we describe an object-oriented Bayesian network (OOBN) language, which allows complex domains to be described in terms of inter-related objects. We use a Bayesian network fragment to describe the probabilistic relations between the attributes of an object. These attributes can themselves be objects, providing a natural framework for encoding part-of hierarchies. Classes are used to provide a reusable probabilistic model which can be applied to multiple similar objects. Classes also support inheritance of model fragments from a class to a subclass, allowing the common aspects of related classes to be defined only once. Our language has clear declarative semantics: an OOBN can be interpreted as a stochastic functional program, so that it uniquely specifies a probabilistic model. We provide an inference algorithm for OOBNs, and show that much of the structural information encoded by an OOBN--particularly the encapsulation of variables within an object and the reuse of model fragments in different contexts--can also be used to speed up the inference process.
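As a rough illustration of the ideas in this abstract (a class carrying a reusable network fragment, a subclass that inherits it, and an attribute that is itself an object), the following Python sketch may help. It is not the OOBN language or the paper's inference algorithm. The class names `Car`, `OldCar`, and `Trip`, and all probability values, are hypothetical and chosen only for illustration.

```python
class Car:
    """Hypothetical class holding a tiny network fragment: battery_ok -> starts."""
    p_battery_ok = 0.9                           # P(battery_ok = True), assumed value
    p_starts_given = {True: 0.95, False: 0.05}   # P(starts = True | battery_ok), assumed

    @classmethod
    def p_starts(cls):
        """Marginal P(starts = True), obtained by summing out battery_ok."""
        p_b = cls.p_battery_ok
        return p_b * cls.p_starts_given[True] + (1 - p_b) * cls.p_starts_given[False]


class OldCar(Car):
    """Subclass: inherits the fragment and redefines only the battery prior."""
    p_battery_ok = 0.6


class Trip:
    """An object whose attribute `car` is itself an object (part-of hierarchy)."""
    p_on_time_given_starts = {True: 0.8, False: 0.0}  # assumed values

    def __init__(self, car_class):
        self.car_class = car_class

    def p_on_time(self):
        """Uses only the car's output P(starts); its internals stay encapsulated."""
        p_s = self.car_class.p_starts()
        table = self.p_on_time_given_starts
        return p_s * table[True] + (1 - p_s) * table[False]
```

In this toy, `Trip` never looks inside the car's fragment. It only consumes the marginal over `starts`, which loosely mirrors how encapsulation lets a result computed for one object be reused in different contexts.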
Class Algebra for Ontology Reasoning
Buehrer, Daniel, Lee, Chee-Hwa
Class algebra provides a natural framework for sharing of ISA hierarchies between users that may be unaware of each other's definitions. This permits data from relational databases, object-oriented databases, and tagged XML documents to be unioned into one distributed ontology, sharable by all users without the need for prior negotiation or the development of a "standard" ontology for each field. Moreover, class algebra produces a functional correspondence between a class's class algebraic definition (i.e. its "intent") and the set of all instances which satisfy the expression (i.e. its "extent"). The framework thus provides assistance in quickly locating examples and counterexamples of various definitions. This kind of information is very valuable when developing models of the real world, and serves as an invaluable tool assisting in the proof of theorems concerning these class algebra expressions. Finally, the relative frequencies of objects in the ISA hierarchy can produce a useful Boolean algebra of probabilities. The probabilities can be used by traditional information-theoretic classification methodologies to obtain optimal ways of classifying objects in the database.
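The intent/extent correspondence described above can be sketched in a few lines of Python, assuming a class definition (intent) is modeled as a predicate and its extent as the set of instances that satisfy it. The instance data, attribute names, and helper functions below are hypothetical and are not taken from the paper. Relative frequency is used as a simple stand-in for the probabilities the abstract mentions.

```python
# Hypothetical toy database of instances; names and attributes are illustrative only.
INSTANCES = {
    "tweety": {"kind": "bird", "flies": True},
    "pingu": {"kind": "bird", "flies": False},
    "rex": {"kind": "dog", "flies": False},
    "dumbo": {"kind": "elephant", "flies": True},
}


def extent(intent):
    """Return the set of instance names that satisfy the predicate `intent`."""
    return {name for name, attrs in INSTANCES.items() if intent(attrs)}


def probability(intent):
    """Relative frequency of the intent's extent among all instances."""
    return len(extent(intent)) / len(INSTANCES)


# Intents are plain predicates; Boolean combinators build new intents from old ones.
def is_bird(attrs):
    return attrs["kind"] == "bird"


def flies(attrs):
    return attrs["flies"]


def both(p, q):
    return lambda attrs: p(attrs) and q(attrs)


def either(p, q):
    return lambda attrs: p(attrs) or q(attrs)


def negate(p):
    return lambda attrs: not p(attrs)


# Examples and counterexamples of the claim "birds fly".
examples = extent(both(is_bird, flies))
counterexamples = extent(both(is_bird, negate(flies)))
```

Conjunction of intents corresponds to intersection of extents, so `extent(both(is_bird, flies))` equals `extent(is_bird) & extent(flies)`, and the relative frequencies obey the usual inclusion-exclusion rule.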
SPOOK: A System for Probabilistic Object-Oriented Knowledge Representation
Pfeffer, Avi, Koller, Daphne, Milch, Brian, Takusagawa, Ken T.
In previous work, we pointed out the limitations of standard Bayesian networks as a modeling framework for large, complex domains. We proposed a new, richly structured modeling language, Object-Oriented Bayesian Networks (OOBNs), that we argued would be able to deal with such domains. However, it turns out that OOBNs are not expressive enough to model many interesting aspects of complex domains: the existence of specific named objects, arbitrary relations between objects, and uncertainty over domain structure. These aspects are crucial in real-world domains such as battlefield awareness. In this paper, we present SPOOK, an implemented system that addresses these limitations. SPOOK implements a more expressive language that allows it to represent the battlespace domain naturally and compactly. We present a new inference algorithm that utilizes the model structure in a fundamental way, and show empirically that it achieves orders of magnitude speedup over existing approaches.
Integration of knowledge to support automatic object reconstruction from images and 3D data
Boochs, Frank, Marbs, Andreas, Truong, Hung, Hmida, Helmi Ben, Karmacharya, Ashish, Cruz, Christophe, Habed, Adlane, Voisin, Yvon, Nicolle, Christophe
Object reconstruction is an important task in many fields of application, as it generates digital representations of our physical world that serve as a basis for analysis, planning, construction, visualization, or other aims. A reconstruction is normally based on reliable data (for example, images or 3D point clouds) that capture the object in its complete extent. This data then has to be compiled and analyzed in order to extract all the necessary geometrical elements, which represent the object and form a digital copy of it. Traditional strategies are largely based on manual interaction and interpretation, because as objects grow more complex, human understanding becomes indispensable for achieving acceptable and reliable results. But human interaction is time-consuming and expensive, which is why much research has already been invested in algorithmic support that speeds up the process and reduces the manual workload. Presently, most such supporting algorithms are data-driven and concentrate on specific features of the objects that are accessible to numerical models. By means of these models, which normally represent geometrical features (flatness or roughness, for example) or physical features (color, texture), the data is classified and analyzed. This is successful for objects of low complexity but reaches its limits as object complexity increases, at which point purely numerical strategies cannot sufficiently model reality. Therefore, our approach takes human cognitive strategy as an example and simulates extraction processes based on available human-defined knowledge about the objects of interest. Such processes introduce a semantic structure for the objects and guide the algorithms used to detect and recognize objects, which should yield higher effectiveness. Hence, our research proposes an approach that uses knowledge to guide the algorithms in 3D point cloud and image processing.
Semantic Kernel Forests from Multiple Taxonomies
Hwang, Sung Ju, Grauman, Kristen, Sha, Fei
When learning features for complex visual recognition problems, labeled image exemplars alone can be insufficient. While an object taxonomy specifying the categories' semantic relationships could bolster the learning process, not all relationships are relevant to a given visual classification task, nor does a single taxonomy capture all ties that are relevant. In light of these issues, we propose a discriminative feature learning approach that leverages multiple hierarchical taxonomies representing different semantic views of the object categories (e.g., for animal classes, one taxonomy could reflect their phylogenic ties, while another could reflect their habitats). For each taxonomy, we first learn a tree of semantic kernels, where each node has a Mahalanobis kernel optimized to distinguish between the classes in its children nodes. Then, using the resulting semantic kernel forest, we learn class-specific kernel combinations to select only those relationships relevant to recognize each object class. To learn the weights, we introduce a novel hierarchical regularization term that further exploits the taxonomies' structure. We demonstrate our method on challenging object recognition datasets, and show that interleaving multiple taxonomic views yields significant accuracy improvements.
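To make the two building blocks named in this abstract more concrete, here is a minimal Python sketch of a Mahalanobis kernel and a weighted combination of several such kernels. It assumes one common kernel form, k(x, y) = exp(-(x - y)^T M (x - y)) with M positive semidefinite. The abstract does not give the exact form, the learning procedure, or the regularizer, so none of those are reproduced here. The function names and the metrics and weights passed to them are hypothetical.

```python
import math


def mahalanobis_kernel(x, y, M):
    """Assumed form: exp(-(x - y)^T M (x - y)), with M a PSD matrix (list of rows)."""
    d = [a - b for a, b in zip(x, y)]
    Md = [sum(M[i][j] * d[j] for j in range(len(d))) for i in range(len(d))]
    dist2 = sum(d[i] * Md[i] for i in range(len(d)))
    return math.exp(-dist2)


def combined_kernel(x, y, metrics, weights):
    """Weighted sum of per-node kernels; nonnegative weights keep it a valid kernel."""
    return sum(w * mahalanobis_kernel(x, y, M) for M, w in zip(metrics, weights))


# Hypothetical metrics standing in for two taxonomy nodes, plus class-specific weights.
identity = [[1.0, 0.0], [0.0, 1.0]]
stretched = [[4.0, 0.0], [0.0, 0.25]]
similarity = combined_kernel([0.0, 0.0], [1.0, 0.0], [identity, stretched], [0.5, 0.5])
```

In the setting the abstract describes, each metric would be learned per taxonomy node and the weights per object class. Here both are fixed by hand purely to show how the pieces compose.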
3D Object Detection and Viewpoint Estimation with a Deformable 3D Cuboid Model
Fidler, Sanja, Dickinson, Sven, Urtasun, Raquel
This paper addresses the problem of category-level 3D object detection. Given a monocular image, our aim is to localize the objects in 3D by enclosing them with tight oriented 3D bounding boxes. We propose a novel approach that extends the well-acclaimed deformable part-based model [Felz.] to reason in 3D. Our model represents an object class as a deformable 3D cuboid composed of faces and parts, which are both allowed to deform with respect to their anchors on the 3D box. We model the appearance of each face in fronto-parallel coordinates, thus effectively factoring out the appearance variation induced by viewpoint. Our model reasons about face visibility patterns called aspects. We train the cuboid model jointly and discriminatively and share weights across all aspects to attain efficiency. Inference then entails sliding and rotating the box in 3D and scoring object hypotheses. While for inference we discretize the search space, the variables are continuous in our model. We demonstrate the effectiveness of our approach in indoor and outdoor scenarios, and show that our approach outperforms the state-of-the-art in both 2D [Felz09] and 3D object detection [Hedau12].
Localizing 3D cuboids in single-view images
Xiao, Jianxiong, Russell, Bryan, Torralba, Antonio
In this paper we seek to detect rectangular cuboids and localize their corners in uncalibrated single-view images depicting everyday scenes. In contrast to recent approaches that rely on detecting vanishing points of the scene and grouping line segments to form cuboids, we build a discriminative parts-based detector that models the appearance of the cuboid corners and internal edges while enforcing consistency to a 3D cuboid model. Our model is invariant to the different 3D viewpoints and aspect ratios and is able to detect cuboids across many different object categories. We introduce a database of images with cuboid annotations that spans a variety of indoor and outdoor scenes and show qualitative and quantitative results on our collected database. Our model out-performs baseline detectors that use 2D constraints alone on the task of localizing cuboid corners.
Contextually Guided Semantic Labeling and Search for 3D Point Clouds
Anand, Abhishek, Koppula, Hema Swetha, Joachims, Thorsten, Saxena, Ashutosh
RGB-D cameras, which give an RGB image together with depths, are becoming increasingly popular for robotic perception. In this paper, we address the task of detecting commonly found objects in the 3D point cloud of indoor scenes obtained from such cameras. Our method uses a graphical model that captures various features and contextual relations, including the local visual appearance and shape cues, object co-occurrence relationships and geometric relationships. With a large number of object classes and relations, the model's parsimony becomes important and we address that by using multiple types of edge potentials. We train the model using a maximum-margin learning approach. In our experiments over a total of 52 3D scenes of homes and offices (composed from about 550 views), we get a performance of 84.06% and 73.38% in labeling office and home scenes respectively for 17 object classes each. We also present a method for a robot to search for an object using the learned model and the contextual information available from the current labelings of the scene. We applied this algorithm successfully on a mobile robot for the task of finding 12 object classes in 10 different offices and achieved a precision of 97.56% with 78.43% recall.
Unsupervised Learning in Hybrid Cognitive Architectures
Vinokurov, Yury (Carnegie Mellon University) | Lebiere, Christian (Carnegie Mellon University) | Wyatte, Dean (University of Colorado, Boulder) | Herd, Seth (University of Colorado, Boulder) | O'Reilly, Randall (University of Colorado, Boulder)
We present a model of unsupervised learning in the hybrid SAL (Synthesis of ACT-R and Leabra) architecture. This model follows the hypothesis that higher evaluative cognitive mechanisms can serve to provide training signals for perceptual learning. This addresses the problem that supervised learning seems necessary for strong perceptual performance, but explicit feedback is rare in the real world and difficult to provide for artificial learning systems. The hybrid model couples the perceptual strengths of Leabra with ACT-R's cognitive mechanisms, specifically its declarative memory, to evolve its own symbolic representations of objects encountered in the world. This is accomplished by presenting the objects to the Leabra visual system and committing the resulting representation to ACT-R's declarative memory. Subsequent presentations are either recalled as instances of a previous object category, in which case the positive association with the representation is rehearsed by Leabra, or they cause ACT-R to generate new category labels, which are also subject to the same rehearsal. The rehearsals drive the network's representations to convergence for a given category; at the same time, rehearsals on the ACT-R side reinforce the chunks that encode the associations between representation and label. In this way, the hybrid model bootstraps itself into learning new categories and their associated features; this framework provides a potential approach to solving the symbol grounding problem. We outline the operations of the hybrid model, evaluate its performance on the CU3D-100 (cu3d.colorado.edu) image set, and discuss further potential improvements to the model, including the integration of motor functions as a way of providing an internal feedback signal to augment and guide a purely bottom-up unsupervised system.
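The recall-or-create loop described in this abstract can be caricatured in Python as follows. This is only a sketch under strong assumptions: neither ACT-R nor Leabra is modeled, declarative-memory retrieval is replaced by a cosine-similarity threshold, and rehearsal is replaced by moving a stored prototype toward the new representation. The class `CategoryMemory`, its `threshold` and `rate` parameters, and the generated label names are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


class CategoryMemory:
    """Toy stand-in for a label memory: one stored prototype vector per label."""

    def __init__(self, threshold=0.9, rate=0.5):
        self.prototypes = {}        # label -> prototype vector
        self.threshold = threshold  # assumed similarity needed to count as a recall
        self.rate = rate            # assumed step size for the "rehearsal" update

    def present(self, rep):
        """Recall the best-matching label, or create a new one; return the label."""
        best_label, best_sim = None, -1.0
        for label, proto in self.prototypes.items():
            sim = cosine(rep, proto)
            if sim > best_sim:
                best_label, best_sim = label, sim
        if best_label is None or best_sim < self.threshold:
            # No sufficiently similar category: generate a fresh label.
            best_label = f"category_{len(self.prototypes)}"
            self.prototypes[best_label] = list(rep)
        else:
            # Recalled: nudge the stored prototype toward the new representation.
            proto = self.prototypes[best_label]
            self.prototypes[best_label] = [
                p + self.rate * (r - p) for p, r in zip(proto, rep)
            ]
        return best_label
```

Presenting two similar vectors and then a dissimilar one yields one shared label for the first two and a new label for the third, which is the bootstrapping behavior the abstract outlines, in highly simplified form.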
Uncertain and Approximative Knowledge Representation to Reasoning on Classification with a Fuzzy Networks Based System
The approach described here makes it possible to use a fuzzy object-based representation of imprecise and uncertain knowledge. This representation is of great practical interest because it enables reasoning on classification with a system based on fuzzy semantic networks. For instance, the distinction between necessary, possible, and user classes makes it possible to take into account exceptions that may appear in a fuzzy knowledge base, and it facilitates the integration of users' objects into the base. We describe the theoretical aspects of the architecture of the whole experimental A.I. system we built to provide effective on-line assistance to users of new technological systems: understanding "how it works" and "how to complete tasks" from queries in fairly natural language. In our model, procedural semantic networks describe the knowledge of an "ideal" expert, while fuzzy sets describe the approximate and uncertain knowledge of novice users in fuzzy semantic networks, which intervene to match the fuzzy labels of a query with categories from our "ideal" expert.