 subobject


Intuitionistic $j$-Do-Calculus in Topos Causal Models

arXiv.org Artificial Intelligence

In this paper, we generalize Pearl's do-calculus to an intuitionistic setting called $j$-stable causal inference inside a topos of sheaves. Our framework elaborates the recently proposed Topos Causal Models (TCMs), where causal interventions are defined as subobjects. We generalize the original TCM setting using a Lawvere-Tierney topology on a topos, defined by a modal operator $j$ on the subobject classifier $Ω$. We introduce $j$-do-calculus, where we replace global truth with local truth defined by Kripke-Joyal semantics, and formalize causal reasoning as structure-preserving morphisms that are stable along $j$-covers. $j$-do-calculus is a sound rule system whose premises and conclusions are formulas of the internal intuitionistic logic of the causal topos. We define $j$-stability for conditional independences and interventional claims as local truth in this internal logic. We give three inference rules that mirror Pearl's insertion/deletion and action/observation exchange, and we prove their soundness in the Kripke-Joyal semantics. A companion paper in preparation will describe how to estimate the required entities from data and instantiate $j$-do with standard discovery procedures (e.g., score-based and constraint-based methods), and will include experimental results on how to (i) form data-driven $j$-covers (via regime/section constructions), (ii) compute chartwise conditional independences after graph surgeries, and (iii) glue them to certify the premises of the $j$-do rules in practice.
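
For orientation, the two classical ingredients this abstract builds on can be written out as below. This is a sketch of the standard textbook statements (the axioms of a Lawvere-Tierney topology and Pearl's three do-calculus rules), not the paper's $j$-stable versions. The notation ($\top$, $\wedge$, $G_{\overline{X}}$, $G_{\underline{Z}}$, $Z(W)$) follows common convention and is an assumption relative to the abstract.

```latex
% Lawvere-Tierney topology: a map j : \Omega \to \Omega satisfying
\begin{align*}
  j \circ \top &= \top                      && \text{(preserves truth)} \\
  j \circ j    &= j                         && \text{(idempotent)} \\
  j \circ \wedge &= \wedge \circ (j \times j) && \text{(commutes with conjunction)}
\end{align*}

% Pearl's classical do-calculus for a causal DAG G.
% G_{\overline{X}}: G with edges into X removed.
% G_{\underline{Z}}: G with edges out of Z removed.
\begin{align*}
  \text{Rule 1 (insert/delete observation):}\quad
    & P(y \mid do(x), z, w) = P(y \mid do(x), w)
    && \text{if } (Y \perp Z \mid X, W) \text{ in } G_{\overline{X}} \\
  \text{Rule 2 (action/observation exchange):}\quad
    & P(y \mid do(x), do(z), w) = P(y \mid do(x), z, w)
    && \text{if } (Y \perp Z \mid X, W) \text{ in } G_{\overline{X}\,\underline{Z}} \\
  \text{Rule 3 (insert/delete action):}\quad
    & P(y \mid do(x), do(z), w) = P(y \mid do(x), w)
    && \text{if } (Y \perp Z \mid X, W) \text{ in } G_{\overline{X}\,\overline{Z(W)}}
\end{align*}
% Z(W): the nodes of Z that are not ancestors of any node of W in G_{\overline{X}}.
```

In the abstract's setting, the graphical independence premises are read as locally true statements in the internal logic rather than as global facts about a single graph.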


Subobject-level Image Tokenization

arXiv.org Artificial Intelligence

Transformer-based vision models typically tokenize images into fixed-size square patches as input units, an approach that lacks adaptability to image content and overlooks the inherent pixel grouping structure. Inspired by the subword tokenization widely adopted in language models, we propose an image tokenizer at a subobject level, where the subobjects are represented by semantically meaningful image segments obtained by segmentation models (e.g., segment anything models). To implement a learning system based on subobject tokenization, we first introduce a Direct Segment Anything Model (DirectSAM) that efficiently produces comprehensive segmentation of subobjects, then embed the subobjects into compact latent vectors and feed them into a large language model for vision-language learning. Empirical results demonstrate that our subobject-level tokenization significantly facilitates efficient learning of translating images into object and attribute descriptions compared to traditional patch-level tokenization.
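
The contrast between patch-level and segment-level tokenization can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the function names `patch_tokens` and `segment_tokens` are hypothetical, the label map is written by hand rather than produced by DirectSAM or any segmentation model, and mean pooling stands in for the learned latent embedding the abstract describes.

```python
from collections import defaultdict
from typing import Dict, List

Image = List[List[float]]     # grayscale image as rows of pixel intensities
LabelMap = List[List[int]]    # same shape as the image; one segment id per pixel


def patch_tokens(image: Image, patch: int) -> List[float]:
    """One token per non-overlapping patch x patch square (row-major order).

    Each token is the mean intensity of its square, a stand-in for a learned
    patch embedding.
    """
    height, width = len(image), len(image[0])
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            pixels = [
                image[r][c]
                for r in range(top, min(top + patch, height))
                for c in range(left, min(left + patch, width))
            ]
            tokens.append(sum(pixels) / len(pixels))
    return tokens


def segment_tokens(image: Image, labels: LabelMap) -> Dict[int, float]:
    """One token per segment id: the mean intensity of that segment's pixels.

    The label map is assumed to come from some segmentation model; here it is
    simply passed in.
    """
    groups = defaultdict(list)
    for r, row in enumerate(labels):
        for c, seg_id in enumerate(row):
            groups[seg_id].append(image[r][c])
    return {seg_id: sum(v) / len(v) for seg_id, v in sorted(groups.items())}


# Toy 4x4 image: a bright triangular region on a dark background.
IMAGE: Image = [
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
]
# Hand-written segmentation: 0 = background, 1 = bright region.
LABELS: LabelMap = [[int(v) for v in row] for row in IMAGE]

if __name__ == "__main__":
    print(patch_tokens(IMAGE, 2))        # four tokens; two straddle the boundary
    print(segment_tokens(IMAGE, LABELS)) # two tokens, one per region
```

In this toy case the 2x2 patches that straddle the diagonal boundary mix background and foreground pixels, while the segment-level tokens follow the region boundary and need only two tokens for the whole image.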