 cig


Contrastive Integrated Gradients: A Feature Attribution-Based Method for Explaining Whole Slide Image Classification

arXiv.org Artificial Intelligence

Interpretability is essential in Whole Slide Image (WSI) analysis for computational pathology, where understanding model predictions helps build trust in AI-assisted diagnostics. While Integrated Gradients (IG) and related attribution methods have shown promise, applying them directly to WSIs introduces challenges due to their high-resolution nature. These methods capture model decision patterns but may overlook class-discriminative signals that are crucial for distinguishing between tumor subtypes. In this work, we introduce Contrastive Integrated Gradients (CIG), a novel attribution method that enhances interpretability by computing contrastive gradients in logit space. First, CIG highlights class-discriminative regions by comparing feature importance relative to a reference class, offering sharper differentiation between tumor and non-tumor areas. Second, CIG satisfies the axioms of integrated attribution, ensuring consistency and theoretical soundness. Third, we propose two attribution quality metrics, MIL-AIC and MIL-SIC, which measure how predictive information and model confidence evolve with access to salient regions, particularly under weak supervision.
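As a rough illustration of what "gradients in logit space relative to a reference class" can look like, the sketch below applies plain Integrated Gradients to the difference between a target logit and a reference logit. This is an assumption made for illustration, not the CIG formulation from the paper (which the abstract does not spell out); the toy model, its weights, and the names `logits`, `contrast`, and `integrated_gradients` are all hypothetical.

```python
import math

# Hypothetical toy 2-class model: logits = W @ tanh(x). Weights are made up.
W = [[1.0, -2.0, 0.5],
     [-0.5, 1.5, 1.0]]

def logits(x):
    hidden = [math.tanh(v) for v in x]
    return [sum(w * h for w, h in zip(row, hidden)) for row in W]

def contrast(x, target=0, reference=1):
    """Target logit minus reference logit (assumed contrastive score)."""
    z = logits(x)
    return z[target] - z[reference]

def numerical_gradient(f, x, eps=1e-5):
    """Central finite differences; a stand-in for autodiff."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad

def integrated_gradients(f, x, baseline, steps=200):
    """Midpoint Riemann-sum approximation of Integrated Gradients."""
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(numerical_gradient(f, point)):
            avg_grad[i] += g / steps
    return [(xi - b) * g for xi, b, g in zip(x, baseline, avg_grad)]

x = [0.8, -0.3, 1.2]
baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(contrast, x, baseline)

# Completeness axiom: attributions should sum to f(x) - f(baseline).
gap = abs(sum(attributions) - (contrast(x) - contrast(baseline)))
print(attributions, gap)
```

The final check reflects the completeness axiom of Integrated Gradients: up to integration error, the attributions add up to the change in the contrastive score between the baseline and the input.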


Adaptive Hierarchical Certification for Segmentation using Randomized Smoothing

arXiv.org Artificial Intelligence

Common certification methods operate on a flat pre-defined set of fine-grained classes. In this paper, however, we propose a novel, more general, and practical setting, namely adaptive hierarchical certification for image semantic segmentation. In this setting, the certification can be within a multi-level hierarchical label space composed of fine to coarse levels. Unlike classic methods where the certification would abstain for unstable components, our approach adaptively relaxes the certification to a coarser level within the hierarchy. This relaxation lowers the abstain rate whilst providing more certified semantically meaningful information. We mathematically formulate the problem setup and introduce, for the first time, an adaptive hierarchical certification algorithm for image semantic segmentation that certifies image pixels within a hierarchy, and we prove the correctness of its guarantees. Since certified accuracy does not take the loss of information into account when traversing into a coarser hierarchy level, we introduce a novel evaluation paradigm for adaptive hierarchical certification, namely the certified information gain metric, which is proportional to the class granularity level. Our evaluation experiments on real-world challenging datasets such as Cityscapes and ACDC demonstrate that our adaptive algorithm achieves a higher certified information gain and a lower abstain rate compared to the current state-of-the-art certification method, as well as other non-adaptive versions of it.
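The idea of relaxing to a coarser level instead of abstaining can be sketched for a single pixel as follows. This is a heavily simplified illustration, not the paper's algorithm: the two-level hierarchy `PARENT`, the function `certify_pixel`, and the plain vote-share threshold are all assumptions, and the threshold stands in for the statistical test that a randomized-smoothing certificate would actually use.

```python
from collections import Counter

# Hypothetical two-level hierarchy: fine class -> coarse parent class.
PARENT = {
    "car": "vehicle", "truck": "vehicle", "bus": "vehicle",
    "person": "human", "rider": "human",
}

def certify_pixel(noisy_votes, threshold=0.75):
    """Return a fine label, a coarse label, or None (abstain) for one pixel.

    noisy_votes: fine-level predictions for the same pixel under different
    noise samples. The vote-share threshold is an illustrative stand-in
    for a proper statistical test.
    """
    n = len(noisy_votes)
    fine_label, fine_count = Counter(noisy_votes).most_common(1)[0]
    if fine_count / n >= threshold:
        return fine_label  # stable at the fine level

    # Unstable at the fine level: pool votes by parent and try again.
    coarse_votes = Counter(PARENT[v] for v in noisy_votes)
    coarse_label, coarse_count = coarse_votes.most_common(1)[0]
    if coarse_count / n >= threshold:
        return coarse_label  # relaxed to the coarser level

    return None  # abstain

stable = certify_pixel(["car"] * 9 + ["truck"])
relaxed = certify_pixel(["car"] * 5 + ["truck"] * 4 + ["person"])
abstained = certify_pixel(["car"] * 5 + ["person"] * 5)
print(stable, relaxed, abstained)
```

In the second call the votes fluctuate between "car" and "truck", so a flat scheme would abstain, while the pooled "vehicle" vote is stable enough to return a coarser label.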


CIDR: A Cooperative Integrated Dynamic Refining Method for Minimal Feature Removal Problem

arXiv.org Artificial Intelligence

The minimal feature removal problem in the post-hoc explanation area aims to identify the minimal feature set (MFS). Prior studies using the greedy algorithm to calculate the minimal feature set lack the exploration of feature interactions under a monotonic assumption which cannot be satisfied in general scenarios. In order to address the above limitations, we propose a Cooperative Integrated Dynamic Refining method (CIDR) to efficiently discover minimal feature sets. Specifically, we design Cooperative Integrated Gradients (CIG) to detect interactions between features. By incorporating CIG and characteristics of the minimal feature set, we transform the minimal feature removal problem into a knapsack problem. Additionally, we devise an auxiliary Minimal Feature Refinement algorithm to determine the minimal feature set from numerous candidate sets. To the best of our knowledge, our work is the first to address the minimal feature removal problem in the field of natural language processing. Extensive experiments demonstrate that CIDR is capable of tracing representative minimal feature sets with improved interpretability across various models and datasets.


This Shanghai Factory Plans to Replace All of Its Human Workers - Motherboard

#artificialintelligence

Here and there, a few people press buttons, turn wrenches, operate handheld scanners, and fold boxes. If the Cambridge Industries Group factory in Shanghai, China seems a little empty, it's on purpose. With robots handling two thirds of the labor, the facility is one of the most automated--thus, worker-free--in the global electronics industry. This factory's on track to become 90 percent automated in coming years. As soon as the technology is available, it will be 100 percent automated, with machines totally replacing human beings. CIG's Shanghai plant offers a preview of a future many government officials and everyday people fear--and which economists warn is increasingly likely as industrial robots rapidly get better and cheaper.


Learning conditional independence structure for high-dimensional uncorrelated vector processes

arXiv.org Machine Learning

We formulate and analyze a graphical model selection method for inferring the conditional independence graph of a high-dimensional nonstationary Gaussian random process (time series) from a finite-length observation. The observed process samples are assumed to be uncorrelated over time and to have a time-varying marginal distribution. The selection method is based on testing conditional variances obtained for small subsets of process components. This makes it possible to cope with the high-dimensional regime, where the sample size can be (drastically) smaller than the process dimension. We characterize the sample size required for the proposed selection method to succeed with high probability.


Learning the Conditional Independence Structure of Stationary Time Series: A Multitask Learning Approach

arXiv.org Machine Learning

We consider a stationary discrete-time vector process or time series. Such a process could model, e.g., the time evolution of air pollutant concentrations [1], [2] or medical diagnostic data obtained in electrocorticography (ECoG) [3]. One specific way of representing the dependence structure of a vector process is via a graphical model [4], where the nodes of the graph represent the individual scalar process components, and the edges represent statistical relations between the individual process components. More precisely, the (undirected) edges of a conditional independence graph (CIG) associated with a process represent conditional independence statements about the process components [4], [1]. In particular, two nodes in the CIG are connected by an edge if and only if the two corresponding process components are conditionally dependent, given the remaining process components. Note that the so defined CIG for time series extends the basic notion of a CIG for random vectors by considering dependencies between entire time series instead of dependencies between scalar random variables [5], [6]. In this work, we investigate the problem of graphical model selection (GMS), i.e., that of inferring the CIG of a time series, given a finite-length observation. A. Jung is with the Institute of Telecommunications, Vienna University of Technology, 1040-Vienna, Austria email: ajung@nt.tuwien.ac.at.
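For the basic random-vector case mentioned above, the edge rule has a concrete form when the vector is Gaussian with a nonsingular covariance matrix: two components are conditionally independent given the rest exactly when the corresponding entry of the inverse covariance (precision) matrix is zero. The sketch below reads the CIG off a known covariance matrix; it does not cover the time-series setting of the paper, and the helper names `invert` and `cig_edges`, the tolerance, and the example covariance are assumptions chosen for illustration.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry into place.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def cig_edges(covariance, tol=1e-9):
    """Edges (i, j) with a nonzero off-diagonal precision entry (Gaussian case)."""
    precision = invert(covariance)
    p = len(precision)
    return [(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(precision[i][j]) > tol]

# Example covariance with entries rho ** |i - j|: every pair of components
# is correlated, yet the precision matrix is tridiagonal.
rho = 0.6
covariance = [[rho ** abs(i - j) for j in range(4)] for i in range(4)]
edges = cig_edges(covariance)
print(edges)
```

Although every pair of components in this example is correlated, only neighbouring components are joined by an edge, which is the difference between marginal dependence and conditional dependence given the remaining components.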


Graphical LASSO Based Model Selection for Time Series

arXiv.org Machine Learning

We propose a novel graphical model selection (GMS) scheme for high-dimensional stationary time series or discrete-time processes. The method is based on a natural generalization of the graphical LASSO (gLASSO), introduced originally for GMS based on i.i.d. samples, and estimates the conditional independence graph (CIG) of a time series from a finite-length observation. The gLASSO for time series is defined as the solution of an l1-regularized maximum (approximate) likelihood problem. We solve this optimization problem using the alternating direction method of multipliers (ADMM). Our approach is nonparametric as we do not assume a finite-dimensional (e.g., an autoregressive) parametric model for the observed process. Instead, we require the process to be sufficiently smooth in the spectral domain. For Gaussian processes, we characterize the performance of our method theoretically by deriving an upper bound on the probability that our algorithm fails to correctly identify the CIG. Numerical experiments demonstrate the ability of our method to recover the correct CIG from a limited number of samples.


Contextual Abductive Reasoning with Side-Effects

arXiv.org Artificial Intelligence

The belief bias effect is a phenomenon which occurs when we think that we judge an argument based on our reasoning, but are actually influenced by our beliefs and prior knowledge. Evans, Barston and Pollard carried out a psychological syllogistic reasoning task to prove this effect. Participants were asked whether they would accept or reject a given syllogism. We discuss one specific case which is commonly assumed to be believable but which is actually not logically valid. By introducing abnormalities, abduction and background knowledge, we adequately model this case under the weak completion semantics. Our formalization reveals new questions about possible extensions in abductive reasoning. For instance, observations and their explanations might include some relevant prior abductive contextual information concerning some side-effect or leading to a contestable or refutable side-effect. A weaker notion indicates the support of some relevant consequences by a prior abductive context. Yet another definition describes jointly supported relevant consequences, which captures the idea of two observations containing mutually supportive side-effects. Though motivated with and exemplified by the running psychology application, the various new general abductive context definitions are introduced here and given a declarative semantics for the first time, and have a much wider scope of application. Inspection points, a concept introduced by Pereira and Pinto, allow us to express these definitions syntactically and intertwine them into an operational semantics.


Compressive Nonparametric Graphical Model Selection For Time Series

arXiv.org Machine Learning

Here, h[m] is a nonnegative weight function that typically increases with m. The CIG of the process x[n] is the graph G := (V, E) with node set V = [p] := {1, ..., p} representing the scalar component processes {x [...] Abstract: We propose a method for inferring the conditional independence graph (CIG) of a high-dimensional discrete-time Gaussian vector random process from finite-length observations. Our approach does not rely on a parametric model (such as, e.g., an autoregressive model) for the vector random process; rather, it only assumes certain spectral smoothness properties. The proposed inference scheme is compressive in that it works for sample sizes that are (much) smaller than the number of scalar process components. We provide analytical conditions for our method to correctly identify the CIG with high probability.