AITopics | Stojanov, Petar


Stojanov, Petar


Gene Regulatory Network Inference in the Presence of Selection Bias and Latent Confounders

Luo, Gongxu, Dai, Haoyue, Sun, Boyang, Li, Loka, Huang, Biwei, Stojanov, Petar, Zhang, Kun

arXiv.org Artificial Intelligence, Jan-17-2025

Gene Regulatory Network Inference (GRNI) aims to identify causal relationships among genes using gene expression data, providing insights into regulatory mechanisms. A significant yet often overlooked challenge is selection bias, a process in which only cells meeting specific criteria, such as gene expression thresholds, survive or are observed, distorting the true joint distribution of genes and thus biasing GRNI results. Furthermore, gene expression is influenced by latent confounders, such as non-coding RNAs, which add complexity to GRNI. To address these challenges, we propose GISL (Gene Regulatory Network Inference in the presence of Selection bias and Latent confounders), a novel algorithm to infer true regulatory relationships in the presence of selection and confounding issues. Leveraging data obtained via multiple gene perturbation experiments, we show that the true regulatory relationships, as well as selection processes and latent confounders, can be partially identified without strong parametric models and under mild graphical assumptions. Experimental results on both synthetic and real-world single-cell gene expression datasets demonstrate the superiority of GISL over existing methods.
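The distortion caused by selection can be illustrated with a toy simulation. The sketch below is not GISL; it only shows, under assumed Gaussian data and a hypothetical survival rule (`gene_a + gene_b > 1.0`), how observing only the cells that meet a threshold can make two independent genes look dependent. All names and parameters are illustrative assumptions.

```python
import numpy as np

# Toy setup (assumption): two genes with independent expression levels.
rng = np.random.default_rng(0)
n_cells = 20000
gene_a = rng.normal(size=n_cells)
gene_b = rng.normal(size=n_cells)

# Hypothetical selection rule: a cell is observed only if the combined
# expression of the two genes exceeds a threshold.
selected = (gene_a + gene_b) > 1.0

# Correlation in the full population versus in the observed cells only.
corr_all = np.corrcoef(gene_a, gene_b)[0, 1]
corr_selected = np.corrcoef(gene_a[selected], gene_b[selected])[0, 1]

print(f"all cells:      corr = {corr_all:+.3f}")
print(f"selected cells: corr = {corr_selected:+.3f}")
```

In the full population the correlation stays close to zero, while among the selected cells it becomes clearly negative, even though neither gene regulates the other.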

artificial intelligence, latent confounder, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2501.10124

Country: North America > United States > California (0.14)

Genre:

Research Report > Experimental Study (0.67)
Research Report > New Finding (0.46)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


Towards Understanding Extrapolation: a Causal Lens

Kong, Lingjing, Chen, Guangyi, Stojanov, Petar, Li, Haoxuan, Xing, Eric P., Zhang, Kun

arXiv.org Machine Learning, Jan-15-2025

Practical scenarios often involve only a handful of target samples, potentially lying outside the training support, which requires the capability of extrapolation. In this work, we aim to provide a theoretical understanding of when extrapolation is possible and offer principled methods to achieve it without requiring an on-support target distribution. To this end, we formulate the extrapolation problem with a latent-variable model that embodies the minimal change principle in causal mechanisms. Under this formulation, we cast the extrapolation problem into a latent-variable identification problem. We provide realistic conditions on shift properties and the estimation objectives that lead to identification even when only one off-support target sample is available, tackling the most challenging scenarios. Our theory reveals the intricate interplay between the underlying manifold's smoothness and the shift properties. We showcase how our theoretical results inform the design of practical adaptation algorithms.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Machine Learning

2501.09163

Country: North America > United States (1.00)

Genre: Research Report (0.82)

Industry: Government > Regional Government > North America Government > United States Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


Causal Representation Learning from Multimodal Biological Observations

Sun, Yuewen, Kong, Lingjing, Chen, Guangyi, Li, Loka, Luo, Gongxu, Li, Zijian, Zhang, Yixuan, Zheng, Yujia, Yang, Mengyue, Stojanov, Petar, Segal, Eran, Xing, Eric P., Zhang, Kun

arXiv.org Artificial Intelligence, Nov-10-2024

Prevalent in biological applications (e.g., human phenotype measurements), multimodal datasets can provide valuable insights into the underlying biological mechanisms. However, current machine learning models designed to analyze such datasets still lack interpretability and theoretical guarantees, which are essential to biological applications. Recent advances in causal representation learning have shown promise in uncovering the interpretable latent causal variables with formal theoretical certificates. Unfortunately, existing works for multimodal distributions either rely on restrictive parametric assumptions or provide rather coarse identification results, limiting their applicability to biological research, which favors a detailed understanding of the mechanisms. In this work, we aim to develop flexible identification conditions for multimodal data and principled methods to facilitate the understanding of biological datasets. Theoretically, we consider a flexible nonparametric latent distribution (cf. parametric assumptions in prior work) permitting causal relationships across potentially different modalities. We establish identifiability guarantees for each latent component, extending the subspace identification results from prior work. Our key theoretical ingredient is the structural sparsity of the causal connections among distinct modalities, which, as we will discuss, is natural for a large collection of biological systems. Empirically, we propose a practical framework to instantiate our theoretical insights. We demonstrate the effectiveness of our approach through extensive experiments on both numerical and synthetic datasets. Results on a real-world human phenotype dataset are consistent with established medical research, validating our theoretical and methodological framework.

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2411.06518

Country: Asia (0.14)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Consumer Health (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)


Gene Regulatory Network Inference in the Presence of Dropouts: a Causal View

Dai, Haoyue, Ng, Ignavier, Luo, Gongxu, Spirtes, Peter, Stojanov, Petar, Zhang, Kun

arXiv.org Artificial Intelligence, Mar-21-2024

Gene regulatory network inference (GRNI) is a challenging problem, particularly owing to the presence of zeros in single-cell RNA sequencing data: some are biological zeros representing no gene expression, while others are technical zeros arising from the sequencing procedure (also known as dropouts), which may bias GRNI by distorting the joint distribution of the measured gene expressions. Existing approaches typically handle dropout error via imputation, which may introduce spurious relations as the true joint distribution is generally unidentifiable. To tackle this issue, we introduce a causal graphical model to characterize the dropout mechanism, namely, the Causal Dropout Model. We provide a simple yet effective theoretical result: interestingly, the conditional independence (CI) relations in the data with dropouts, after deleting the samples with zero values (regardless of whether they are technical) for the conditioned variables, are asymptotically identical to the CI relations in the original data without dropouts. This particular test-wise deletion procedure, in which we perform CI tests on the samples without zeros for the conditioned variables, can be seamlessly integrated with existing structure learning approaches including constraint-based and greedy score-based methods, thus giving rise to a principled framework for GRNI in the presence of dropouts. We further show that the causal dropout model can be validated from data, and many existing statistical models to handle dropouts fit into our model as specific parametric instances. Empirical evaluation on synthetic, curated, and real-world experimental transcriptomic data comprehensively demonstrates the efficacy of our method.

artificial intelligence, dropout mechanism, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2403.155

Country: North America > United States (0.45)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Hematology (0.67)
Government > Regional Government > North America Government > United States Government (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)


Partial Identifiability for Domain Adaptation

Kong, Lingjing, Xie, Shaoan, Yao, Weiran, Zheng, Yujia, Chen, Guangyi, Stojanov, Petar, Akinwande, Victor, Zhang, Kun

arXiv.org Artificial Intelligence, Jun-10-2023

Unsupervised domain adaptation is critical to many real-world applications where label information is unavailable in the target domain. In general, without further assumptions, the joint distribution of the features and the label is not identifiable in the target domain. To address this issue, we rely on the property of minimal changes of causal mechanisms across domains to minimize unnecessary influences of distribution shifts. To encode this property, we first formulate the data-generating process using a latent variable model with two partitioned latent subspaces: invariant components whose distributions stay the same across domains and sparse changing components that vary across domains. We further constrain the domain shift to have a restrictive influence on the changing components. Under mild conditions, we show that the latent variables are partially identifiable, from which it follows that the joint distribution of data and labels in the target domain is also identifiable. Given the theoretical insights, we propose a practical domain adaptation framework called iMSDA. Extensive experimental results reveal that iMSDA outperforms state-of-the-art domain adaptation algorithms on benchmark datasets, demonstrating the effectiveness of our framework.

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2306.0651

Country:

North America > United States > Maryland (0.14)
Europe > United Kingdom > Scotland (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
