
Junction Tree


XIMP: Cross Graph Inter-Message Passing for Molecular Property Prediction

Ehrlich, Anatol, Kummer, Lorenz, Voracek, Vojtech, Bause, Franka, Kriege, Nils M.

arXiv.org Machine Learning

Accurate molecular property prediction is central to drug discovery, yet graph neural networks often underperform in data-scarce regimes and fail to surpass traditional fingerprints. We introduce cross-graph inter-message passing (XIMP), which performs message passing both within and across multiple related graph representations. For small molecules, we combine the molecular graph with scaffold-aware junction trees and pharmacophore-encoding extended reduced graphs, integrating complementary abstractions. While prior work is limited either to a single abstraction or to non-iterative communication across graphs, XIMP supports an arbitrary number of abstractions and both direct and indirect communication between them in each layer. Across ten diverse molecular property prediction tasks, XIMP outperforms state-of-the-art baselines in most cases. Its interpretable abstractions act as an inductive bias that guides learning toward established chemical concepts, improving generalization in low-data settings.
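The cross-graph idea can be illustrated with a toy example (a minimal sketch: the matrices, the atom-to-cluster assignment, and the mean aggregation are our simplifications, not the authors' architecture). Both intra-graph and inter-graph message passing reduce to normalized matrix products:

```python
import numpy as np

def mp_layer(H, adj):
    # intra-graph step: mean-aggregate neighbor features within one graph
    deg = np.maximum(adj.sum(1, keepdims=True), 1)
    return (adj @ H) / deg

def cross_layer(H_src, assign):
    # inter-graph step: pass node features to the abstraction clusters they map to
    deg = np.maximum(assign.sum(0), 1)[:, None]
    return (assign.T @ H_src) / deg

# toy molecule: a 4-atom chain, abstracted into 2 junction-tree clusters
H_mol = np.eye(4)                                            # one-hot atom features
A_mol = np.array([[0, 1, 0, 0], [1, 0, 1, 0],
                  [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float)  # chain adjacency
assign = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)  # atom -> cluster

H_mol2 = mp_layer(H_mol, A_mol)      # messages within the molecular graph
H_tree = cross_layer(H_mol, assign)  # messages across graphs (atoms -> clusters)
```

Stacking such layers, with messages also flowing back from clusters to atoms, is the kind of iterative two-way communication the abstract describes.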


Leveraging Discrete Function Decomposability for Scientific Design

Bowden, James C., Levine, Sergey, Listgarten, Jennifer

arXiv.org Artificial Intelligence

In the era of AI-driven science and engineering, we often want to design discrete objects in silico according to user-specified properties. For example, we may wish to design a protein to bind its target, arrange components within a circuit to minimize latency, or find materials with certain properties. Given a property predictive model, in silico design typically involves training a generative model over the design space (e.g., protein sequence space) to concentrate on designs with the desired properties. Distributional optimization -- which can be formalized as an estimation of distribution algorithm or as reinforcement learning policy optimization -- finds the generative model that maximizes an objective function in expectation. Optimizing a distribution over discrete-valued designs is in general challenging because of the combinatorial nature of the design space. However, many property predictors in scientific applications are decomposable in the sense that they can be factorized over design variables in a way that could in principle enable more effective optimization. For example, amino acids at a catalytic site of a protein may only loosely interact with amino acids of the rest of the protein to achieve maximal catalytic activity. Current distributional optimization algorithms are unable to make use of such decomposability structure. Herein, we propose and demonstrate use of a new distributional optimization algorithm, Decomposition-Aware Distributional Optimization (DADO), that can leverage any decomposability defined by a junction tree on the design variables, to make optimization more efficient. At its core, DADO employs a soft-factorized "search distribution" -- a learned generative model -- for efficient navigation of the search space, invoking graph message-passing to coordinate optimization across linked factors.
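The benefit of junction-tree decomposability can be sketched with exact max-sum message passing on a tiny two-clique tree (an illustrative toy; DADO itself learns a soft-factorized generative model rather than running exact inference):

```python
import numpy as np

# decomposable objective f(x) = f1(x0, x1) + f2(x1, x2) over binary variables;
# junction tree: clique {x0,x1} -- separator {x1} -- clique {x1,x2}
f1 = np.array([[0., 2.], [1., 0.]])   # scores indexed by (x0, x1)
f2 = np.array([[0., 1.], [3., 0.]])   # scores indexed by (x1, x2)

# message over the separator: best achievable f1 for each value of x1
m = f1.max(axis=0)

# maximize in the receiving clique, then back-substitute x0
x1, x2 = np.unravel_index((f2 + m[:, None]).argmax(), f2.shape)
x0 = f1[:, x1].argmax()
best = f1[x0, x1] + f2[x1, x2]       # global maximum without enumerating all 8 points
```

The cost grows with clique sizes rather than with the full design space, which is exactly the structure a decomposability-aware optimizer can exploit.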


JTreeformer: Graph-Transformer via Latent-Diffusion Model for Molecular Generation

Shi, Ji, Xie, Chengxun, Li, Zhonghao, Zhang, Xinming, Zhang, Miao

arXiv.org Artificial Intelligence

The discovery of new molecules based on the original chemical molecule distributions is of great importance in medicine. The graph transformer, with its advantages in performance and scalability over traditional graph networks, has been widely explored in recent research on graph-structured data. However, current transformer-based graph decoders struggle to exploit graph information effectively, leveraging only sequences of nodes rather than the complex topological structure of molecular graphs. This paper builds a graph-transformer-based framework for molecular generation, which we call \textbf{JTreeformer} as it transforms graph generation into junction-tree generation. The encoder combines a GCN in parallel with multi-head attention. The decoder integrates a directed acyclic GCN into a graph-based Transformer and iteratively synthesizes the entire molecule by leveraging information from the partially constructed molecular structure at each step. In addition, a diffusion model is inserted into the latent space produced by the encoder to further improve the efficiency and effectiveness of sampling. The empirical results demonstrate that our novel framework outperforms existing molecule generation methods, thus offering a promising tool to advance drug discovery (https://anonymous.4open.science/r/JTreeformer-C74C).


Mode Estimation for High Dimensional Discrete Tree Graphical Models

Chen, Chao, Liu, Han, Metaxas, Dimitris, Zhao, Tianqi

Neural Information Processing Systems

This paper studies the following problem: given samples from a high-dimensional discrete distribution, we want to estimate the leading (δ, ρ)-modes of the underlying distribution. A point is defined to be a (δ, ρ)-mode if it is a local optimum of the density within a δ-neighborhood under metric ρ. As we increase the "scale" parameter δ, the neighborhood size increases and the total number of modes monotonically decreases. The sequence of (δ, ρ)-modes reveals intrinsic topographical information about the underlying distribution. Though the mode finding problem is generally intractable in high dimensions, this paper unveils that, if the distribution can be approximated well by a tree graphical model, mode characterization is significantly easier. An efficient algorithm with provable theoretical guarantees is proposed and applied to tasks such as data analysis and multiple predictions.
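The (δ, ρ)-mode definition is easy to make concrete for ρ = Hamming distance (a toy sketch with an explicit probability table; the paper's algorithm exploits tree structure instead of the brute-force enumeration below, which is what makes the high-dimensional case tractable):

```python
from itertools import combinations, product

def hamming_ball(x, delta, alphabet=(0, 1)):
    # all points within Hamming distance <= delta of x
    pts = set()
    for d in range(delta + 1):
        for idx in combinations(range(len(x)), d):
            for vals in product(alphabet, repeat=d):
                y = list(x)
                for i, v in zip(idx, vals):
                    y[i] = v
                pts.add(tuple(y))
    return pts

def is_mode(p, x, delta=1):
    # x is a (delta, Hamming)-mode if no point in its delta-ball has higher mass
    return all(p.get(y, 0.0) <= p[x] for y in hamming_ball(x, delta))

# toy distribution over {0,1}^3 with two delta=1 local modes
p = {(0, 0, 0): .30, (0, 0, 1): .05, (0, 1, 0): .05, (1, 0, 0): .05,
     (1, 1, 1): .25, (1, 1, 0): .10, (1, 0, 1): .10, (0, 1, 1): .10}
modes = [x for x in p if is_mode(p, x)]
```

Increasing δ shrinks the mode set: with δ = 3 only the global maximum (0, 0, 0) survives.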



Review for NeurIPS paper: Factor Graph Grammars

Neural Information Processing Systems

Clarity: The paper is fairly dense because of the unfortunate 8-page limit, but well and carefully written. I think the most confusing part for readers is likely to be the conjunction operation -- if there's an extra page in the camera-ready, the presentation here should be slowed down with some qualitative discussion. You should probably clarify early on that you're talking about undirected hypergraphs. Notation in section 2.1: I regard 52-53 as a commutation property, basically vertices(\bar{e}) = \bar{vertices(e)}, where \bar lifts from variables or variable-tuples to their labels. I don't understand where the name "att" comes from ("attachment"?) or why you use the name "type" in the way you do.


SeaDAG: Semi-autoregressive Diffusion for Conditional Directed Acyclic Graph Generation

Zhou, Xinyi, Li, Xing, Lian, Yingzhao, Wang, Yiwen, Chen, Lei, Yuan, Mingxuan, Hao, Jianye, Chen, Guangyong, Heng, Pheng Ann

arXiv.org Artificial Intelligence

We introduce SeaDAG, a semi-autoregressive diffusion model for conditional generation of Directed Acyclic Graphs (DAGs). Considering their inherent layer-wise structure, we simulate layer-wise autoregressive generation by designing different denoising speeds for different layers. Unlike conventional autoregressive generation, which lacks a global view of the graph structure, our method maintains a complete graph structure at each diffusion step, enabling operations such as property control that require the full graph. Leveraging this capability, we evaluate the DAG properties during training with a graph property decoder. We explicitly train the model to learn graph conditioning with a condition loss, which enhances the diffusion model's capacity to generate graphs that are both realistic and aligned with specified properties. We evaluate our method on two representative conditional DAG generation tasks: (1) circuit generation from truth tables, where precise DAG structures are crucial for realizing circuit functionality, and (2) molecule generation based on quantum properties. Our approach demonstrates promising results, generating high-quality and realistic DAGs that closely align with given conditions.
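Layer-wise denoising speeds can be illustrated with a simple shifted-window schedule (the linear schedule and its parameters below are our invention for illustration, not the paper's actual design):

```python
def progress(t, l, T=10, L=4):
    # fraction of denoising completed for layer l at diffusion step t:
    # each layer's window starts later than its predecessor's, so shallow
    # layers reach their clean state before deep ones, giving the
    # semi-autoregressive, layer-by-layer reveal of the DAG
    start = l * T / (2 * L)
    width = T / 2
    return min(max((t - start) / width, 0.0), 1.0)
```

At any step t, every layer has some (possibly noisy) state, so the full graph structure is always available for property evaluation.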


Generalized Naive Bayes

Kovács, Edith Alice, Ország, Anna, Pfeifer, Dániel, Benczúr, András

arXiv.org Machine Learning

In this paper we introduce the so-called Generalized Naive Bayes structure as an extension of the Naive Bayes structure. We give a new greedy algorithm that finds a well-fitting Generalized Naive Bayes (GNB) probability distribution. We prove that this fits the data at least as well as the probability distribution determined by the classical Naive Bayes (NB) structure. Then, under a mild condition, we give a second algorithm for which we can prove that it finds the optimal GNB probability distribution, i.e., the best-fitting structure in the sense of KL divergence. Both algorithms are constructed to maximize the information content and minimize redundancy. Based on these algorithms, new methods for feature selection are introduced. We discuss the similarities and differences to other related algorithms in terms of structure, methodology, and complexity. Experimental results show that the algorithms introduced outperform the related algorithms in many cases.
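A classical building block behind such tree-structured density fits is the Chow-Liu idea: estimate pairwise mutual information and keep a maximum-weight spanning tree. The sketch below is that classical baseline on synthetic binary data (our simplification; the GNB algorithms themselves use a different, richer structure):

```python
import numpy as np
from itertools import combinations

def mutual_info(x, y):
    # empirical mutual information between two discrete columns
    mi = 0.0
    for a in np.unique(x):
        for b in np.unique(y):
            pxy = np.mean((x == a) & (y == b))
            if pxy > 0:
                mi += pxy * np.log(pxy / (np.mean(x == a) * np.mean(y == b)))
    return mi

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 4))
X[:, 1] = X[:, 0]                     # plant one strong dependency

# Kruskal-style maximum-spanning tree over pairwise MI
edges = sorted(combinations(range(4), 2),
               key=lambda e: -mutual_info(X[:, e[0]], X[:, e[1]]))
parent = list(range(4))
def find(i):
    while parent[i] != i:
        i = parent[i]
    return i
tree = []
for u, v in edges:
    ru, rv = find(u), find(v)
    if ru != rv:
        parent[ru] = rv
        tree.append((u, v))
```

The planted dependency between columns 0 and 1 dominates the MI ranking, so that edge is always recovered first.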


The diameter of a stochastic matrix: A new measure for sensitivity analysis in Bayesian networks

Leonelli, Manuele, Smith, Jim Q., Wright, Sophia K.

arXiv.org Artificial Intelligence

The use of Bayesian networks (BNs) as a decision support tool in business and OR has been increasing over the years, including case studies in project management (van Dorp, 2020), supply chain (Garvey et al., 2015), marketing (Hosseini, 2021), and logistics (Qazi, 2022), among others. BNs are defined by two components: a directed acyclic graph (DAG), where each node is a variable of interest and edges represent the, possibly causal, relationships between them; and a conditional probability table (CPT) for each node of the DAG, reporting the probability distribution of the associated variable conditional on its parents. BNs are highly interpretable due to their graphical nature: representing the probabilistic relationships between variables makes it easy for users to understand and trace the influence of one variable on another. With explainability now recognized as critical for the use of AI in applied research (Rudin, 2019), including in OR (De Bock et al., 2023), BNs stand out by providing transparent and intuitive explanations, thereby enhancing trust and clarity in decision-making processes. The underlying DAG and the associated CPTs can be learned from data using machine learning algorithms or elicited from experts' opinions and knowledge. There is now a vast number of algorithms for learning BNs from data.
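The two components can be made concrete with a two-node toy BN (illustrative numbers, not from the paper): a DAG Rain → WetGrass and one CPT per node, combined by the chain rule:

```python
# toy BN: Rain -> WetGrass, one CPT per node (illustrative numbers)
p_rain = {True: 0.2, False: 0.8}
p_wet_given_rain = {True:  {True: 0.9, False: 0.1},
                    False: {True: 0.2, False: 0.8}}

def joint(rain, wet):
    # chain rule along the DAG: P(rain, wet) = P(rain) * P(wet | rain)
    return p_rain[rain] * p_wet_given_rain[rain][wet]

# query a marginal by summing out the parent
p_wet = sum(joint(r, True) for r in (True, False))
```

Tracing how a change in a CPT entry propagates to such query probabilities is exactly the object of sensitivity analysis.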


Leveraging Active Subspaces to Capture Epistemic Model Uncertainty in Deep Generative Models for Molecular Design

Abeer, A N M Nafiz, Jantre, Sanket, Urban, Nathan M, Yoon, Byung-Jun

arXiv.org Machine Learning

Deep generative models have been accelerating the inverse design process in material and drug design. Unlike their counterpart property predictors in typical molecular design frameworks, generative molecular design models have seen fewer efforts on uncertainty quantification (UQ) due to the computational challenges that their large number of parameters poses for Bayesian inference. In this work, we focus on the junction-tree variational autoencoder (JT-VAE), a popular model for generative molecular design, and address this issue by leveraging a low-dimensional active subspace (AS) to capture the uncertainty in the model parameters. Specifically, we approximate the posterior distribution over the active subspace parameters to estimate the epistemic model uncertainty in an extremely high-dimensional parameter space. The proposed UQ scheme does not require alteration of the model architecture, making it readily applicable to any pre-trained model. Our experiments demonstrate the efficacy of the AS-based UQ and its potential impact on molecular optimization by exploring model diversity under epistemic uncertainty.
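The active-subspace construction itself can be sketched in a few lines: estimate the gradient outer-product matrix and keep its dominant eigenvectors (a toy quadratic objective stands in for the model; in practice, JT-VAE loss gradients would replace grad_f):

```python
import numpy as np

rng = np.random.default_rng(1)

def grad_f(theta):
    # toy objective f(theta) = (theta[0] + theta[1])**2:
    # its gradient lives entirely in a 1-D subspace of the 10-D parameter space
    g = np.zeros_like(theta)
    g[:2] = 2 * (theta[0] + theta[1])
    return g

thetas = rng.normal(size=(500, 10))          # samples in the full parameter space
G = np.stack([grad_f(t) for t in thetas])
C = G.T @ G / len(G)                         # gradient outer-product matrix E[g g^T]
eigvals, eigvecs = np.linalg.eigh(C)         # ascending eigenvalues
W1 = eigvecs[:, -1:]                         # dominant direction: the active subspace
```

Posterior inference is then carried out only along W1 instead of over all parameters, which is what makes UQ tractable for large pre-trained models.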