Unlocking hidden biomolecular conformational landscapes in diffusion models at inference time

Richman, Daniel D., Karaguesian, Jessica, Suomivuori, Carl-Mikael, Dror, Ron O.

arXiv.org Artificial Intelligence

The function of biomolecules such as proteins depends on their ability to interconvert between a wide range of structures or "conformations." Researchers have endeavored for decades to develop computational methods to predict the distribution of conformations, which is far harder to determine experimentally than a static folded structure. We present ConforMix, an inference-time algorithm that enhances sampling of conformational distributions using a combination of classifier guidance, filtering, and free energy estimation. Our approach upgrades diffusion models -- whether trained for static structure prediction or conformational generation -- to enable more efficient discovery of conformational variability without requiring prior knowledge of major degrees of freedom. ConforMix is orthogonal to improvements in model pretraining and would benefit even a hypothetical model that perfectly reproduced the Boltzmann distribution. Remarkably, when applied to a diffusion model trained for static structure prediction, ConforMix captures structural changes including domain motion, cryptic pocket flexibility, and transporter cycling, while avoiding unphysical states. Case studies of biologically critical proteins demonstrate the scalability, accuracy, and utility of this method.
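To make the inference-time machinery concrete, here is a minimal sketch of generic classifier-guided reverse diffusion followed by a plausibility filter, in the spirit of the guidance-plus-filtering combination the abstract describes. The function names, the guidance scale, and the update rule are illustrative assumptions, not the ConforMix algorithm itself.

```python
def guided_reverse_step(x_t, t, denoiser, classifier_grad, guidance_scale=1.0):
    """One reverse-diffusion step with classifier guidance (illustrative sketch).

    `denoiser(x_t, t)` stands in for the diffusion model's noise prediction and
    `classifier_grad(x_t, t)` for the gradient of log p(feature | x_t) that
    steers sampling toward a conformational feature of interest.
    """
    eps = denoiser(x_t, t)
    # Bias the predicted noise along the classifier gradient (standard classifier guidance).
    eps_guided = eps - guidance_scale * classifier_grad(x_t, t)
    # Placeholder update; a real sampler would apply the model's noise schedule here.
    return x_t - eps_guided


def filter_conformations(samples, is_physical):
    """Keep only generated structures that pass a plausibility check (e.g., no steric clashes)."""
    return [x for x in samples if is_physical(x)]
```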


Decoding the dark proteome: Deep learning-enabled discovery of druggable enzymes in Wuchereria bancrofti

Shivakumar, Shawnak, Hernandez, Jefferson

arXiv.org Artificial Intelligence

Wuchereria bancrofti, the parasitic roundworm responsible for lymphatic filariasis, permanently disables over 36 million people and places 657 million at risk across 39 countries. A major bottleneck for drug discovery is the lack of functional annotation for more than 90 percent of the W. bancrofti dark proteome, leaving many potential targets unidentified. In this work, we present a novel computational pipeline that converts W. bancrofti's unannotated amino acid sequence data into precise four-level Enzyme Commission (EC) numbers and drug candidates. We utilized a DEtection TRansformer to estimate the probability of enzymatic function, fine-tuned a hierarchical nearest-neighbor EC predictor on 4,476 labeled parasite proteins, and applied rejection sampling to retain only four-level EC classifications at 100 percent confidence. This pipeline assigned precise EC numbers to 14,772 previously uncharacterized proteins and discovered 543 EC classes not previously known in W. bancrofti. A qualitative triage emphasizing parasite-specific targets, chemical tractability, biochemical importance, and biological plausibility prioritized six enzymes across five separate strategies: anti-Wolbachia cell-wall inhibition, proteolysis blockade, transmission disruption, purinergic immune interference, and cGMP-signaling destabilization. We curated a 43-compound library from ChEMBL and BindingDB and co-folded the compounds across multiple protein conformers with Boltz-2. All six targets exhibited at least moderately strong predicted binding affinities below 1 micromolar, with moenomycin analogs against peptidoglycan glycosyltransferase and NTPase inhibitors showing promising nanomolar hits and well-defined binding pockets. While experimental validation remains essential, our results provide the first large-scale functional map of the W. bancrofti dark proteome and accelerate early-stage drug development for the species.
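As a rough illustration of the annotation-filtering stage described above, the sketch below keeps only complete four-level EC numbers predicted at full confidence. The function names (`enzyme_probability` for the DETR-based enzymatic-function score, `predict_ec_with_confidence` for the hierarchical nearest-neighbor predictor) and the 0.5 threshold are assumptions for illustration, not the paper's exact pipeline.

```python
def annotate_dark_proteome(proteins, enzyme_probability, predict_ec_with_confidence,
                           enzyme_threshold=0.5):
    """Assign four-level EC numbers to unannotated sequences (illustrative sketch).

    `proteins` maps protein identifiers to amino acid sequences. A sequence is
    annotated only if it is likely enzymatic and its EC prediction is a complete
    four-level number returned at 100% confidence; everything else is rejected.
    """
    annotations = {}
    for name, seq in proteins.items():
        if enzyme_probability(seq) < enzyme_threshold:
            continue  # likely non-enzymatic; skip
        ec_number, confidence = predict_ec_with_confidence(seq)
        if confidence >= 1.0 and len(ec_number.split(".")) == 4:
            annotations[name] = ec_number
    return annotations
```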


Fundamental Limits of Prompt Tuning Transformers: Universality, Capacity and Efficiency

Hu, Jerry Yao-Chieh, Wang, Wei-Po, Gilani, Ammar, Li, Chenyang, Song, Zhao, Liu, Han

arXiv.org Machine Learning

We investigate the statistical and computational limits of prompt tuning for transformer-based foundation models. Our key contribution is to show that prompt tuning on \textit{single-head} transformers with only a \textit{single} self-attention layer (i) is universal, and (ii) supports efficient (even almost-linear time) algorithms under the Strong Exponential Time Hypothesis (SETH). Statistically, we prove that prompt tuning on such simplest-possible transformers is a universal approximator for sequence-to-sequence Lipschitz functions. In addition, we provide an exponential-in-$dL$ and exponential-in-$(1/\epsilon)$ lower bound on the number of soft-prompt tokens required for prompt tuning to memorize any dataset with 1-layer, 1-head transformers. Computationally, we identify a phase transition in the efficiency of prompt tuning, determined by the norm of the \textit{soft-prompt-induced} keys and queries, and provide an upper-bound criterion. Beyond this criterion, no sub-quadratic (efficient) algorithm for prompt tuning exists under SETH. Within this criterion, we showcase our theory by proving the existence of almost-linear time prompt tuning inference algorithms. These fundamental limits provide important necessary conditions for designing expressive and efficient prompt tuning methods for practitioners.
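For orientation, the prompt tuning setup analyzed above can be sketched as follows: the transformer weights are frozen and only a small matrix of soft-prompt embeddings, prepended to the input sequence, is trained. This numpy sketch of a single-head, single-layer attention forward pass uses assumed shapes and names and omits the feed-forward part; it is illustrative, not the paper's exact construction.

```python
import numpy as np

def single_head_attention(X, Wq, Wk, Wv):
    """Softmax self-attention with a single head (weights stay frozen in prompt tuning)."""
    scores = (X @ Wq) @ (X @ Wk).T / np.sqrt(Wq.shape[1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ (X @ Wv)

def prompt_tuned_forward(x_embed, soft_prompt, Wq, Wk, Wv):
    """Prepend trainable soft-prompt embeddings and run the frozen attention layer.

    x_embed: (n, d) input token embeddings; soft_prompt: (p, d) trainable parameters.
    Only `soft_prompt` would be updated during tuning; Wq, Wk, Wv remain fixed.
    """
    X = np.vstack([soft_prompt, x_embed])        # (p + n, d)
    out = single_head_attention(X, Wq, Wk, Wv)   # (p + n, d_v)
    return out[soft_prompt.shape[0]:]            # outputs at the original token positions
```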


Finite-Time Error Analysis of Soft Q-Learning: Switching System Approach

Jeong, Narim, Lee, Donghwan

arXiv.org Artificial Intelligence

Soft Q-learning is a variation of Q-learning designed to solve entropy-regularized Markov decision problems, where an agent aims to maximize the entropy-regularized value function. Despite its empirical success, there have been limited theoretical studies of soft Q-learning to date. This paper aims to offer a novel and unified finite-time, control-theoretic analysis of soft Q-learning algorithms. We focus on two types of soft Q-learning algorithms: one utilizing the log-sum-exp operator and the other employing the Boltzmann operator. By using dynamical switching system models, we derive novel finite-time error bounds for both soft Q-learning algorithms. We hope that our analysis will deepen the current understanding of soft Q-learning by establishing connections with switching system models and may even pave the way for new frameworks in the finite-time analysis of other reinforcement learning algorithms.
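For reference, the two backup operators contrasted above are typically defined as follows, with temperature $\tau > 0$; the notation is ours and the paper's convention may differ.

```latex
\[
\mathrm{lse}_{\tau}\,Q(s,\cdot) \;=\; \tau \log \sum_{a} \exp\!\bigl(Q(s,a)/\tau\bigr),
\qquad
\mathrm{boltz}_{\tau}\,Q(s,\cdot) \;=\; \sum_{a} \frac{\exp\!\bigl(Q(s,a)/\tau\bigr)}{\sum_{a'} \exp\!\bigl(Q(s,a')/\tau\bigr)}\,Q(s,a).
\]
```

Both operators smooth the hard maximum over actions used by standard Q-learning and recover it as $\tau \to 0$.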


Are Transformers with One Layer Self-Attention Using Low-Rank Weight Matrices Universal Approximators?

Kajitsuka, Tokio, Sato, Issei

arXiv.org Artificial Intelligence

Existing analyses of the expressive capacity of Transformer models have required excessively deep layers for data memorization, leading to a discrepancy with the Transformers actually used in practice. This is primarily due to the interpretation of the softmax function as an approximation of the hardmax function. By clarifying the connection between the softmax function and the Boltzmann operator, we prove that a single layer of self-attention with low-rank weight matrices possesses the capability to perfectly capture the context of an entire input sequence. As a consequence, we show that one-layer and single-head Transformers have a memorization capacity for finite samples, and that Transformers consisting of one self-attention layer with two feed-forward neural networks are universal approximators for continuous permutation equivariant functions on a compact domain.
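The softmax-Boltzmann connection mentioned above can be made concrete with the standard definitions below (our notation, not necessarily the paper's): a single attention output is a softmax-weighted average of the value vectors, and when the values coincide with the scores it is exactly the Boltzmann operator applied to those scores.

```latex
\[
\mathrm{boltz}_{\beta}(z_1,\dots,z_n) \;=\; \frac{\sum_{i=1}^{n} z_i\, e^{\beta z_i}}{\sum_{i=1}^{n} e^{\beta z_i}},
\qquad
\mathrm{Attn}\bigl(q;\{(k_j,v_j)\}_{j=1}^{n}\bigr) \;=\; \sum_{j=1}^{n} \frac{e^{q^{\top} k_j}}{\sum_{j'} e^{q^{\top} k_{j'}}}\; v_j .
\]
```

Unlike the hardmax interpretation, which collapses attention onto a single token, this weighted average retains a contribution from every position, which is the intuition behind a single self-attention layer capturing the context of the entire input sequence.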