AITopics | Jaakkola, Tommi

Collaborating Authors

Jaakkola, Tommi

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents

Xu, Yilun, Corso, Gabriele, Jaakkola, Tommi, Vahdat, Arash, Kreis, Karsten

arXiv.org Artificial IntelligenceJul-3-2024

Diffusion models (DMs) have revolutionized generative learning. They utilize a diffusion process to encode data into a simple Gaussian distribution. However, encoding a complex, potentially multimodal data distribution into a single continuous Gaussian distribution arguably represents an unnecessarily challenging learning problem. We propose Discrete-Continuous Latent Variable Diffusion Models (DisCo-Diff) to simplify this task by introducing complementary discrete latent variables. We augment DMs with learnable discrete latents, inferred with an encoder, and train DM and encoder end-to-end. DisCo-Diff does not rely on pre-trained networks, making the framework universally applicable. The discrete latents significantly simplify learning the DM's complex noise-to-data mapping by reducing the curvature of the DM's generative ODE. An additional autoregressive transformer models the distribution of the discrete latents, a simple step because DisCo-Diff requires only few discrete variables with small codebooks. We validate DisCo-Diff on toy data, several image synthesis tasks as well as molecular docking, and find that introducing discrete latents consistently improves model performance. For example, DisCo-Diff achieves state-of-the-art FID scores on class-conditioned ImageNet-64/128 datasets with ODE sampler.

artificial intelligence, disco-diff, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2407.033

Country:

North America > United States > Virginia (0.14)
Europe > Austria > Vienna (0.14)

Genre: Research Report (0.64)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

A Recipe for Charge Density Prediction

Fu, Xiang, Rosen, Andrew, Bystrom, Kyle, Wang, Rui, Musaelian, Albert, Kozinsky, Boris, Smidt, Tess, Jaakkola, Tommi

arXiv.org Artificial IntelligenceMay-29-2024

In density functional theory, charge density is the core attribute of atomic systems from which all chemical properties can be derived. Machine learning methods are promising in significantly accelerating charge density prediction, yet existing approaches either lack accuracy or scalability. We propose a recipe that can achieve both. In particular, we identify three key ingredients: (1) representing the charge density with atomic and virtual orbitals (spherical fields centered at atom/virtual coordinates); (2) using expressive and learnable orbital basis sets (basis function for the spherical fields); and (3) using high-capacity equivariant neural network architecture. Our method achieves state-of-the-art accuracy while being more than an order of magnitude faster than existing methods. Furthermore, our method enables flexible efficiency-accuracy trade-offs by adjusting the model/basis sizes.

artificial intelligence, charge density, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2405.19276

Country:

North America > United States > Massachusetts (0.14)
North America > United States > California (0.14)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

In-Context Symmetries: Self-Supervised Learning through Contextual World Models

Gupta, Sharut, Wang, Chenyu, Wang, Yifei, Jaakkola, Tommi, Jegelka, Stefanie

arXiv.org Artificial IntelligenceMay-28-2024

At the core of self-supervised learning for vision is the idea of learning invariant or equivariant representations with respect to a set of data transformations. This approach, however, introduces strong inductive biases, which can render the representations fragile in downstream tasks that do not conform to these symmetries. In this work, drawing insights from world models, we propose to instead learn a general representation that can adapt to be invariant or equivariant to different transformations by paying attention to context -- a memory module that tracks task-specific states, actions, and future states. Here, the action is the transformation, while the current and future states respectively represent the input's representation before and after the transformation. Our proposed algorithm, Contextual Self-Supervised Learning (ContextSSL), learns equivariance to all transformations (as opposed to invariance). In this way, the model can learn to encode all relevant features as general representations while having the versatility to tail down to task-wise symmetries when given a few examples as the context. Empirically, we demonstrate significant performance gains over existing methods on equivariance-related tasks, supported by both qualitative and quantitative evaluations.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2405.18193

Country: Europe > Netherlands (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
(2 more...)

Add feedback

Verlet Flows: Exact-Likelihood Integrators for Flow-Based Generative Models

Erives, Ezra, Jing, Bowen, Jaakkola, Tommi

arXiv.org Artificial IntelligenceMay-4-2024

Approximations in computing model likelihoods with continuous normalizing flows (CNFs) hinder the use of these models for importance sampling of Boltzmann distributions, where exact likelihoods are required. In this work, we present Verlet flows, a class of CNFs on an augmented state-space inspired by symplectic integrators from Hamiltonian dynamics. When used with carefully constructed Taylor-Verlet integrators, Verlet flows provide exact-likelihood generative models which generalize coupled flow architectures from a non-continuous setting while imposing minimal expressivity constraints. On experiments over toy densities, we demonstrate that the variance of the commonly used Hutchinson trace estimator is unsuitable for importance sampling, whereas Verlet flows perform comparably to full autograd trace computations while being significantly faster.

artificial intelligence, integrator, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2405.02805

Country: North America > United States > Massachusetts (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.60)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.48)
Information Technology > Artificial Intelligence > Machine Learning (0.47)

Add feedback

Deep Confident Steps to New Pockets: Strategies for Docking Generalization

Corso, Gabriele, Deng, Arthur, Fry, Benjamin, Polizzi, Nicholas, Barzilay, Regina, Jaakkola, Tommi

arXiv.org Artificial IntelligenceFeb-28-2024

Accurate blind docking has the potential to lead to new biological breakthroughs, but for this promise to be realized, docking methods must generalize well across the proteome. Existing benchmarks, however, fail to rigorously assess generalizability. We carefully analyze the scaling laws of ML-based docking and show that, by scaling data and model size, as well as integrating synthetic data strategies, we are able to significantly increase the generalization capacity and set new state-of-the-art performance across benchmarks. Understanding how small molecules and proteins interact, a task known as molecular docking, is at the heart of drug discovery. The conventional use of docking in the industry has led the field to focus on finding binding conformations when restricting the search to predefined pockets and evaluating these on a relatively limited set of protein families of commercial interest. For example, it would help us understand the mechanism of action of new drugs to accelerate their development [Schottlender et al., 2022], predict adverse side-effects of drugs before clinical trials [Luo et al., 2018], and discover the function of the vast number of enzymes and membrane proteins whose biology we do not yet know [Yi et al., 2015]. All these tasks critically require the docking methods to generalize beyond the relatively small class of well-studied proteins for which we have many available structures. Existing docking benchmarks are largely built on collections of similar binding modes and fail to rigorously assess the ability of docking methods to generalize across the proteome. Gathering diverse data for protein-ligand interactions is challenging because binding pockets tend to be evolutionarily well-conserved due to their critical biological functions. Therefore, a large proportion of known interactions fall into a relatively small set of common binding modes. The results show that increasing both data and model can give significant generalization improvements.

artificial intelligence, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2402.18396

Country:

North America > United States > Massachusetts (0.14)
North America > United States > California (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)

Add feedback

Dirichlet Flow Matching with Applications to DNA Sequence Design

Stark, Hannes, Jing, Bowen, Wang, Chenyu, Corso, Gabriele, Berger, Bonnie, Barzilay, Regina, Jaakkola, Tommi

arXiv.org Artificial IntelligenceFeb-8-2024

Discrete diffusion or flow models could enable faster and more controllable sequence generation than autoregressive models. We show that na\"ive linear flow matching on the simplex is insufficient toward this goal since it suffers from discontinuities in the training target and further pathologies. To overcome this, we develop Dirichlet flow matching on the simplex based on mixtures of Dirichlet distributions as probability paths. In this framework, we derive a connection between the mixtures' scores and the flow's vector field that allows for classifier and classifier-free guidance. Further, we provide distilled Dirichlet flow matching, which enables one-step sequence generation with minimal performance hits, resulting in $O(L)$ speedups compared to autoregressive models. On complex DNA sequence generation tasks, we demonstrate superior performance compared to all baselines in distributional metrics and in achieving desired design targets for generated sequences. Finally, we show that our classifier-free guidance approach improves unconditional generation and is effective for generating DNA that satisfies design targets. Code is available at https://github.com/HannesStark/dirichlet-flow-matching.

bioinformatics, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2402.05841

Country: North America > United States > Massachusetts (0.14)

Genre: Research Report (0.64)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Biomedical Informatics > Translational Bioinformatics (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

AlphaFold Meets Flow Matching for Generating Protein Ensembles

Jing, Bowen, Berger, Bonnie, Jaakkola, Tommi

arXiv.org Artificial IntelligenceFeb-7-2024

The biological functions of proteins often depend on dynamic structural ensembles. In this work, we develop a flow-based generative modeling approach for learning and sampling the conformational landscapes of proteins. We repurpose highly accurate single-state predictors such as AlphaFold and ESMFold and fine-tune them under a custom flow matching framework to obtain sequence-conditoned generative models of protein structure called AlphaFlow and ESMFlow. When trained and evaluated on the PDB, our method provides a superior combination of precision and diversity compared to AlphaFold with MSA subsampling. When further trained on ensembles from all-atom MD, our method accurately captures conformational flexibility, positional distributions, and higher-order ensemble observables for unseen proteins. Moreover, our method can diversify a static PDB structure with faster wall-clock convergence to certain equilibrium properties than replicate MD trajectories, demonstrating its potential as a proxy for expensive physics-based simulations. Code is available at https://github.com/bjing2016/alphaflow.

artificial intelligence, machine learning, natural language, (14 more...)

arXiv.org Artificial Intelligence

2402.04845

Country: North America > United States > Massachusetts (0.14)

Genre:

Research Report (0.63)
Workflow (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.67)

Add feedback

Generative Flows on Discrete State-Spaces: Enabling Multimodal Flows with Applications to Protein Co-Design

Campbell, Andrew, Yim, Jason, Barzilay, Regina, Rainforth, Tom, Jaakkola, Tommi

arXiv.org Artificial IntelligenceFeb-7-2024

Combining discrete and continuous data is an important capability for generative models. We present Discrete Flow Models (DFMs), a new flow-based model of discrete data that provides the missing link in enabling flow-based generative models to be applied to multimodal continuous and discrete data problems. Our key insight is that the discrete equivalent of continuous space flow matching can be realized using Continuous Time Markov Chains. DFMs benefit from a simple derivation that includes discrete diffusion models as a specific instance while allowing improved performance over existing diffusion-based approaches. We utilize our DFMs method to build a multimodal flow-based modeling framework. We apply this capability to the task of protein co-design, wherein we learn a model for jointly generating protein structure and sequence. Our approach achieves state-of-the-art co-design performance while allowing the same multimodal model to be used for flexible generation of the sequence or structure.

machine learning, natural language, rate matrix, (16 more...)

arXiv.org Artificial Intelligence

2402.04997

Country:

North America > United States > Massachusetts (0.14)
Europe > United Kingdom > England (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Sample, estimate, aggregate: A recipe for causal discovery foundation models

Wu, Menghua, Bao, Yujia, Barzilay, Regina, Jaakkola, Tommi

arXiv.org Artificial IntelligenceFeb-2-2024

Causal discovery, the task of inferring causal structure from data, promises to accelerate scientific research, inform policy making, and more. However, the per-dataset nature of existing causal discovery algorithms renders them slow, data hungry, and brittle. Inspired by foundation models, we propose a causal discovery framework where a deep learning model is pretrained to resolve predictions from classical discovery algorithms run over smaller subsets of variables. This method is enabled by the observations that the outputs from classical algorithms are fast to compute for small problems, informative of (marginal) data structure, and their structure outputs as objects remain comparable across datasets. Our method achieves state-of-the-art performance on synthetic and realistic datasets, generalizes to data generating mechanisms not seen during training, and offers inference speeds that are orders of magnitude faster than existing models.

artificial intelligence, machine learning, optimization problem, (21 more...)

arXiv.org Artificial Intelligence

2402.01929

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

Add feedback

Correcting Diffusion Generation through Resampling

Liu, Yujian, Zhang, Yang, Jaakkola, Tommi, Chang, Shiyu

arXiv.org Artificial IntelligenceDec-10-2023

Despite diffusion models' superior capabilities in modeling complex distributions, there are still non-trivial distributional discrepancies between generated and ground-truth images, which has resulted in several notable problems in image generation, including missing object errors in text-to-image generation and low image quality. Existing methods that attempt to address these problems mostly do not tend to address the fundamental cause behind these problems, which is the distributional discrepancies, and hence achieve sub-optimal results. In this paper, we propose a particle filtering framework that can effectively address both problems by explicitly reducing the distributional discrepancies. Specifically, our method relies on a set of external guidance, including a small set of real images and a pre-trained object detector, to gauge the distribution gap, and then design the resampling weight accordingly to correct the gap. Experiments show that our methods can effectively correct missing object errors and improve image quality in various image generation tasks. Notably, our method outperforms the existing strongest baseline by 5% in object occurrence and 1.0 in FID on MS-COCO. Our code is publicly available at https://github.com/UCSB-NLP-Chang/diffusion_resampling.git.

diffusion model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2312.06038

Country: North America > United States (0.14)

Genre: Research Report (0.81)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback