glide
Molecular Embedding-Based Algorithm Selection in Protein-Ligand Docking
Wang, Jiabao Brad, Cao, Siyuan, Wu, Hongxuan, Yuan, Yiliang, Misir, Mustafa
Selecting an effective docking algorithm is highly context-dependent, and no single method performs reliably across structural, chemical, or protocol regimes. We introduce MolAS, a lightweight algorithm selection system that predicts per-algorithm performance from pretrained protein-ligand embeddings using attentional pooling and a shallow residual decoder. With only hundreds to a few thousand labelled complexes, MolAS achieves up to 15% absolute improvement over the single-best solver (SBS) and closes 17-66% of the Virtual Best Solver (VBS)-SBS gap across five diverse docking benchmarks. Analyses of reliability, embedding geometry, and solver-selection patterns show that MolAS succeeds when the oracle landscape exhibits low entropy and separable solver behaviour, but collapses under protocol-induced hierarchy shifts. These findings indicate that the main barrier to robust docking AS is not representational capacity but instability in solver rankings across pose-generation regimes, positioning MolAS as both a practical in-domain selector and a diagnostic tool for assessing when AS is feasible.
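The "fraction of the VBS-SBS gap closed" figure quoted above can be made concrete with a small sketch. It assumes a per-complex success matrix (rows are complexes, columns are docking algorithms) and uses the usual algorithm-selection definitions: SBS is the best single column on average, VBS picks the best column per row. The function names and toy numbers below are illustrative assumptions, not taken from the MolAS paper.

```python
def sbs_score(perf):
    """Single-best solver: the highest mean score of any one column."""
    n_solvers = len(perf[0])
    return max(sum(row[j] for row in perf) / len(perf) for j in range(n_solvers))


def vbs_score(perf):
    """Virtual best solver: an oracle that picks the best column per row."""
    return sum(max(row) for row in perf) / len(perf)


def selector_score(perf, choices):
    """Mean score when choices[i] is the solver index picked for row i."""
    return sum(row[c] for row, c in zip(perf, choices)) / len(perf)


def gap_closed(selector, sbs, vbs):
    """Fraction of the VBS-SBS gap recovered by the selector."""
    if vbs == sbs:
        raise ValueError("VBS equals SBS: there is no gap to close")
    return (selector - sbs) / (vbs - sbs)


# Toy data (hypothetical): 4 complexes x 3 solvers, 1 = successful docking.
perf = [
    [1, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
choices = [0, 0, 1, 0]  # hypothetical selector output, wrong on the last row

sbs = sbs_score(perf)                    # 0.5
vbs = vbs_score(perf)                    # 1.0
sel = selector_score(perf, choices)      # 0.75
print(gap_closed(sel, sbs, vbs))         # 0.5
```

A value of 0 means the selector is no better than always running the single-best solver, and 1 means it matches the oracle.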
GLIDE: A Coordinated Aerial-Ground Framework for Search and Rescue in Unknown Environments
Farrell, Seth, Li, Chenghao, Yu, Hongzhan, Mojtahedi, Hesam, Gao, Sicun, Christensen, Henrik I.
We present a cooperative aerial-ground search-and-rescue (SAR) framework that pairs two unmanned aerial vehicles (UAVs) with an unmanned ground vehicle (UGV) to achieve rapid victim localization and obstacle-aware navigation in unknown environments. In our framework, a goal-searching UAV executes real-time onboard victim detection and georeferencing to nominate goals for the ground platform, while a terrain-scouting UAV flies ahead of the UGV's planned route to provide mid-level traversability updates. The UGV fuses aerial cues with local sensing to perform time-efficient A* planning and continuous replanning as information arrives. Additionally, we present a hardware demonstration (using a GEM e6 golf cart as the UGV and two X500 UAVs) to evaluate end-to-end SAR mission performance and include simulation ablations to assess the planning stack in isolation from detection. Empirical results demonstrate that explicit role separation across UAVs, coupled with terrain scouting and guided planning, improves reach time and navigation safety in time-critical SAR missions. Search and rescue (SAR) operations stand to benefit from recent advances in autonomous aerial and ground robotics. Unmanned Aerial Vehicles (UAVs) enable rapid, large-area coverage due to their agility and mobility. The adoption of drones across civilian and military applications has highlighted advantages in speed and perspective.
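The "A* planning and continuous replanning" loop mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' planner. It assumes a 4-connected occupancy grid with unit step cost and a Manhattan heuristic, and it models a scout update as simply marking a cell blocked and planning again from scratch. All names here are hypothetical.

```python
import heapq


def astar(grid, start, goal):
    """Shortest 4-connected path on a grid where 1 marks a blocked cell.

    Returns the list of (row, col) cells from start to goal, or None.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance: admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]
    came_from = {}
    best_g = {start: 0}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = [cell]
            while cell in came_from:
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        if g > best_g.get(cell, float("inf")):
            continue  # stale heap entry, a cheaper route was found later
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best_g.get(nxt, float("inf")):
                    best_g[nxt] = ng
                    came_from[nxt] = cell
                    heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
    return None


# Initial plan on an empty 3x3 map.
grid = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
first_path = astar(grid, (0, 0), (0, 2))

# A (hypothetical) scout report marks the middle of the route as blocked,
# so the ground vehicle replans around it.
grid[0][1] = 1
second_path = astar(grid, (0, 0), (0, 2))
print(len(first_path), len(second_path))
```

Replanning from scratch is the simplest option. Incremental planners such as D* Lite reuse earlier search effort, but whether the paper uses one is not stated in the abstract.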
Deep-Learning Based Docking Methods: Fair Comparisons to Conventional Docking Workflows
Jain, Ajay N., Cleves, Ann E., Walters, W. Patrick
The diffusion learning method, DiffDock, for docking small-molecule ligands into protein binding sites was recently introduced. Results included comparisons to more conventional docking approaches, with DiffDock showing superior performance. Here, we employ a fully automatic workflow using the Surflex-Dock methods to generate a fair baseline for conventional docking approaches. Results were generated for the common and expected situation where a binding site location is known and also for the condition of an unknown binding site. For the known binding site condition, Surflex-Dock success rates at 2.0 Angstroms RMSD far exceeded those for DiffDock (Top-1/Top-5 success rates, respectively, were 68/81% compared with 45/51%). Glide performed with similar success rates (67/73%) to Surflex-Dock for the known binding site condition, and results for AutoDock Vina and Gnina followed this pattern. For the unknown binding site condition, using an automated method to identify multiple binding pockets, Surflex-Dock success rates again exceeded those of DiffDock, but by a somewhat lesser margin. DiffDock made use of roughly 17,000 co-crystal structures for learning (98% of PDBBind version 2020, pre-2019 structures) for a training set in order to predict on 363 test cases (2% of PDBBind 2020) from 2019 forward. DiffDock's performance was inextricably linked with the presence of near-neighbor cases of close to identical protein-ligand complexes in the training set for over half of the test set cases. DiffDock exhibited a 40 percentage point difference on near-neighbor cases (two-thirds of all test cases) compared with cases with no near-neighbor training case. DiffDock has apparently encoded a type of table-lookup during its learning process, rendering meaningful applications beyond its reach. Further, it does not perform even close to competitively with a competently run modern docking workflow.
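The Top-1/Top-5 success rates quoted above follow a common docking convention: a test case counts as a success if any of the k highest-ranked poses lies within an RMSD threshold of the crystallographic pose. The sketch below assumes each case is given as a list of pose RMSDs in rank order and treats the 2.0 Angstrom cutoff as inclusive. Both the data layout and the inclusive comparison are assumptions, and the numbers are made up for illustration.

```python
def top_k_success_rate(ranked_rmsds, k, threshold=2.0):
    """Fraction of cases with at least one top-k pose at or under threshold.

    ranked_rmsds: one list per test case, RMSDs (Angstroms) in rank order.
    """
    if not ranked_rmsds:
        raise ValueError("need at least one test case")
    hits = sum(
        1
        for poses in ranked_rmsds
        if any(rmsd <= threshold for rmsd in poses[:k])
    )
    return hits / len(ranked_rmsds)


# Hypothetical RMSDs for four test cases, best-scored pose first.
ranked = [
    [1.2, 3.0],
    [4.5, 1.8, 0.9],
    [5.0, 6.1],
    [2.5, 2.0],
]
print(top_k_success_rate(ranked, k=1))  # 0.25
print(top_k_success_rate(ranked, k=5))  # 0.75
```

Top-5 can never be lower than Top-1 on the same cases, which is why the paired figures in the abstract always increase from the first number to the second.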
Planning-Guided Diffusion Policy Learning for Generalizable Contact-Rich Bimanual Manipulation
Li, Xuanlin, Zhao, Tong, Zhu, Xinghao, Wang, Jiuguang, Pang, Tao, Fang, Kuan
Contact-rich bimanual manipulation involves precise coordination of two arms to change object states through strategically selected contacts and motions. Due to the inherent complexity of these tasks, acquiring sufficient demonstration data and training policies that generalize to unseen scenarios remain a largely unresolved challenge. Building on recent advances in planning through contacts, we introduce Generalizable Planning-Guided Diffusion Policy Learning (GLIDE), an approach that effectively learns to solve contact-rich bimanual manipulation tasks by leveraging model-based motion planners to generate demonstration data in high-fidelity physics simulation. Through efficient planning in randomized environments, our approach generates large-scale and high-quality synthetic motion trajectories for tasks involving diverse objects and transformations. We then train a task-conditioned diffusion policy via behavior cloning using these demonstrations. To tackle the sim-to-real gap, we propose a set of essential design options in feature extraction, task representation, action prediction, and data augmentation that enable learning robust prediction of smooth action sequences and generalization to unseen scenarios. Through experiments in both simulation and the real world, we demonstrate that our approach can enable a bimanual robotic system to effectively manipulate objects of diverse geometries, dimensions, and physical properties. Website: https://glide-manip.github.io/
Grounding Language Plans in Demonstrations Through Counterfactual Perturbations
Wang, Yanwei, Wang, Tsun-Hsuan, Mao, Jiayuan, Hagenow, Michael, Shah, Julie
Grounding the common-sense reasoning of Large Language Models (LLMs) in physical domains remains a pivotal yet unsolved problem for embodied AI. Whereas prior works have focused on leveraging LLMs directly for planning in symbolic spaces, this work uses LLMs to guide the search of task structures and constraints implicit in multi-step demonstrations. Specifically, we borrow from manipulation planning literature the concept of mode families, which group robot configurations by specific motion constraints, to serve as an abstraction layer between the high-level language representations of an LLM and the low-level physical trajectories of a robot. By replaying a few human demonstrations with synthetic perturbations, we generate coverage over the demonstrations' state space with additional successful executions as well as counterfactuals that fail the task. Our explanation-based learning framework trains an end-to-end differentiable neural network to predict successful trajectories from failures and as a by-product learns classifiers that ground low-level states and images in mode families without dense labeling. The learned grounding classifiers can further be used to translate language plans into reactive policies in the physical domain in an interpretable manner. We show our approach improves the interpretability and reactivity of imitation learning through 2D navigation and simulated and real robot manipulation tasks. Website: https://yanweiw.github.io/glide
Synthetic Shifts to Initial Seed Vector Exposes the Brittle Nature of Latent-Based Diffusion Models
Po-Yuan, Mao, Kotyan, Shashank, Foong, Tham Yik, Vargas, Danilo Vasconcellos
Recent advances in Conditional Diffusion Models have led to substantial capabilities in various domains. However, understanding the impact of variations in the initial seed vector remains an underexplored area of concern. Particularly, latent-based diffusion models display inconsistencies in image generation under standard conditions when initialized with suboptimal initial seed vectors. To understand the impact of the initial seed vector on generated samples, we propose a reliability evaluation framework that evaluates the generated samples of a diffusion model when the initial seed vector is subjected to various synthetic shifts. Our results indicate that slight manipulations to the initial seed vector of the state-of-the-art Stable Diffusion (Rombach et al., 2022) can lead to significant disturbances in the generated samples, consequently creating images without the effect of conditioning variables. In contrast, GLIDE (Nichol et al., 2022) stands out in generating reliable samples even when the initial seed vector is transformed.
The Legend of Zelda: Tears of the Kingdom review – pure magic
Since I first hit start on Tears of the Kingdom two weeks ago, scarcely a minute has passed when I was not either playing it, or wishing I was playing it. I am honestly slightly annoyed to be taking time away from it to write this review. I am a grown 34-year-old woman and video games rarely get their hooks into me the way they did when I was 8, or 18, and relatively free of responsibilities. But now and then, every few years, I play something that reminds me that video games are kind of magic. They can transport you somewhere else.
FlexVDW: A machine learning approach to account for protein flexibility in ligand docking
Suriana, Patricia, Paggi, Joseph M., Dror, Ron O.
Most widely used ligand docking methods assume a rigid protein structure. This leads to problems when the structure of the target protein deforms upon ligand binding. In particular, the ligand's true binding pose is often scored very unfavorably due to apparent clashes between ligand and protein atoms, which lead to extremely high values of the calculated van der Waals energy term. Traditionally, this problem has been addressed by explicitly searching for receptor conformations to account for the flexibility of the receptor in ligand binding. Here we present a deep learning model trained to take receptor flexibility into account implicitly when predicting van der Waals energy. We show that incorporating this machine-learned energy term into a state-of-the-art physics-based scoring function improves small molecule ligand pose prediction results in cases with substantial protein deformation, without degrading performance in cases with minimal protein deformation. This work demonstrates the feasibility of learning effects of protein flexibility on ligand binding without explicitly modeling changes in protein structure. A critical problem in rational drug discovery is prediction of the position, orientation, and conformation of a ligand (e.g., a drug candidate) when bound to a target protein--i.e., the ligand's "binding pose." Protein-ligand docking methods, which are used to predict ligand binding poses, are key tools in drug discovery and molecular modeling applications (Kitchen et al., 2004; Ferreira et al., 2015).
Flying snakes could help to design next-generation robotics
Animals inspiring robot design is not a new phenomenon, as robots have commonly been developed to mimic animal movements such as walking and swimming. Now, US researchers are investigating how to design robots that imitate the gliding motion performed by flying snakes. The study, 'Computational analysis of vortex dynamics and aerodynamic performance in flying-snake-like gliding flight with horizontal undulation,' is published in Physics of Fluids. The investigation analysed the lift production mechanism of flying snakes that undulate side-to-side as they travel from the tops of trees to the ground to evade predators or move efficiently. This undulation enables flying snakes to glide for great distances – as far as 25 metres from a 15-metre height.
How Does DALL·E-2 Work?
DALL·E-2 is a new AI system that can create realistic images and art from a description in natural language. OpenAI recently released the beta version of DALL·E-2. In this article, we will take a close look at the original research paper on DALL·E-2 and understand exactly how it works. DALL·E-2 originates from the paper Hierarchical Text-Conditional Image Generation with CLIP Latents [1] and is based on the unCLIP model proposed there.