Materials
Physics-informed generative model for drug-like molecule conformers
Williams, David C., Inala, Neil
We present a diffusion-based generative model for conformer generation. Our model focuses on reproducing bonded structure and is constructed from the terms traditionally found in classical force fields to ensure a physically relevant representation. Deep learning techniques are used to infer atom typing and geometric parameters from a training set. Conformer sampling takes advantage of recent advances in diffusion-based generation. By training on large, synthetic data sets of diverse, drug-like molecules optimized with the semiempirical GFN2-xTB method, high accuracy is achieved for bonded parameters, exceeding that of conventional, knowledge-based methods. Results are also compared to experimental structures from the Protein Data Bank (PDB) and Cambridge Structural Database (CSD).
A Universal Catalyst for First-Order Optimization
Lin, Hongzhou (Inria)
We introduce a generic scheme for accelerating first-order optimization methods in the sense of Nesterov, which builds upon a new analysis of the accelerated proximal point algorithm. Our approach consists of minimizing a convex objective by approximately solving a sequence of well-chosen auxiliary problems, leading to faster convergence. This strategy applies to a large class of algorithms, including gradient descent, block coordinate descent, SAG, SAGA, SDCA, SVRG, Finito/MISO, and their proximal variants. For all of these methods, we provide acceleration and explicit support for non-strongly convex objectives. In addition to theoretical speed-up, we also show that acceleration is useful in practice, especially for ill-conditioned problems where we measure significant improvements.
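The outer loop described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the inner solver is plain gradient descent run for a fixed number of steps (the abstract does not state a stopping rule), the extrapolation schedule is a Nesterov-style recursion assumed here, and the names `catalyst`, `kappa`, `n_outer`, and `n_inner` are hypothetical.

```python
import math

def grad_descent(grad, x0, step, n_steps):
    """Plain gradient descent, used here as the inner first-order method."""
    x = list(x0)
    for _ in range(n_steps):
        g = grad(x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x

def catalyst(grad_f, x0, L, mu, kappa, n_outer=100, n_inner=20):
    """Accelerated inexact proximal point loop (illustrative sketch).

    grad_f : gradient of a mu-strongly convex, L-smooth objective f
    kappa  : weight of the quadratic term added in each auxiliary problem
    """
    q = mu / (mu + kappa)
    alpha = math.sqrt(q)            # assumed initialisation for mu > 0
    x_prev, y = list(x0), list(x0)
    for _ in range(n_outer):
        # Auxiliary problem: h(x) = f(x) + (kappa / 2) * ||x - y||^2
        def grad_h(x, y=y):
            return [g + kappa * (xi - yi)
                    for g, xi, yi in zip(grad_f(x), x, y)]
        # Approximate solve, warm-started at the previous iterate.
        x = grad_descent(grad_h, x_prev, 1.0 / (L + kappa), n_inner)
        # alpha_new is the positive root of a^2 = (1 - a) * alpha^2 + q * a
        b = alpha ** 2 - q
        alpha_new = (-b + math.sqrt(b * b + 4 * alpha ** 2)) / 2
        beta = alpha * (1 - alpha) / (alpha ** 2 + alpha_new)
        # Extrapolation step in the sense of Nesterov.
        y = [xi + beta * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        x_prev, alpha = x, alpha_new
    return x_prev

# Ill-conditioned toy quadratic: f(x) = 0.5 * (x0^2 + 100 * x1^2) - x0 - x1
def toy_grad(x):
    return [1.0 * x[0] - 1.0, 100.0 * x[1] - 1.0]

solution = catalyst(toy_grad, [0.0, 0.0], L=100.0, mu=1.0, kappa=10.0)
```

On this toy problem the exact minimizer is (1, 0.01). The value `kappa=10.0` is an arbitrary illustrative choice rather than a tuned one.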
MoleculeQA: A Dataset to Evaluate Factual Accuracy in Molecular Comprehension
Lu, Xingyu, Cao, He, Liu, Zijing, Bai, Shengyuan, Chen, Leqing, Yao, Yuan, Zheng, Hai-Tao, Li, Yu
Large language models are playing an increasingly significant role in molecular research, yet existing models often generate erroneous information, posing challenges to accurate molecular comprehension. Traditional evaluation metrics for generated content fail to assess a model's accuracy in molecular understanding. To rectify the absence of factual evaluation, we present MoleculeQA, a novel question answering (QA) dataset containing 62K QA pairs over 23K molecules. Each QA pair, composed of a manual question, a positive option and three negative options, is semantically consistent with a molecular description from an authoritative molecular corpus. MoleculeQA is not only the first benchmark for molecular factual bias evaluation but also the largest QA dataset for molecular research. A comprehensive evaluation of existing molecular LLMs on MoleculeQA exposes their deficiencies in specific areas and pinpoints several particularly crucial factors for molecular understanding.
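The QA-pair layout described above (one question, one positive option, three negative options) maps naturally onto a small record type with a multiple-choice accuracy metric. The sketch below is an assumption about how such data might be represented, not the dataset's actual schema; `QAPair`, `accuracy`, and the two example questions are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

@dataclass(frozen=True)
class QAPair:
    question: str
    positive: str                       # the single correct option
    negatives: Tuple[str, str, str]     # three distractors

    def options(self, rng: random.Random) -> List[str]:
        """Return the four options in shuffled order."""
        opts = [self.positive, *self.negatives]
        rng.shuffle(opts)
        return opts

def accuracy(pairs: Sequence[QAPair],
             answer: Callable[[str, List[str]], str],
             seed: int = 0) -> float:
    """Fraction of pairs for which `answer` picks the positive option."""
    rng = random.Random(seed)
    correct = sum(answer(p.question, p.options(rng)) == p.positive
                  for p in pairs)
    return correct / len(pairs)

# Hypothetical examples, not taken from MoleculeQA.
pairs = [
    QAPair("Which functional group does ethanol contain?",
           "hydroxyl", ("carboxyl", "amine", "nitro")),
    QAPair("How many carbon atoms are in benzene?",
           "6", ("4", "5", "7")),
]

# An oracle that knows every answer, standing in for a model under test.
key = {p.question: p.positive for p in pairs}
oracle_score = accuracy(pairs, lambda q, opts: key[q])
```

A real evaluation would replace the oracle with a function that prompts a model with the question and shuffled options and parses its choice.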
Using Fiber Optic Bundles to Miniaturize Vision-Based Tactile Sensors
Di, Julia, Dugonjic, Zdravko, Fu, Will, Wu, Tingfan, Mercado, Romeo, Sawyer, Kevin, Most, Victoria Rose, Kammerer, Gregg, Speidel, Stefanie, Fan, Richard E., Sonn, Geoffrey, Cutkosky, Mark R., Lambeta, Mike, Calandra, Roberto
Vision-based tactile sensors have recently become popular due to their combination of low cost, very high spatial resolution, and ease of integration using widely available miniature cameras. The associated field of view and focal length, however, are difficult to package in a human-sized finger. In this paper we employ optical fiber bundles to achieve a form factor that, at 15 mm diameter, is smaller than an average human fingertip. The electronics and camera are also located remotely, further reducing package size. The sensor achieves a spatial resolution of 0.22 mm and a minimum force resolution of 5 mN for normal and shear contact forces. With these attributes, the DIGIT Pinki sensor is suitable for applications such as robotic and teleoperated digital palpation. We demonstrate its utility for palpation of the prostate gland and show that it can achieve clinically relevant discrimination of prostate stiffness for phantom and ex vivo tissue.
Accurate Crystal Structure Prediction of New 2D Hybrid Organic Inorganic Perovskites
Karimitari, Nima, Baldwin, William J., Muller, Evan W., Bare, Zachary J. L., Kennedy, W. Joshua, Csányi, Gábor, Sutton, Christopher
Low dimensional hybrid organic-inorganic perovskites (HOIPs) represent a promising class of electronically active materials for both light absorption and emission. The design space of HOIPs is extremely large, since a diverse space of organic cations can be combined with different inorganic frameworks. This immense design space allows for tunable electronic and mechanical properties, but also necessitates the development of new tools for in silico high throughput analysis of candidate structures. In this work, we present an accurate, efficient, transferable and widely applicable machine learning interatomic potential (MLIP) for predicting the structure of new 2D HOIPs. Using the MACE architecture, an MLIP is trained on 86 diverse experimentally reported HOIP structures. The model is tested on 73 unseen perovskite compositions, and achieves chemical accuracy with respect to the reference electronic structure method. Our model is then combined with a simple random structure search algorithm to predict the structure of hypothetical HOIPs given only the proposed composition. Success is demonstrated by correctly and reliably recovering the crystal structure of a set of experimentally known 2D perovskites. Such a random structure search is impossible with ab initio methods due to the associated computational cost, but is relatively inexpensive with the MACE potential. Finally, the procedure is used to predict the structure formed by a new organic cation with no previously known corresponding perovskite. Laboratory synthesis of the new hybrid perovskite confirms the accuracy of our prediction. This capability, applied at scale, enables efficient screening of thousands of combinations of organic cations and inorganic layers.
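The random structure search mentioned in this abstract follows a simple pattern: draw random starting configurations, relax each one on the potential energy surface, and keep the lowest-energy result. The sketch below shows only that pattern. A one-dimensional toy energy function stands in for the MACE potential, and all names (`toy_energy`, `relax`, `random_structure_search`) and parameters are hypothetical.

```python
import random

def toy_energy(x: float) -> float:
    """Double-well toy surface; a stand-in for an interatomic potential."""
    return (x * x - 1.0) ** 2 + 0.3 * x

def toy_gradient(x: float) -> float:
    return 4.0 * x * (x * x - 1.0) + 0.3

def relax(x: float, step: float = 0.01, n_steps: int = 500) -> float:
    """Local relaxation by gradient descent from a starting configuration."""
    for _ in range(n_steps):
        x -= step * toy_gradient(x)
    return x

def random_structure_search(n_trials: int = 20, seed: int = 0):
    """Relax random starting points and return the lowest-energy one."""
    rng = random.Random(seed)
    best_x, best_e = None, float("inf")
    for _ in range(n_trials):
        x = relax(rng.uniform(-2.0, 2.0))
        e = toy_energy(x)
        if e < best_e:
            best_x, best_e = x, e
    return best_x, best_e

best_x, best_e = random_structure_search()
```

The toy surface has a shallow minimum near x = 0.96 and a deeper one near x = -1.04. Single relaxations can end in either, which is why the search repeats from many random starts. The cost scales with the number of energy and gradient evaluations, so a cheap potential makes many trials affordable.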
Materials science in the era of large language models: a perspective
Lei, Ge, Docherty, Ronan, Cooper, Samuel J.
Large Language Models (LLMs) have garnered considerable interest due to their impressive natural language capabilities, which in conjunction with various emergent properties make them versatile tools in workflows ranging from complex code generation to heuristic finding for combinatorial problems. In this paper we offer a perspective on their applicability to materials science research, arguing that their ability to handle ambiguous requirements across a range of tasks and disciplines means they could be a powerful tool to aid researchers. We qualitatively examine basic LLM theory, connecting it to relevant properties and techniques in the literature before providing two case studies that demonstrate their use in task automation and knowledge extraction at scale. At their current stage of development, we argue LLMs should be viewed less as oracles of novel insight, and more as tireless workers that can accelerate and unify exploration across domains. It is our hope that this paper can familiarise materials science researchers with the concepts needed to leverage these tools in their own research.
3M-Diffusion: Latent Multi-Modal Diffusion for Text-Guided Generation of Molecular Graphs
Zhu, Huaisheng, Xiao, Teng, Honavar, Vasant G
Generating molecules with desired properties is a critical task with broad applications in drug discovery and materials design. Inspired by recent advances in large language models, there is a growing interest in using natural language descriptions of molecules to generate molecules with the desired properties. Most existing methods focus on generating molecules that precisely match the text description. However, practical applications call for methods that generate diverse, and ideally novel, molecules with the desired properties. We propose 3M-Diffusion, a novel multi-modal molecular graph generation method, to address this challenge. 3M-Diffusion first encodes molecular graphs into a graph latent space aligned with text descriptions. It then reconstructs the molecular structure and atomic attributes based on the given text descriptions using the molecule decoder. It then learns a probabilistic mapping from the text space to the latent molecular graph space using a diffusion model. The results of our extensive experiments on several datasets demonstrate that 3M-Diffusion can generate high-quality, novel and diverse molecular graphs that semantically match the textual description provided.
Hybrid Soft Electrostatic Metamaterial Gripper for Multi-surface, Multi-object Adaptation
Kanno, Ryo, Nguyen, Pham H., Pinskier, Joshua, Howard, David, Song, Sukho, Kovac, Mirko
One of the trendsetting themes in soft robotics has been the goal of developing the ultimate universal soft robotic gripper: one that is capable of manipulating items of various shapes, sizes, thicknesses, textures, and weights, while still being lightweight and scalable in order to adapt to use cases. In this work, we report a soft gripper that enables delicate and precise grasps of fragile, deformable, and flexible objects but also excels in lifting heavy objects of up to 1617x its own body weight. The principle behind the soft gripper is based on extending the capabilities of electroadhesion soft grippers through the enhancement principles found in metamaterial adhesion cut and patterning. This design amplifies the adhesion and grasping payload in one direction while reducing the adhesion capabilities in the other direction. This counteracts the residual forces during peeling (a common problem with electroadhesive grippers), thus increasing its speed of release. In essence, we are able to tune the maximum strength and peeling speed beyond the capabilities of previous electroadhesive grippers. We study the capabilities of the system through a wide range of experiments with single and multiple-fingered peel tests. We also demonstrate its modular and adaptive capabilities in the real world with a two-finger gripper, by performing grasping tests of up to 5 different multi-surfaced objects.
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation
Wang, Zihao, Liu, Anji, Lin, Haowei, Li, Jiaqi, Ma, Xiaojian, Liang, Yitao
We explore how iteratively revising a chain of thoughts with the help of information retrieval significantly improves large language models' reasoning and generation ability in long-horizon generation tasks, while hugely mitigating hallucination. In particular, the proposed method -- *retrieval-augmented thoughts* (RAT) -- revises each thought step one by one with retrieved information relevant to the task query and the current and past thought steps, after the initial zero-shot CoT is generated. Applying RAT to GPT-3.5, GPT-4, and CodeLLaMA-7b substantially improves their performance on various long-horizon generation tasks; on average, it relatively increases rating scores by 13.63% on code generation, 16.96% on mathematical reasoning, 19.2% on creative writing, and 42.78% on embodied task planning. The demo page can be found at https://craftjarvis.github.io/RAT
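The step-by-step revision loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: `retrieve` and `revise` are hypothetical callables standing in for a search backend and an LLM call, the toy versions below exist only so the example runs, and using the already-revised past steps in the query is an assumption.

```python
from typing import Callable, List

Retriever = Callable[[str], List[str]]
Reviser = Callable[[str, List[str], str, List[str]], str]

def retrieval_augmented_thoughts(task: str,
                                 draft_steps: List[str],
                                 retrieve: Retriever,
                                 revise: Reviser) -> List[str]:
    """Revise an initial chain of thought one step at a time."""
    revised: List[str] = []
    for step in draft_steps:
        # The query combines the task, the past steps, and the current step.
        query = " ".join([task, *revised, step])
        evidence = retrieve(query)
        revised.append(revise(task, revised, step, evidence))
    return revised

# Toy stand-ins (hypothetical): keyword-overlap retrieval over a tiny
# corpus, and a "reviser" that only annotates the step.
CORPUS = ["binary search needs a sorted list",
          "quicksort is a sorting algorithm"]

def toy_retrieve(query: str) -> List[str]:
    words = set(query.lower().split())
    return [doc for doc in CORPUS if words & set(doc.split())]

def toy_revise(task: str, past: List[str], step: str,
               evidence: List[str]) -> str:
    return f"{step} [evidence: {len(evidence)}]"

draft = ["sort the list", "run binary search"]   # stands in for a zero-shot CoT
result = retrieval_augmented_thoughts("find an item", draft,
                                      toy_retrieve, toy_revise)
```

In practice `draft_steps` would come from a zero-shot chain-of-thought prompt, and `revise` would prompt the model with the retrieved passages.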