Materials
Learning Equivariant Non-Local Electron Density Functionals
Gao, Nicholas, Eberhard, Eike, Günnemann, Stephan
The accuracy of density functional theory hinges on the approximation of non-local contributions to the exchange-correlation (XC) functional. To date, machine-learned and human-designed approximations suffer from insufficient accuracy, limited scalability, or dependence on costly reference data. To address these issues, we introduce Equivariant Graph Exchange Correlation (EG-XC), a novel non-local XC functional based on equivariant graph neural networks. EG-XC combines semi-local functionals with a non-local feature density parametrized by an equivariant nuclei-centered point cloud representation of the electron density to capture long-range interactions. By differentiating through a self-consistent field solver, we train EG-XC requiring only energy targets. In our empirical evaluation, we find EG-XC to accurately reconstruct 'gold-standard' CCSD(T) energies on MD17. On out-of-distribution conformations of 3BPA, EG-XC reduces the relative MAE by 35% to 50%. Remarkably, EG-XC excels in data efficiency and molecular size extrapolation on QM9, matching force fields trained on 5 times more and larger molecules. On identical training sets, EG-XC yields on average 51% lower MAEs.
Pretraining Graph Transformers with Atom-in-a-Molecule Quantum Properties for Improved ADMET Modeling
Fallani, Alessio, Nugmanov, Ramil, Arjona-Medina, Jose, Wegner, Jörg Kurt, Tkatchenko, Alexandre, Chernichenko, Kostiantyn
We evaluate the impact of pretraining Graph Transformer architectures on atom-level quantum-mechanical features for the modeling of absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties of drug-like compounds. We compare this pretraining strategy with two others: one based on molecular quantum properties (specifically the HOMO-LUMO gap) and one using a self-supervised atom masking technique. After fine-tuning on Therapeutic Data Commons ADMET datasets, we evaluate the performance improvement in the different models, observing that models pretrained with atomic quantum mechanical properties generally produce better results. We then analyse the latent representations and observe that the supervised strategies preserve the pretraining information after fine-tuning and that different pretrainings produce different trends in latent expressivity across layers. Furthermore, we find that models pretrained on atomic quantum mechanical properties capture more low-frequency Laplacian eigenmodes of the input graph via the attention weights and produce better representations of atomic environments within the molecule. Application of the analysis to a much larger non-public dataset for microsomal clearance illustrates the generalizability of the studied indicators. In this case, the performances of the models are in accordance with the representation analysis and highlight, especially for masking pretraining and atom-level quantum property pretraining, how model types with similar performance on public benchmarks can perform differently on large-scale pharmaceutical data.
InstructBioMol: Advancing Biomolecule Understanding and Design Following Human Instructions
Zhuang, Xiang, Ding, Keyan, Lyu, Tianwen, Jiang, Yinuo, Li, Xiaotong, Xiang, Zhuoyi, Wang, Zeyuan, Qin, Ming, Feng, Kehua, Wang, Jike, Zhang, Qiang, Chen, Huajun
Understanding and designing biomolecules, such as proteins and small molecules, is central to advancing drug discovery, synthetic biology, and enzyme engineering. Recent breakthroughs in Artificial Intelligence (AI) have revolutionized biomolecular research, achieving remarkable accuracy in biomolecular prediction and design. However, a critical gap remains between AI's computational power and researchers' intuition, using natural language to align molecular complexity with human intentions. Large Language Models (LLMs) have shown potential to interpret human intentions, yet their application to biomolecular research remains nascent due to challenges including specialized knowledge requirements, multimodal data integration, and semantic alignment between natural language and biomolecules. To address these limitations, we present InstructBioMol, a novel LLM designed to bridge natural language and biomolecules through a comprehensive any-to-any alignment of natural language, molecules, and proteins. This model can integrate multimodal biomolecules as input, and enable researchers to articulate design goals in natural language, providing biomolecular outputs that meet precise biological needs. Experimental results demonstrate InstructBioMol can understand and design biomolecules following human instructions. Notably, it can generate drug molecules with a 10% improvement in binding affinity and design enzymes that achieve an ESP Score of 70.4, making it the only method to surpass the enzyme-substrate interaction threshold of 60.0 recommended by the ESP developer. This highlights its potential to transform real-world biomolecular research.
From Logits to Hierarchies: Hierarchical Clustering made Simple
Palumbo, Emanuele, Vandenhirtz, Moritz, Ryser, Alain, Daunhawer, Imant, Vogt, Julia E.
The structure of many real-world datasets is intrinsically hierarchical, making the modeling of such hierarchies a critical objective in both unsupervised and supervised machine learning. Recently, novel approaches for hierarchical clustering with deep architectures have been proposed. In this work, we take a critical perspective on this line of research and demonstrate that many approaches exhibit major limitations when applied to realistic datasets, partly due to their high computational complexity. In particular, we show that a lightweight procedure implemented on top of pre-trained non-hierarchical clustering models outperforms models designed specifically for hierarchical clustering. Our proposed approach is computationally efficient and applicable to any pre-trained clustering model that outputs logits, without requiring any fine-tuning. To highlight the generality of our findings, we illustrate how our method can also be applied in a supervised setup, recovering meaningful hierarchies from a pre-trained ImageNet classifier.
Patterned Structure Muscle: Arbitrary Shaped Wire-driven Artificial Muscle Utilizing Anisotropic Flexible Structure for Musculoskeletal Robots
Yoshimura, Shunnosuke, Miki, Akihiro, Miyama, Kazuhiro, Sahara, Yuta, Kawaharazuka, Kento, Okada, Kei, Inaba, Masayuki
Muscles of the human body are composed of tiny actuators made up of myosin and actin filaments. They can exert force in various shapes, such as curved or flat, under contact forces and deformations from the environment. In contrast, muscles in musculoskeletal robots have so far faced challenges in generating force in such shapes and environments. To address this issue, we propose Patterned Structure Muscle (PSM), artificial muscles for musculoskeletal robots. PSM utilizes patterned structures with anisotropic characteristics and wire-driven mechanisms, and is made of the flexible material Thermoplastic Polyurethane (TPU) using FDM 3D printing. This method enables the creation of various shapes of muscles, such as simple 1 degree-of-freedom (DOF) muscles, multi-DOF wide-area muscles, joint-covering muscles, and branched muscles. We created an upper arm structure using these muscles to demonstrate a wide range of motion, lifting of heavy objects, and movements through environmental contact. These experiments show that the proposed PSM is capable of operating in various shapes and environments, and is suitable for the muscles of musculoskeletal robots.
Hybrid Gripper with Passive Pneumatic Soft Joints for Grasping Deformable Thin Objects
Tran, Ngoc-Duy, Ly, Hoang-Hiep, Nguyen, Xuan-Thuan, Mac, Thi-Thoa, Nguyen, Anh, Ta, Tung D.
Grasping a variety of objects remains a key challenge in the development of versatile robotic systems. The human hand is remarkably dexterous, capable of grasping and manipulating objects with diverse shapes, mechanical properties, and textures. Inspired by how humans use two fingers to pick up thin and large objects such as fabric or sheets of paper, we aim to develop a gripper optimized for grasping such deformable objects. Observing how the soft and flexible fingertip joints of the hand approach and grasp thin materials, we propose a hybrid gripper design that incorporates both soft and rigid components. The gripper utilizes a soft pneumatic ring wrapped around a rigid revolute joint to create a flexible two-fingered gripper. Experiments were conducted to characterize and evaluate the gripper performance in handling sheets of paper and other objects. Compared to rigid grippers, the proposed design improves grasping efficiency and reduces the gripping distance by up to eightfold.
298 Best Prime Day Deals, Vetted By Our Amazon Experts (Oct 2024)
Amazon's fall Prime Day sale--also known as Big Deals Days--ends tonight. It's October, yes, but it's never too early to jump on that holiday gift shopping. We've combed through the deals and found the best ones, based on our years of testing and reviewing. WIRED's picks for the best Prime Day deals only include products someone from our team has personally tested and reviewed. We track prices using several tools to avoid falling for fake discounts. There are no shoddy knockoffs or overpriced products among our recommendations, just good deals on good stuff. We've linked our reviews and buying guide throughout to help you make fully informed buying decisions. We test products year-round and handpicked these Prime Day deals. We'll update this guide regularly throughout Prime Day by adding fresh deals and removing dead deals. This is our favorite e-reader. You'll have the choice between the base Paperwhite and the Signature Edition (8/10, WIRED Recommends), which comes with 16 gigabytes ...
Collective variables of neural networks: empirical time evolution and scaling laws
Tovey, Samuel, Krippendorf, Sven, Spannowsky, Michael, Nikolaou, Konstantin, Holm, Christian
This work presents a novel means for understanding learning dynamics and scaling relations in neural networks. We show that certain measures on the spectrum of the empirical neural tangent kernel, specifically entropy and trace, yield insight into the representations learned by a neural network and how these can be improved through architecture scaling. These results are demonstrated first on test cases before being shown on more complex networks, including transformers, auto-encoders, graph neural networks, and reinforcement learning studies. In testing on a wide range of architectures, we highlight the universal nature of training dynamics and further discuss how it can be used to understand the mechanisms behind learning in neural networks. We identify two such dominant mechanisms present throughout machine learning training. The first, information compression, is seen through a reduction in the entropy of the NTK spectrum during training, and occurs predominantly in small neural networks. The second, coined structure formation, is seen through an increasing entropy and thus, the creation of structure in the neural network representations beyond the prior established by the network at initialization. Due to the ubiquity of the latter in deep neural network architectures and its flexibility in the creation of feature-rich representations, we argue that this form of evolution of the network's entropy be considered the onset of a deep learning regime.
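The two spectral measures named above can be illustrated on a toy network. The following is a minimal sketch, not the authors' implementation: it assumes the entropy is the Shannon entropy of the trace-normalised NTK eigenvalues, and the tiny two-layer model and all function names (`mlp_jacobian`, `empirical_ntk`, `ntk_trace_and_entropy`) are hypothetical choices made for illustration.

```python
import numpy as np

def mlp_jacobian(params, x):
    """Gradient of the scalar output f(x) = w2 . tanh(W1 x) w.r.t. all parameters."""
    W1, w2 = params
    h = np.tanh(W1 @ x)                    # hidden activations, shape (H,)
    dW1 = np.outer(w2 * (1.0 - h**2), x)   # df/dW1, shape (H, D)
    dw2 = h                                # df/dw2, shape (H,)
    return np.concatenate([dW1.ravel(), dw2])

def empirical_ntk(params, X):
    """Empirical NTK: K[i, j] = <grad f(x_i), grad f(x_j)>."""
    J = np.stack([mlp_jacobian(params, x) for x in X])  # shape (N, P)
    return J @ J.T

def ntk_trace_and_entropy(K):
    """Trace and entropy of the NTK spectrum (assumed definition, see lead-in)."""
    eig = np.clip(np.linalg.eigvalsh(K), 0.0, None)  # K is PSD; clip round-off
    trace = float(eig.sum())
    p = eig / trace                                  # eigenvalues as a distribution
    p = p[p > 0]
    entropy = float(-(p * np.log(p)).sum())
    return trace, entropy

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))                               # 8 toy inputs
params = (rng.normal(size=(16, 3)), rng.normal(size=16))  # random initialisation
K = empirical_ntk(params, X)
trace, entropy = ntk_trace_and_entropy(K)
print(f"trace={trace:.3f}  entropy={entropy:.3f}  (max entropy={np.log(len(X)):.3f})")
```

Recomputing these two numbers at intervals during training would give the kind of time series the abstract describes: under the assumed definition, a falling entropy corresponds to compression and a rising entropy to structure formation.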
OneNet: A Fine-Tuning Free Framework for Few-Shot Entity Linking via Large Language Model Prompting
Liu, Xukai, Liu, Ye, Zhang, Kai, Wang, Kehang, Liu, Qi, Chen, Enhong
Entity Linking (EL) is the process of associating ambiguous textual mentions to specific entities in a knowledge base. Traditional EL methods heavily rely on large datasets to enhance their performance, a dependency that becomes problematic in the context of few-shot entity linking, where only a limited number of examples are available for training. To address this challenge, we present OneNet, an innovative framework that utilizes the few-shot learning capabilities of Large Language Models (LLMs) without the need for fine-tuning. To the best of our knowledge, this marks a pioneering approach to applying LLMs to few-shot entity linking tasks. OneNet is structured around three key components prompted by LLMs: (1) an entity reduction processor that simplifies inputs by summarizing and filtering out irrelevant entities, (2) a dual-perspective entity linker that combines contextual cues and prior knowledge for precise entity linking, and (3) an entity consensus judger that employs a unique consistency algorithm to alleviate the hallucination in the entity linking reasoning. Comprehensive evaluations across seven benchmark datasets reveal that OneNet outperforms current state-of-the-art entity linking methods.
Boosting the Performance of Decentralized Federated Learning via Catalyst Acceleration
Li, Qinglun, Zhang, Miao, Liu, Yingqi, Yin, Quanjun, Shen, Li, Cao, Xiaochun
Decentralized Federated Learning has emerged as an alternative to centralized architectures due to its faster training, privacy preservation, and reduced communication overhead. In decentralized communication, the server aggregation phase in Centralized Federated Learning shifts to the client side, which means that clients connect with each other in a peer-to-peer manner. However, compared to the centralized mode, data heterogeneity in Decentralized Federated Learning will cause larger variances between aggregated models, which leads to slow convergence in training and poor generalization performance in tests. To address these issues, we introduce Catalyst Acceleration and propose an accelerated Decentralized Federated Learning algorithm called DFedCata. It consists of two main components: the Moreau envelope function, which primarily addresses parameter inconsistencies among clients caused by data heterogeneity, and Nesterov's extrapolation step, which accelerates the aggregation phase. Theoretically, we prove the optimization error bound and generalization error bound of the algorithm, providing a further understanding of the nature of the algorithm and the theoretical perspectives on the hyperparameter choice. Empirically, we demonstrate the advantages of the proposed algorithm in both convergence speed and generalization performance on CIFAR10/100 with various non-iid data distributions. Furthermore, we also experimentally verify the theoretical properties of DFedCata.
Federated Learning (FL) is a new distributed machine learning paradigm that prioritizes privacy protection [1]-[3]. It enables multiple clients to collaborate on training models without sharing their raw data. Nowadays, much of the research [4]-[9] focuses on Centralized Federated Learning (CFL), but the central server in CFL brings various challenges such as communication burden, single point of failure [10], and privacy breaches [11].
In contrast, Decentralized Federated Learning (DFL) centralizes both the local update and aggregation steps on the client, which offers enhanced privacy protection [12], faster model training [13], and robustness to slow client devices [14]. Therefore, DFL has become a popular alternative solution [10], [13]. Qinglun Li, Miao Zhang, and Quanjun Yin are with the College of Systems Engineering, National University of Defense Technology. Yingqi Liu, Li Shen, and Xiaochun Cao are with the School of Cyber Science and Technology, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, China. The optimization process diagrams for two clients under the DFedAvg and DFedCata algorithms are simulated. The primary improvements include two aspects.
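The two components named in the abstract, a Moreau-envelope-style proximal term in the local objective and a Nesterov-style extrapolation after aggregation, can be illustrated on a toy problem. The sketch below is not the DFedCata algorithm itself, whose exact update rules are not given here. It assumes scalar models, quadratic client losses 0.5*(x - b_i)^2, a ring topology with uniform mixing weights, and hypothetical names and hyperparameters (`local_update`, `gossip`, `run`, `lam`, `beta`).

```python
def local_update(x, y, b, lam, lr, steps):
    """Gradient steps on the regularised local loss 0.5*(x-b)^2 + 0.5*lam*(x-y)^2.

    The quadratic penalty towards the anchor y stands in for the
    Moreau-envelope term (an illustrative assumption).
    """
    for _ in range(steps):
        grad = (x - b) + lam * (x - y)
        x -= lr * grad
    return x

def gossip(values):
    """Average each client with its two ring neighbours (doubly stochastic mixing)."""
    n = len(values)
    return [(values[i - 1] + values[i] + values[(i + 1) % n]) / 3.0 for i in range(n)]

def run(targets, rounds=100, lam=1.0, beta=0.5, lr=0.2, steps=5):
    n = len(targets)
    z = [0.0] * n   # aggregated model held by each client
    y = list(z)     # extrapolated anchor used in the proximal term
    for _ in range(rounds):
        # 1) local training, pulled towards the anchor y[i]
        x = [local_update(z[i], y[i], targets[i], lam, lr, steps) for i in range(n)]
        # 2) peer-to-peer aggregation with ring neighbours
        z_new = gossip(x)
        # 3) Nesterov-style extrapolation of the aggregated model
        y = [zn + beta * (zn - zo) for zn, zo in zip(z_new, z)]
        z = z_new
    return z

targets = [-1.0, 0.0, 2.0, 5.0]   # heterogeneous client optima (non-iid stand-in)
models = run(targets)
average_model = sum(models) / len(models)
global_optimum = sum(targets) / len(targets)
print(models, average_model, global_optimum)
```

In this toy setting the average of the client models approaches the minimiser of the summed losses, while the individual client models may still differ from one another because of the heterogeneous targets.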