Goto

Collaborating Authors

 Materials


Self-Optimizing Machine Learning Potential Assisted Automated Workflow for Highly Efficient Complex Systems Material Design

arXiv.org Artificial Intelligence

Machine learning interatomic potentials have revolutionized complex materials design by enabling rapid exploration of material configurational spaces via crystal structure prediction with ab initio accuracy. However, critical challenges persist in ensuring robust generalization to unknown structures and minimizing the requirement for substantial expert knowledge and time-consuming manual interventions. Here, we propose an automated crystal structure prediction framework built upon the attention-coupled neural networks potential to address these limitations. The generalizability of the potential is achieved by sampling regions across the local minima of the potential energy surface, where the self-evolving pipeline autonomously refines the potential iteratively while minimizing human intervention. The workflow is validated on Mg-Ca-H ternary and Be-P-N-O quaternary systems by exploring nearly 10 million configurations, demonstrating substantial speedup compared to first-principles calculations. These results underscore the effectiveness of our approach in accelerating the exploration and discovery of complex multi-component functional materials.


How well can LLMs provide planning feedback in grounded environments?

arXiv.org Artificial Intelligence

Learning to plan in grounded environments typically requires carefully designed reward functions or high-quality annotated demonstrations. Recent works show that pretrained foundation models, such as large language models (LLMs) and vision language models (VLMs), capture background knowledge helpful for planning, which reduces the amount of reward design and demonstrations needed for policy learning. We evaluate how well LLMs and VLMs provide feedback across symbolic, language, and continuous control environments. We consider prominent types of feedback for planning including binary feedback, preference feedback, action advising, goal advising, and delta action feedback. We also consider inference methods that impact feedback performance, including in-context learning, chain-of-thought, and access to environment dynamics. We find that foundation models can provide diverse high-quality feedback across domains. Moreover, larger and reasoning models consistently provide more accurate feedback, exhibit less bias, and benefit more from enhanced inference methods. Finally, feedback quality degrades for environments with complex dynamics or continuous state spaces and action spaces.


A Multimodal RAG Framework for Housing Damage Assessment: Collaborative Optimization of Image Encoding and Policy Vector Retrieval

arXiv.org Artificial Intelligence

After natural disasters, accurate evaluations of damage to housing are important for insurance claims response and planning of resources. In this work, we introduce a novel multimodal retrieval-augmented generation (MM-RAG) framework. On top of classical RAG architecture, we further the framework to devise a two-branch multimodal encoder structure that the image branch employs a visual encoder composed of ResNet and Transformer to extract the characteristic of building damage after disaster, and the text branch harnesses a BERT retriever for the text vectorization of posts as well as insurance policies and for the construction of a retrievable restoration index. To impose cross-modal semantic alignment, the model integrates a cross-modal interaction module to bridge the semantic representation between image and text via multi-head attention. Meanwhile, in the generation module, the introduced modal attention gating mechanism dynamically controls the role of visual evidence and text prior information during generation. The entire framework takes end-to-end training, and combines the comparison loss, the retrieval loss and the generation loss to form multi-task optimization objectives, and achieves image understanding and policy matching in collaborative learning. The results demonstrate superior performance in retrieval accuracy and classification index on damage severity, where the Top-1 retrieval accuracy has been improved by 9.6%.


Documents Are People and Words Are Items: A Psychometric Approach to Textual Data with Contextual Embeddings

arXiv.org Artificial Intelligence

This research introduces a novel psychometric method for analyzing textual data using large language models. By leveraging contextual embeddings to create contextual scores, we transform textual data into response data suitable for psychometric analysis. Treating documents as individuals and words as items, this approach provides a natural psychometric interpretation under the assumption that certain keywords, whose contextual meanings vary significantly across documents, can effectively differentiate documents within a corpus. The modeling process comprises two stages: obtaining contextual scores and performing psychometric analysis. In the first stage, we utilize natural language processing techniques and encoder based transformer models to identify common keywords and generate contextual scores. In the second stage, we employ various types of factor analysis, including exploratory and bifactor models, to extract and define latent factors, determine factor correlations, and identify the most significant words associated with each factor. Applied to the Wiki STEM corpus, our experimental results demonstrate the method's potential to uncover latent knowledge dimensions and patterns within textual data. This approach not only enhances the psychometric analysis of textual data but also holds promise for applications in fields rich in textual information, such as education, psychology, and law.


Can Multimodal LLMs See Materials Clearly? A Multimodal Benchmark on Materials Characterization

arXiv.org Artificial Intelligence

Materials characterization is fundamental to acquiring materials information, revealing the processing-microstructure-property relationships that guide material design and optimization. While multimodal large language models (MLLMs) have recently shown promise in generative and predictive tasks within materials science, their capacity to understand real-world characterization imaging data remains underexplored. To bridge this gap, we present MatCha, the first benchmark for materials characterization image understanding, comprising 1,500 questions that demand expert-level domain expertise. MatCha encompasses four key stages of materials research comprising 21 distinct tasks, each designed to reflect authentic challenges faced by materials scientists. Our evaluation of state-of-the-art MLLMs on MatCha reveals a significant performance gap compared to human experts. These models exhibit degradation when addressing questions requiring higher-level expertise and sophisticated visual perception. Simple few-shot and chain-of-thought prompting struggle to alleviate these limitations. These findings highlight that existing MLLMs still exhibit limited adaptability to real-world materials characterization scenarios. We hope MatCha will facilitate future research in areas such as new material discovery and autonomous scientific agents. MatCha is available at https://github.com/FreedomIntelligence/MatCha.


Rapid Manufacturing of Lightweight Drone Frames Using Single-Tow Architected Composites

arXiv.org Artificial Intelligence

The demand for lightweight and high-strength composite structures is rapidly growing in aerospace and robotics, particularly for optimized drone frames. However, conventional composite manufacturing methods struggle to achieve complex 3D architectures for weight savings and rely on assembling separate components, which introduce weak points at the joints. Additionally, maintaining continuous fiber reinforcement remains challenging, limiting structural efficiency. In this study, we demonstrate the lightweight Face Centered Cubic (FFC) lattice structured conceptualization of drone frames for weight reduction and complex topology fabrication through 3D Fiber Tethering (3DFiT) using continuous single tow fiber ensuring precise fiber alignment, eliminating weak points associated with traditional composite assembly. Mechanical testing demonstrates that the fabricated drone frame exhibits a high specific strength of around four to eight times the metal and thermoplastic, outperforming other conventional 3D printing methods. The drone frame weighs only 260 g, making it 10% lighter than the commercial DJI F450 frame, enhancing structural integrity and contributing to an extended flight time of three minutes, while flight testing confirms its stability and durability under operational conditions. The findings demonstrate the potential of single tow lattice truss-based drone frames, with 3DFiT serving as a scalable and efficient manufacturing method.


The Download: Trump's impact on science, and meet our climate and energy honorees

MIT Technology Review

The Download: Trump's impact on science, and meet our climate and energy honorees How Trump's policies are affecting early-career scientists--in their own words Every year MIT Technology Review celebrates accomplished young scientists, entrepreneurs, and inventors from around the world in our Innovators Under 35 list. We've just published the 2025 edition . This year, though, the context is different: The US scientific community is under attack. Since Donald Trump took office in January, his administration has fired top government scientists, targeted universities and academia, and made substantial funding cuts to the country's science and technology infrastructure. We asked our six most recent cohorts about both positive and negative impacts of the administration's new policies. Their responses provide a glimpse into the complexities of building labs, companies, and careers in today's political climate.


Benchmarking Universal Interatomic Potentials on Zeolite Structures

arXiv.org Artificial Intelligence

Interatomic potentials (IPs) with wide elemental coverage and high accuracy are powerful tools for high-throughput materials discovery. While the past few years witnessed the development of multiple new universal IPs that cover wide ranges of the periodic table, their applicability to target chemical systems should be carefully investigated. We benchmark several universal IPs using equilibrium zeolite structures as testbeds. We select a diverse set of universal IPs encompassing two major categories: (i) universal analytic IPs, including GFN-FF, UFF, and Dreiding; (ii) pretrained universal machine learning IPs (MLIPs), comprising CHGNet, ORB-v3, MatterSim, eSEN-30M-OAM, PFP-v7, and EquiformerV2-lE4-lF100-S2EFS-OC22. We compare them with established tailor-made IPs, SLC, ClayFF, and BSFF using experimental data and density functional theory (DFT) calculations with dispersion correction as the reference. The tested zeolite structures comprise pure silica frameworks and aluminosilicates containing copper species, potassium, and organic cations. We found that GFN-FF is the best among the tested universal analytic IPs, but it does not achieve satisfactory accuracy for highly strained silica rings and aluminosilicate systems. All MLIPs can well reproduce experimental or DFT-level geometries and energetics. Among the universal MLIPs, the eSEN-30M-OAM model shows the most consistent performance across all zeolite structures studied. These findings show that the modern pretrained universal MLIPs are practical tools in zeolite screening workflows involving various compositions.


RoboChemist: Long-Horizon and Safety-Compliant Robotic Chemical Experimentation

arXiv.org Artificial Intelligence

Robotic chemists promise to both liberate human experts from repetitive tasks and accelerate scientific discovery, yet remain in their infancy. Chemical experiments involve long-horizon procedures over hazardous and deformable substances, where success requires not only task completion but also strict compliance with experimental norms. To address these challenges, we propose \textit{RoboChemist}, a dual-loop framework that integrates Vision-Language Models (VLMs) with Vision-Language-Action (VLA) models. Unlike prior VLM-based systems (e.g., VoxPoser, ReKep) that rely on depth perception and struggle with transparent labware, and existing VLA systems (e.g., RDT, pi0) that lack semantic-level feedback for complex tasks, our method leverages a VLM to serve as (1) a planner to decompose tasks into primitive actions, (2) a visual prompt generator to guide VLA models, and (3) a monitor to assess task success and regulatory compliance. Notably, we introduce a VLA interface that accepts image-based visual targets from the VLM, enabling precise, goal-conditioned control. Our system successfully executes both primitive actions and complete multi-step chemistry protocols. Results show 23.57% higher average success rate and a 0.298 average increase in compliance rate over state-of-the-art VLA baselines, while also demonstrating strong generalization to objects and tasks.


Scientists are finally learning what's inside mysterious 'halo' barrels submerged off US coast

Daily Mail - Science & tech

The suspect in Charlie Kirk's assassination has been captured, FBI director Kash Patel announced MSNBC sparks outrage for'disgusting' Charlie Kirk comments following Utah shooting Tragedy as Charlie Kirk's wife left behind with two young children after conservative activist is fatally shot A DEI mayor, an inconvenient crime and video they never wanted you to see: MAUREEN CALLAHAN knows why the Left has sympathy for that killer... but none for his victim Sweater weather starts here - the cozy, chic pieces from Soft Surroundings you'll actually wear all season We only had one symptom we dismissed... but then we were diagnosed with the rarest form of melanoma Soft-touch prosecutor let felon walk free... before crook'slit Auburn professor's throat in random attack' I tried the 30 cent'miracle chill pill' before a big event.. now I'm taking it for everything Donald Trump and House Republicans lead prayers for Charlie Kirk's family after conservative star is fatally shot Prince Harry says his father King Charles is'great' following their first meeting in 19 months... which was over a cup of tea and just 55 minutes long Liberal media defends thug who killed Ukrainian woman in cold blood: 'This man was hurting' Knifeman accused of stabbing Ukrainian refugee to death gives chilling reason for the attack... as he speaks for the first time from jail on the murder that shocked America Fox News reveals new lineup and elevates star White House reporter who's sparred with Trump Horrific new details of passenger injuries after they were'thrown' around Delta flight during'severe turbulence' Scientists are finally learning what's inside mysterious'halo' barrels submerged off US coast Scientists are just beginning to learn what is inside thousands of mysterious'halo' barrels submerged off the US coast. The barrels were discovered in the deep waters of the San Pedro Basin, near Los Angeles, in 2021. Scientists were initially worried that the barrels could contain DDT, a toxic pesticide that was banned in 1972 due to its serious environmental and health impact. However, a new study now shows that the barrels contain an unknown caustic alkali waste, which is creating eerie halos as it leaches into the sea floor. Using the remotely operated vehicle (ROV) SuBastian, the researchers carefully collected samples at a set distance from barrels with halos.