AITopics | Materials

Materials

Four thoughts from Bill Gates on climate tech

MIT Technology Review, Oct-30-2025, 11:00:00 GMT

Why he thinks near-term targets can be a distraction, and what technologies he expects to power our future grid. Bill Gates doesn't shy away or pretend modesty when it comes to his stature in the climate world today. "Well, who's the biggest funder of climate innovation companies?" he asked a handful of journalists at a media roundtable event last week. "If there's someone else, I've never met them." The former Microsoft CEO has spent the last decade investing in climate technology through Breakthrough Energy, which he founded in 2015. Ahead of the UN climate meetings kicking off next week, Gates published a memo outlining what he thinks activists and negotiators should focus on and how he's thinking about the state of climate tech right now.

bill gate, climate tech, mit technology review, (13 more...)

MIT Technology Review

Country:

North America > United States > Massachusetts (0.05)
Asia > China (0.05)

Industry:

Materials (0.71)
Energy > Power Industry (0.50)
Media > News (0.35)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (1.00)

Artificial Intelligence for Direct Prediction of Molecular Dynamics Across Chemical Space

Ge, Fuchun, Dral, Pavlo O.

arXiv.org Artificial Intelligence, Oct-30-2025

Molecular dynamics (MD) is a powerful tool for exploring the behavior of atomistic systems, but its reliance on sequential numerical integration limits simulation efficiency. We present a novel neural network architecture, MDtrajNet, and a pre-trained foundational model, MDtrajNet-1, that directly generates MD trajectories across chemical space, bypassing force calculations and integration. This approach accelerates simulations by up to two orders of magnitude compared to traditional MD, even those enhanced by machine-learning interatomic potentials. MDtrajNet combines equivariant neural networks with a transformer-based architecture to achieve strong accuracy and transferability in predicting long-time trajectories. Remarkably, the errors of the trajectories generated by MDtrajNet-1 for various known and unseen molecular systems are close to those of the conventional ab initio MD. The architecture's flexible design supports diverse application scenarios, including different statistical ensembles, boundary conditions, and interaction types. By overcoming the intrinsic speed barrier of conventional MD, MDtrajNet opens new frontiers in efficient and scalable atomistic simulations.

artificial intelligence, machine learning, trajectory, (20 more...)

arXiv.org Artificial Intelligence

2505.16301

Country:

Europe (0.46)
Asia > China (0.28)
North America > United States (0.28)

Genre: Research Report (0.82)

Industry:

Energy (0.94)
Materials > Chemicals (0.50)
Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

MLPrE -- A tool for preprocessing and exploratory data analysis prior to machine learning model construction

Maxwell, David S, Darkoh, Michael, Samudrala, Sidharth R, Chung, Caroline, Schmidt, Stephanie T, Al-Lazikani, Bissan

arXiv.org Artificial Intelligence, Oct-30-2025

With the recent growth of Deep Learning for AI, there is a need for tools to meet the demand of data flowing into those models. In some cases, source data may exist in multiple formats, and therefore the source data must be investigated and properly engineered for a Machine Learning model or graph database. Overhead and lack of scalability with existing workflows limit integration within a larger processing pipeline such as Apache Airflow, driving the need for a robust, extensible, and lightweight tool to preprocess arbitrary datasets that scales with data type and size. To address this, we present Machine Learning Preprocessing and Exploratory Data Analysis, MLPrE, in which Spark DataFrames were utilized to hold data during processing and ensure scalability. A generalizable JSON input file format was utilized to describe stepwise changes to that DataFrame. Stages were implemented for input and output, filtering, basic statistics, feature engineering, and exploratory data analysis. A total of 69 stages were implemented into MLPrE, of which we highlight and demonstrate key stages using six diverse datasets. We further highlight MLPrE's ability to independently process multiple fields in flat files and recombine them, otherwise requiring an additional pipeline, using a UniProt glossary term dataset. Building on this advantage, we demonstrated the clustering stage with available wine quality data. Lastly, we demonstrate the preparation of data for a graph database in the final stages of MLPrE using phosphosite kinase data. Overall, our MLPrE tool offers a generalizable and scalable tool for preprocessing and early data analysis, filling a critical need for such a tool given the ever-expanding use of machine learning. This tool serves to accelerate and simplify early-stage development in larger workflows.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2510.25755

Country: North America > United States > Texas (0.15)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.94)
Materials > Chemicals (0.68)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

EnzyControl: Adding Functional and Substrate-Specific Control for Enzyme Backbone Generation

Song, Chao, Liu, Zhiyuan, Huang, Han, Wang, Liang, Wang, Qiong, Shi, Jianyu, Yu, Hui, Zhou, Yihang, Zhang, Yang

arXiv.org Artificial Intelligence, Oct-30-2025

Designing enzyme backbones with substrate-specific functionality is a critical challenge in computational protein engineering. Current generative models excel in protein design but face limitations in binding data, substrate-specific control, and flexibility for de novo enzyme backbone generation. To address this, we introduce EnzyBind, a dataset with 11,100 experimentally validated enzyme-substrate pairs specifically curated from PDBbind. Building on this, we propose EnzyControl, a method that enables functional and substrate-specific control in enzyme backbone generation. Our approach generates enzyme backbones conditioned on MSA-annotated catalytic sites and their corresponding substrates, which are automatically extracted from curated enzyme-substrate data. At the core of EnzyControl is EnzyAdapter, a lightweight, modular component integrated into a pretrained motif-scaffolding model, allowing it to become substrate-aware. A two-stage training paradigm further refines the model's ability to generate accurate and functional enzyme structures. Experiments show that our EnzyControl achieves the best performance across structural and functional metrics on EnzyBind and EnzyBench benchmarks, with particularly notable improvements of 13% in designability and 13% in catalytic efficiency compared to the baseline models. The code is released at https://github.com/Vecteur-libre/EnzyControl.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2510.25132

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Materials > Chemicals (0.93)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Cite Pretrain: Retrieval-Free Knowledge Attribution for Large Language Models

Huang, Yukun, Chen, Sanxing, Pei, Jian, Zaheer, Manzil, Dhingra, Bhuwan

arXiv.org Artificial Intelligence, Oct-30-2025

Trustworthy language models should provide both correct and verifiable answers. However, citations generated directly by standalone LLMs are often unreliable. As a result, current systems insert citations by querying an external retriever at inference time, introducing latency, infrastructure dependence, and vulnerability to retrieval noise. We explore whether LLMs can be made to reliably attribute to the documents seen during continual pretraining without test-time retrieval, by revising the training process. To study this, we construct CitePretrainBench, a benchmark that mixes real-world corpora (Wikipedia, Common Crawl, arXiv) with novel documents and probes both short-form (single-fact) and long-form (multi-fact) citation tasks. Our approach follows a two-stage process: (1) continual pretraining to index factual knowledge by binding it to persistent document identifiers; and (2) instruction tuning to elicit citation behavior. We introduce Active Indexing for the first stage, which creates generalizable, source-anchored bindings by augmenting training with synthetic data that (i) restate each fact in diverse, compositional forms and (ii) enforce bidirectional training (source-to-fact and fact-to-source). This equips the model to both generate content from a cited source and attribute its own answers, improving robustness to paraphrase and composition. Experiments with Qwen-2.5-7B&3B show that Active Indexing consistently outperforms a Passive Indexing baseline, which simply appends an identifier to each document, achieving citation precision gains of up to 30.2% across all tasks and models. Our ablation studies reveal that performance continues to improve as we scale the amount of augmented data, showing a clear upward trend even at 16x the original token count. Finally, we show that internal citations complement external ones by making the model more robust to retrieval noise.
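As a rough illustration of the contrast the abstract draws between Passive Indexing and the bidirectional part of Active Indexing, the sketch below builds toy training strings from a document identifier. The function names, the `[source: ...]` template, and the identifier `doc-0042` are hypothetical and not taken from the paper; the paper's augmentation also restates facts in diverse forms, which is omitted here.

```python
def passive_index(doc_id: str, text: str) -> str:
    """Passive baseline: simply append the identifier to the document text."""
    return f"{text}\n[source: {doc_id}]"


def active_index(doc_id: str, facts: list[str]) -> list[str]:
    """Toy bidirectional augmentation (hypothetical templates).

    For each fact, emit one example that goes from source to fact and one
    that goes from fact to source, so the binding is trained in both directions.
    """
    examples = []
    for fact in facts:
        examples.append(f"[source: {doc_id}] states: {fact}")  # source-to-fact
        examples.append(f"{fact} [source: {doc_id}]")          # fact-to-source
    return examples


doc_id = "doc-0042"  # hypothetical persistent document identifier
facts = ["Water boils at 100 degrees Celsius at sea level."]

passive_example = passive_index(doc_id, " ".join(facts))
active_examples = active_index(doc_id, facts)
```

Each fact yields two examples in the active variant, one per direction, whereas the passive variant yields a single string per document.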

computational linguistic, large language model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2506.17585

Country:

North America > United States (1.00)
Asia (0.68)
Europe > Austria > Vienna (0.15)

Genre: Research Report (0.52)

Industry:

Materials > Chemicals > Commodity Chemicals (0.46)
Media (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Geometric Mixture Models for Electrolyte Conductivity Prediction

Li, Anyi, Cen, Jiacheng, Li, Songyou, Li, Mingze, Yu, Yang, Huang, Wenbing

arXiv.org Artificial Intelligence, Oct-29-2025

Accurate prediction of ionic conductivity in electrolyte systems is crucial for advancing numerous scientific and technological applications. While significant progress has been made, current research faces two fundamental challenges: (1) the lack of high-quality standardized benchmarks, and (2) inadequate modeling of geometric structure and intermolecular interactions in mixture systems. To address these limitations, we first reorganize and enhance the CALiSol and DiffMix electrolyte datasets by incorporating geometric graph representations of molecules. We then propose GeoMix, a novel geometry-aware framework that preserves Set-SE(3) equivariance, an essential but challenging property for mixture systems. At the heart of GeoMix lies the Geometric Interaction Network (GIN), an equivariant module specifically designed for intermolecular geometric message passing. Comprehensive experiments demonstrate that GeoMix consistently outperforms diverse baselines (including MLPs, GNNs, and geometric GNNs) across both datasets, validating the importance of cross-molecular geometric interactions and equivariant message passing for accurate property prediction. This work not only establishes new benchmarks for electrolyte research but also provides a general geometric learning framework that advances modeling of mixture systems in energy materials, pharmaceutical development, and beyond.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2510.15403

Genre: Research Report (0.50)

Industry:

Materials > Chemicals > Commodity Chemicals > Petrochemicals (1.00)
Energy > Energy Storage (0.68)
Electrical Industrial Apparatus (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

An Adaptive Inspection Planning Approach Towards Routine Monitoring in Uncertain Environments

Viswanathan, Vignesh Kottayam, Bai, Yifan, Fredriksson, Scott, Satpute, Sumeet, Kanellakis, Christoforos, Nikolakopoulos, George

arXiv.org Artificial Intelligence, Oct-29-2025

In this work, we present a hierarchical framework designed to support robotic inspection under environment uncertainty. By leveraging a known environment model, existing methods plan and safely track inspection routes to visit points of interest. However, discrepancies between the model and actual site conditions, caused by either natural or human activities, can alter the surface morphology or introduce path obstructions. To address this challenge, the proposed framework divides the inspection task into: (a) generating the initial global view-plan for regions of interest based on a historical map and (b) local view replanning to adapt to the current morphology of the inspection scene. The proposed hierarchy preserves global coverage objectives while enabling reactive adaptation to the local surface morphology. This enables the local autonomy to remain robust against environment uncertainty and complete the inspection tasks. We validate the approach through deployments in real-world subterranean mines using a quadrupedal robot.

artificial intelligence, inspection, surface morphology, (15 more...)

arXiv.org Artificial Intelligence

2510.24554

Genre: Research Report (0.64)

Industry: Materials > Metals & Mining (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

Causal Convolutional Neural Networks as Finite Impulse Response Filters

Bacsa, Kiran, Liu, Wei, Jian, Xudong, Liang, Huangbin, Chatzi, Eleni

arXiv.org Artificial Intelligence, Oct-29-2025

This study investigates the behavior of Causal Convolutional Neural Networks (CNNs) with quasi-linear activation functions when applied to time-series data characterized by multimodal frequency content. We demonstrate that, once trained, such networks exhibit properties analogous to Finite Impulse Response (FIR) filters, particularly when the convolutional kernels are of extended length exceeding those typically employed in standard CNN architectures. Causal CNNs are shown to capture spectral features both implicitly and explicitly, offering enhanced interpretability for tasks involving dynamic systems. Leveraging the associative property of convolution, we further show that the entire network can be reduced to an equivalent single-layer filter resembling an FIR filter optimized via least-squares criteria. This equivalence yields new insights into the spectral learning behavior of CNNs trained on signals with sparse frequency content. The approach is validated on both simulated beam dynamics and real-world bridge vibration datasets, underlining its relevance for modeling and identifying physical systems governed by dynamic responses.

Neural networks have enjoyed widespread adoption across various modeling tasks, despite the common pitfall of typically comprising black box models that are often difficult to interpret [1]. It is therefore challenging to tailor a neural network model according to the characteristics of a specific problem: how can we introduce a bias inside a black box? A common way to introduce biases is through the architecture of the neural network. For example, Convolutional Neural Networks employ convolutional kernels to force the network to focus on local correlations, which is different from the global connectivity of Multi-Layer Perceptrons. This bias is useful for image processing tasks, where the information of a single pixel is highly correlated with its surrounding pixels [2].

For physics-informed neural networks [3], the bias to be introduced should reflect the prior knowledge on the physical laws that govern the phenomenon that the model is trying to replicate. Due to the black box nature of neural networks, such biases need to be implemented explicitly, e.g. with a physics-informed loss function, rather than an implicit bias in the architecture of the model. In the case of the dynamical behavior of physical systems, a desirable bias should capture the dynamic properties of a system.
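The reduction of a stacked causal network to a single filter can be checked numerically in the idealized case. The sketch below assumes purely linear (identity) activations, a single channel, no bias terms, and the true-convolution convention; the paper's setting uses quasi-linear activations, so the equivalence would only be approximate there. Kernel lengths and the random data are arbitrary choices for illustration.

```python
import numpy as np


def causal_conv(x, kernel):
    """Causal FIR filtering: y[n] = sum_k kernel[k] * x[n - k], with x[n] = 0 for n < 0."""
    # The full convolution has len(x) + len(kernel) - 1 samples; keeping the
    # first len(x) samples gives an output that depends only on past inputs.
    return np.convolve(x, kernel)[: len(x)]


rng = np.random.default_rng(0)
x = rng.standard_normal(256)   # toy input signal
k1 = rng.standard_normal(16)   # kernels of three stacked linear layers
k2 = rng.standard_normal(16)
k3 = rng.standard_normal(16)

# Apply the three layers one after another.
y_layers = causal_conv(causal_conv(causal_conv(x, k1), k2), k3)

# By associativity, the stack collapses to one FIR filter whose taps are
# the convolution of the individual kernels.
k_equiv = np.convolve(np.convolve(k1, k2), k3)
y_single = causal_conv(x, k_equiv)

assert np.allclose(y_layers, y_single)
```

The equivalent filter has 16 + 16 + 16 - 2 = 46 taps, which illustrates why depth in a linear causal stack effectively lengthens the impulse response of the overall filter.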

artificial intelligence, machine learning, neural network, (17 more...)

arXiv.org Artificial Intelligence

2510.24125

Country:

Europe (1.00)
North America > United States (0.93)

Genre: Research Report (1.00)

Industry:

Materials > Construction Materials (0.93)
Transportation (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

MolErr2Fix: Benchmarking LLM Trustworthiness in Chemistry via Modular Error Detection, Localization, Explanation, and Revision

Wu, Yuyang, Ye, Jinhui, Zhang, Shuhao, Dai, Lu, Bisk, Yonatan, Isayev, Olexandr

arXiv.org Artificial Intelligence, Oct-29-2025

Large Language Models (LLMs) have shown growing potential in molecular sciences, but they often produce chemically inaccurate descriptions and struggle to recognize or justify potential errors. This raises important concerns about their robustness and reliability in scientific applications. To support more rigorous evaluation of LLMs in chemical reasoning, we present the MolErr2Fix benchmark, designed to assess LLMs on error detection and correction in molecular descriptions. Unlike existing benchmarks focused on molecule-to-text generation or property prediction, MolErr2Fix emphasizes fine-grained chemical understanding. It tasks LLMs with identifying, localizing, explaining, and revising potential structural and semantic errors in molecular descriptions. Specifically, MolErr2Fix consists of 1,193 fine-grained annotated error instances. Each instance contains quadruple annotations, i.e., (error type, span location, explanation, correction). These tasks are intended to reflect the types of reasoning and verification required in real-world chemical communication. Evaluations of current state-of-the-art LLMs reveal notable performance gaps, underscoring the need for more robust chemical reasoning capabilities. MolErr2Fix provides a focused benchmark for evaluating such capabilities and aims to support progress toward more reliable and chemically informed language models. All annotations and an accompanying evaluation API will be publicly released to facilitate future research.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2509.00063

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Industry:

Materials > Chemicals > Commodity Chemicals (0.69)
Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

MOOSE-Chem3: Toward Experiment-Guided Hypothesis Ranking via Simulated Experimental Feedback

Liu, Wanhao, Yang, Zonglin, Wang, Jue, Bing, Lidong, Zhang, Di, Zhou, Dongzhan, Li, Yuqiang, Li, Houqiang, Cambria, Erik, Ouyang, Wanli

arXiv.org Artificial Intelligence, Oct-28-2025

Hypothesis ranking is vital for automated scientific discovery, especially in cost-intensive, throughput-limited natural science domains. Current methods focus on pre-experiment ranking, relying solely on language model reasoning without empirical feedback. We introduce experiment-guided ranking, which prioritizes hypotheses based on feedback from prior tests. Due to the impracticality of real experiments, we propose a simulator grounded in domain-specific concepts that models hypothesis performance as a function of similarity to a hidden ground truth, perturbed by noise. Validated against 124 hypotheses with experimentally reported outcomes, the simulator approximates real results with consistent trend alignment. Although deviations exist, they mimic wet-lab noise, promoting more robust ranking strategies. We frame experiment-guided ranking as a sequential decision-making problem and propose an in-context reinforcement learning (ICRL) framework. Our LLM-based policy decomposes hypotheses into functional elements, clusters them by mechanistic roles, and prioritizes recombinations based on feedback. Experiments show our approach significantly outperforms pre-experiment baselines and strong ablations. Our toolkit, comprising the simulator and ICRL framework, enables systematic research on experiment-guided ranking, with the policy serving as a strong proof of concept.
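The simulator idea in this abstract, performance as a function of similarity to a hidden ground truth plus noise, can be sketched in a few lines. Everything concrete below is an assumption made for illustration: hypotheses are represented as small feature vectors, similarity is cosine similarity, and the noise is Gaussian. The paper's simulator is grounded in domain-specific concepts and may differ on all three points.

```python
import numpy as np


def simulated_outcome(hypothesis, ground_truth, noise_std=0.05, rng=None):
    """Toy experimental feedback: cosine similarity to a hidden ground truth plus Gaussian noise."""
    if rng is None:
        rng = np.random.default_rng()
    h = np.asarray(hypothesis, dtype=float)
    g = np.asarray(ground_truth, dtype=float)
    similarity = float(h @ g / (np.linalg.norm(h) * np.linalg.norm(g)))
    return similarity + float(rng.normal(0.0, noise_std))


rng = np.random.default_rng(0)
ground_truth = np.array([1.0, 0.0, 1.0, 1.0])  # hidden from the ranking policy

# Hypothetical candidate hypotheses, encoded as binary feature vectors.
candidates = {
    "A": [1, 0, 1, 1],
    "B": [1, 1, 0, 0],
    "C": [0, 1, 0, 0],
}

# Query the simulator once per candidate and rank by the noisy feedback.
scores = {name: simulated_outcome(vec, ground_truth, 0.05, rng)
          for name, vec in candidates.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

A ranking policy would only ever see `scores`, never `ground_truth`, which is what lets a simulator of this kind stand in for costly real experiments when comparing ranking strategies.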

large language model, machine learning, reinforcement learning, (22 more...)

arXiv.org Artificial Intelligence

2505.17873

Country: Asia (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Energy (1.00)
Health & Medicine (0.93)
Materials > Chemicals > Commodity Chemicals (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.54)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
