Materials
Deep Learning Models for Colloidal Nanocrystal Synthesis
Gu, Kai, Liang, Yingping, Su, Jiaming, Sun, Peihan, Peng, Jia, Miao, Naihua, Sun, Zhimei, Fu, Ying, Zhong, Haizheng, Zhang, Jun
Colloidal synthesis of nanocrystals usually includes complex chemical reactions and multi-step crystallization processes. Despite the great success in the past 30 years, it remains challenging to clarify the correlations between synthetic parameters of chemical reaction and physical properties of nanocrystals. Here, we developed a deep learning-based nanocrystal synthesis model that correlates synthetic parameters with the final size and shape of target nanocrystals, using a dataset of 3500 recipes covering 348 distinct nanocrystal compositions. The size and shape labels were obtained from transmission electron microscope images using a segmentation model trained with a semi-supervised algorithm on a dataset comprising 1.2 million nanocrystals. By applying the reaction intermediate-based data augmentation method and elaborated descriptors, the synthesis model was able to predict nanocrystal's size with a mean absolute error of 1.39 nm, while reaching an 89% average accuracy for shape classification. The synthesis model shows knowledge transfer capabilities across different nanocrystals with inputs of new recipes. With that, the influence of chemicals on the final size of nanocrystals was further evaluated, revealing the importance order of nanocrystal composition, precursor or ligand, and solvent. Overall, the deep learning-based nanocrystal synthesis model offers a powerful tool to expedite the development of high-quality nanocrystals.
GAF: Gaussian Avatar Reconstruction from Monocular Videos via Multi-view Diffusion
Tang, Jiapeng, Davoli, Davide, Kirschstein, Tobias, Schoneveld, Liam, Niessner, Matthias
We propose a novel approach for reconstructing animatable 3D Gaussian avatars from monocular videos captured by commodity devices like smartphones. Photorealistic 3D head avatar reconstruction from such recordings is challenging due to limited observations, which leaves unobserved regions under-constrained and can lead to artifacts in novel views. To address this problem, we introduce a multi-view head diffusion model, leveraging its priors to fill in missing regions and ensure view consistency in Gaussian splatting renderings. To enable precise viewpoint control, we use normal maps rendered from FLAME-based head reconstruction, which provides pixel-aligned inductive biases. We also condition the diffusion model on VAE features extracted from the input image to preserve details of facial identity and appearance. For Gaussian avatar reconstruction, we distill multi-view diffusion priors by using iteratively denoised images as pseudo-ground truths, effectively mitigating over-saturation issues. To further improve photorealism, we apply latent upsampling to refine the denoised latent before decoding it into an image. We evaluate our method on the NeRSemble dataset, showing that GAF outperforms the previous state-of-the-art methods in novel view synthesis by a 5.34\% higher SSIM score. Furthermore, we demonstrate higher-fidelity avatar reconstructions from monocular videos captured on commodity devices.
Foundational Large Language Models for Materials Research
Mishra, Vaibhav, Singh, Somaditya, Ahlawat, Dhruv, Zaki, Mohd, Bihani, Vaibhav, Grover, Hargun Singh, Mishra, Biswajit, Miret, Santiago, Mausam, null, Krishnan, N. M. Anoop
Materials discovery and development are critical for addressing global challenges. Yet, the exponential growth in materials science literature comprising vast amounts of textual data has created significant bottlenecks in knowledge extraction, synthesis, and scientific reasoning. Large Language Models (LLMs) offer unprecedented opportunities to accelerate materials research through automated analysis and prediction. Still, their effective deployment requires domain-specific adaptation for understanding and solving domain-relevant tasks. Here, we present LLaMat, a family of foundational models for materials science developed through continued pretraining of LLaMA models on an extensive corpus of materials literature and crystallographic data. Through systematic evaluation, we demonstrate that LLaMat excels in materials-specific NLP and structured information extraction while maintaining general linguistic capabilities. The specialized LLaMat-CIF variant demonstrates unprecedented capabilities in crystal structure generation, predicting stable crystals with high coverage across the periodic table. Intriguingly, despite LLaMA-3's superior performance in comparison to LLaMA-2, we observe that LLaMat-2 demonstrates unexpectedly enhanced domain-specific performance across diverse materials science tasks, including structured information extraction from text and tables, more particularly in crystal structure generation, a potential adaptation rigidity in overtrained LLMs. Altogether, the present work demonstrates the effectiveness of domain adaptation towards developing practically deployable LLM copilots for materials research. Beyond materials science, our findings reveal important considerations for domain adaptation of LLMs, such as model selection, training methodology, and domain-specific performance, which may influence the development of specialized scientific AI systems.
Language model driven: a PROTAC generation pipeline with dual constraints of structure and property
Shao, Jinsong, Gong, Qineng, Yin, Zeyu, Chen, Yu, Hao, Yajie, Zhang, Lei, Jiang, Linlin, Yao, Min, Li, Jinlong, Wang, Fubo, Wang, Li
The imperfect modeling of ternary complexes has limited the application of computer-aided drug discovery tools in PROTAC research and development. In this study, an AI-assisted approach for PROTAC molecule design pipeline named LM-PROTAC was developed, which stands for language model driven Proteolysis Targeting Chimera, by embedding a transformer-based generative model with dual constraints on structure and properties, referred to as the DCT. This study utilized the fragmentation representation of molecules and developed a language model driven pipeline. Firstly, a language model driven affinity model for protein compounds to screen molecular fragments with high affinity for the target protein. Secondly, structural and physicochemical properties of these fragments were constrained during the generation process to meet specific scenario requirements. Finally, a two-round screening of the preliminary generated molecules using a multidimensional property prediction model to generate a batch of PROTAC molecules capable of degrading disease-relevant target proteins for validation in vitro experiments, thus achieving a complete solution for AI-assisted PROTAC drug generation. Taking the tumor key target Wnt3a as an example, the LM-PROTAC pipeline successfully generated PROTAC molecules capable of inhibiting Wnt3a. The results show that DCT can efficiently generate PROTAC that targets and hydrolyses Wnt3a.
OG-RAG: Ontology-Grounded Retrieval-Augmented Generation For Large Language Models
Sharma, Kartik, Kumar, Peeyush, Li, Yunqing
This paper presents OG-RAG, an Ontology-Grounded Retrieval Augmented Generation method designed to enhance LLM-generated responses by anchoring retrieval processes in domain-specific ontologies. While LLMs are widely used for tasks like question answering and search, they struggle to adapt to specialized knowledge, such as industrial workflows or knowledge work, without expensive fine-tuning or sub-optimal retrieval methods. Existing retrieval-augmented models, such as RAG, offer improvements but fail to account for structured domain knowledge, leading to suboptimal context generation. Ontologies, which conceptually organize domain knowledge by defining entities and their interrelationships, offer a structured representation to address this gap. OG-RAG constructs a hypergraph representation of domain documents, where each hyperedge encapsulates clusters of factual knowledge grounded using domain-specific ontology. An optimization algorithm then retrieves the minimal set of hyperedges that constructs a precise, conceptually grounded context for the LLM. This method enables efficient retrieval while preserving the complex relationships between entities. OG-RAG applies to domains where fact-based reasoning is essential, particularly in tasks that require workflows or decision-making steps to follow predefined rules and procedures. These include industrial workflows in healthcare, legal, and agricultural sectors, as well as knowledge-driven tasks such as news journalism, investigative research, consulting and more. Our evaluations demonstrate that OG-RAG increases the recall of accurate facts by 55% and improves response correctness by 40% across four different LLMs. Additionally, OG-RAG enables 30% faster attribution of responses to context and boosts fact-based reasoning accuracy by 27% compared to baseline methods.
Quantum Kernel-Based Long Short-term Memory for Climate Time-Series Forecasting
Hsu, Yu-Chao, Chen, Nan-Yow, Li, Tai-Yu, Po-Heng, null, Lee, null, Chen, Kuan-Cheng
--We present the Quantum Kernel-Based Long Short-T erm Memory (QK-LSTM) network, which integrates quantum kernel methods into classical LSTM architectures to enhance predictive accuracy and computational efficiency in climate time-series forecasting tasks, such as Air Quality Index (AQI) prediction. Leveraging quantum kernel methods allows for efficient computation of inner products in quantum spaces, addressing the computational challenges faced by classical models and variational quantum circuit-based models. Designed for the Noisy Intermediate-Scale Quantum (NISQ) era, QK-LSTM supports scalable hybrid quantum-classical implementations. Experimental results demonstrate that QK-LSTM outperforms classical LSTM networks in AQI forecasting, showcasing its potential for environmental monitoring and resource-constrained scenarios, while highlighting the broader applicability of quantum-enhanced machine learning frameworks in tackling large-scale, high-dimensional climate datasets. Climate time-series forecasting is essential for understanding and predicting environmental phenomena, which has significant implications for public health [1], resource management [2], and policy-making [3]. Accurate forecasting of climatic variables such as temperature, precipitation, and pollutant concentrations enables proactive measures to mitigate adverse effects associated with climate variability and change.
Dual Random Fields and their Application to Mineral Potential Mapping
In various geosciences branches, including mineral exploration, geometallurgical characterization on established mining operations, and remote sensing, the regionalized input variables are spatially well-sampled across the domain of interest, limiting the scope of spatial uncertainty quantification procedures. In turn, response outcomes such as the mineral potential in a given region, mining throughput, metallurgical recovery, or in-situ estimations from remote satellite imagery, are usually modeled from a much-restricted subset of testing samples, collected at certain locations due to accessibility restrictions and the high acquisition costs. Our limited understanding of these functions, in terms of the multi-dimensional complexity of causalities and unnoticed dependencies on inaccessible inputs, may lead to observing changes in such functions based on their geographical location. Pooling together different response functions across the domain is critical to correctly predict outcome responses, the uncertainty associated with these inferred values, and the significance of inputs in such predictions at unexplored areas. This paper introduces the notion of a dual random field (dRF), where the response function itself is considered a regionalized variable. In this way, different established response models across the geographic domain can be considered as observations of a dRF realization, enabling the spatial inference and uncertainty assessment of both response models and their predictions. We explain how dRFs inherit all the properties from classical random fields, allowing the use of standard Gaussian simulation procedures to simulate them. These models are combined to obtain a mineral potential response, providing an example of how to rigorously integrate machine learning approaches with geostatistics.
Combining knowledge graphs and LLMs for hazardous chemical information management and reuse
Da Silveira, Marcos, Deladiennee, Louis, Acem, Kheira, Freudenthal, Oona
Human health is increasingly threatened by exposure to hazardous substances, particularly persistent and toxic chemicals. The link between these substances, often encountered in complex mixtures, and various diseases are demonstrated in scientific studies. However, this information is scattered across several sources and hardly accessible by humans and machines. This paper evaluates current practices for publishing/accessing information on hazardous chemicals and proposes a novel platform designed to facilitate retrieval of critical chemical data in urgent situations. The platform aggregates information from multiple sources and organizes it into a structured knowledge graph. Users can access this information through a visual interface such as Neo4J Bloom and dashboards, or via natural language queries using a Chatbot. Our findings demonstrate a significant reduction in the time and effort required to access vital chemical information when datasets follow FAIR principles. Furthermore, we discuss the lessons learned from the development and implementation of this platform and provide recommendations for data owners and publishers to enhance data reuse and interoperability. This work aims to improve the accessibility and usability of chemical information by healthcare professionals, thereby supporting better health outcomes and informed decision-making in the face of patients exposed to chemical intoxication risks.
Optimization-Driven Design of Monolithic Soft-Rigid Grippers
Mansueto, Pierluigi, Dragusanu, Mihai, Saeed, Anjum, Malvezzi, Monica, Lapucci, Matteo, Salvietti, Gionata
Sim-to-real transfer remains a significant challenge in soft robotics due to the unpredictability introduced by common manufacturing processes such as 3D printing and molding. These processes often result in deviations from simulated designs, requiring multiple prototypes before achieving a functional system. In this study, we propose a novel methodology to address these limitations by combining advanced rapid prototyping techniques and an efficient optimization strategy. Firstly, we employ rapid prototyping methods typically used for rigid structures, leveraging their precision to fabricate compliant components with reduced manufacturing errors. Secondly, our optimization framework minimizes the need for extensive prototyping, significantly reducing the iterative design process. The methodology enables the identification of stiffness parameters that are more practical and achievable within current manufacturing capabilities. The proposed approach demonstrates a substantial improvement in the efficiency of prototype development while maintaining the desired performance characteristics. This work represents a step forward in bridging the sim-to-real gap in soft robotics, paving the way towards a faster and more reliable deployment of soft robotic systems.
The 50 greatest innovations of 2024
In 1988, we launched the Best of What's New Awards. The original list highlighted "the very things that make our lives more comfortable, more rewarding, more exciting, and more fun," to quote then-Publisher Grant A. Burnett. Now, in 2024, we continue our decades-old tradition of honoring big ideas. We even see hints of our original honorees in this year's list: Sea-Doo and Ford made both lists, 36 years apart. We're proud to bring you promising innovations--from things that make life at home easier to literal out-of-this-world explorations. This is the Best of What's New 2024. Had you asked me at the beginning of 2024 what our best gadgets list would look like, I'd have guessed it would be filled with quirky AI-driven devices like the rabbit R1 or the Humane Ai Pin. "Now with AI" is a phrase that has dominated consumer electronics in the 2020s. These devices promised unadulterated access to the power of neural networks in ways that would seamlessly integrate into our lives without relying on phones or smart fridges. Then, the devices came out. The software is slow and buggy, and the hardware is clunky. Maybe the stand-alone AI device will still have its year, and we'll look back and chuckle at these humble beginnings. In reality, 2024's big breakthrough came from Apple in the form of its long-rumored Vision Pro headset. The device has its own hurdles to clear, but after just a few minutes of using it, it was clear that it's something different, important, and honestly pretty amazing. The list also includes Sony's innovative pro-grade camera, the most accessible drone we've ever used, and a no-fun phone--no fun in a good way, of course. Credible rumors of Apple's VR bounced around the gadget blogs and tech sites for nearly a decade. It was consumer tech's sasquatch in that people claimed to have seen it, but no one knew if it even existed. Then, the Vision Pro emerged from the proverbial forest in February with a surprising design and a massive 3,500 price tag. It also came toting a new R-series chip and a dedicated OS meant for spatial computing.