AITopics

2502.00363

Country:

North America > United States > Wisconsin (0.41)
North America > United States > Texas > Tarrant County > Arlington (0.14)
Asia > Middle East > Iran > Tehran Province > Tehran (0.04)

Genre:

Research Report > Experimental Study (0.94)
Research Report > New Finding (0.68)

Industry:

Materials > Construction Materials (1.00)
Health & Medicine (1.00)
Water & Waste Management > Water Management > Water Supplies & Services (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

arXiv.org Artificial Intelligence Jan-31-2025

MINDSTORES: Memory-Informed Neural Decision Synthesis for Task-Oriented Reinforcement in Embodied Systems

Chari, Anirudh, Reddy, Suraj, Tiwari, Aditya, Lian, Richard, Zhou, Brian

While large language models (LLMs) have shown promising capabilities as zero-shot planners for embodied agents, their inability to learn from experience and build persistent mental models limits their robustness in complex open-world environments like Minecraft. We introduce MINDSTORES, an experience-augmented planning framework that enables embodied agents to build and leverage mental models through natural interaction with their environment. Drawing inspiration from how humans construct and refine cognitive mental models, our approach extends existing zero-shot LLM planning by maintaining a database of past experiences that informs future planning iterations. The key innovation is representing accumulated experiences as natural language embeddings of (state, task, plan, outcome) tuples, which can then be efficiently retrieved and reasoned over by an LLM planner to generate insights and guide plan refinement for novel states and tasks. Through extensive experiments in MineDojo, a Minecraft simulation environment that provides low-level controls for agents, we find that MINDSTORES learns and applies its knowledge significantly better than existing memory-based LLM planners while maintaining the flexibility and generalization benefits of zero-shot approaches, representing an important step toward more capable embodied AI systems that can learn continuously through natural experience.
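The retrieval step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Experience` class, the `retrieve` function, and the example memory are hypothetical, and the `embed` function is a toy bag-of-words stand-in for whatever learned text embedding the framework actually uses. The sketch only shows the general idea of storing (state, task, plan, outcome) tuples as text and ranking them by cosine similarity to a new state and task.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Experience:
    # Hypothetical container for one (state, task, plan, outcome) tuple.
    state: str
    task: str
    plan: str
    outcome: str

    def text(self) -> str:
        # Natural-language rendering of the tuple, used as the embedding input.
        return f"{self.state} {self.task} {self.plan} {self.outcome}"


def embed(text: str) -> Counter:
    # ASSUMPTION: toy bag-of-words vector standing in for a learned embedding.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(memory, state, task, k=2):
    # Rank stored experiences by similarity to the new (state, task) query.
    query = embed(f"{state} {task}")
    ranked = sorted(memory, key=lambda e: cosine(query, embed(e.text())), reverse=True)
    return ranked[:k]


# Illustrative memory; the entries are invented for this example.
memory = [
    Experience("forest daytime", "collect wood", "punch tree", "success"),
    Experience("cave dark", "mine iron", "use wooden pickaxe", "failure"),
    Experience("plains daytime", "hunt cow", "attack with sword", "success"),
]
top = retrieve(memory, "cave dark", "mine iron", k=1)
```

In a planner loop, the retrieved experiences (here, a failed attempt to mine iron with a wooden pickaxe) would be placed in the LLM prompt so that the next plan can account for the earlier outcome.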

artificial intelligence, large language model, natural language

2501.19318

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Florida > Miami-Dade County > Miami (0.04)

Genre:

Research Report (0.83)
Workflow (0.69)

Industry:

Leisure & Entertainment > Games > Computer Games (0.96)
Materials > Metals & Mining > Iron (0.69)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Masinelli, Giulio, Rajani, Chang, Hoffmann, Patrik, Wasmer, Kilian, Atienza, David

Reinforcement Learning on Reconfigurable Hardware: Overcoming Material Variability in Laser Material Processing

arXiv.org Artificial Intelligence Jan-31-2025

Ensuring consistent processing quality is challenging in laser processes due to varying material properties and surface conditions. Although some approaches have shown promise in solving this problem via automation, they often rely on predetermined targets or are limited to simulated environments. To address these shortcomings, we propose a novel real-time reinforcement learning approach for laser process control, implemented on a Field Programmable Gate Array to achieve real-time execution. Our experimental results from laser welding tests on stainless steel samples with a range of surface roughnesses validated the method's ability to adapt autonomously, without relying on reward engineering or prior setup information. Specifically, the algorithm learned the correct power profile for each unique surface characteristic, demonstrating significant improvements over hand-engineered optimal constant power strategies -- up to 23% better performance on rougher surfaces and 7% on mixed surfaces. This approach represents a significant advancement in automating and optimizing laser processes, with potential applications across multiple industries.

artificial intelligence, machine learning, reinforcement learning

2501.19102

Country:

Europe > Switzerland > Vaud > Lausanne (0.04)
North America > United States (0.04)
North America > Canada (0.04)
Europe > Spain (0.04)

Genre: Research Report (0.82)

Industry: Materials (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Solé, Àlex, Mosella-Montoro, Albert, Cardona, Joan, Gómez-Coca, Silvia, Aravena, Daniel, Ruiz, Eliseo, Ruiz-Hidalgo, Javier

A Cartesian Encoding Graph Neural Network for Crystal Structures Property Prediction: Application to Thermal Ellipsoid Estimation

In diffraction-based crystal structure analysis, thermal ellipsoids, quantified via Anisotropic Displacement Parameters (ADPs), are critical yet challenging to determine. ADPs capture atomic vibrations, reflecting thermal and structural properties, but traditional computation is often expensive. This paper introduces CartNet, a novel graph neural network (GNN) for efficiently predicting crystal properties by encoding atomic geometry into Cartesian coordinates alongside the crystal temperature. CartNet integrates a neighbour equalization technique to emphasize covalent and contact interactions, and a Cholesky-based head to ensure valid ADP predictions. We also propose a rotational SO(3) data augmentation strategy during training to handle unseen orientations. An ADP dataset with over 200,000 experimental crystal structures from the Cambridge Structural Database (CSD) was curated to validate the approach. CartNet significantly reduces computational costs and outperforms existing methods in ADP prediction by 10.87%, while delivering a 34.77% improvement over theoretical approaches. We further evaluated CartNet on other datasets covering formation energy, band gap, total energy, energy above the convex hull, bulk moduli, and shear moduli, achieving 7.71% better results on the Jarvis Dataset and 13.16% on the Materials Project Dataset. These gains establish CartNet as a state-of-the-art solution for diverse crystal property predictions. Project website and online demo: https://www.ee.ub.edu/cartnet
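Why a Cholesky-based head yields valid ADPs can be shown with a short sketch. A physically meaningful ADP is a symmetric positive-definite 3x3 matrix, and any product L·Lᵀ with L lower-triangular and a strictly positive diagonal has that property. The code below is an illustration under stated assumptions, not CartNet's actual head: the function names are hypothetical, and using softplus to keep the diagonal positive is an assumed choice that the abstract does not specify.

```python
import math


def softplus(x: float) -> float:
    # Smooth map from any real number to a positive one (numerically stable form).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def adp_from_raw(raw):
    """Map 6 unconstrained numbers to a symmetric positive-definite 3x3 matrix."""
    d0, d1, d2, o10, o20, o21 = raw
    # ASSUMPTION: softplus on the diagonal; the off-diagonal entries are unconstrained.
    L = [
        [softplus(d0), 0.0, 0.0],
        [o10, softplus(d1), 0.0],
        [o20, o21, softplus(d2)],
    ]
    # U = L @ L^T, written out without external libraries.
    return [[sum(L[i][k] * L[j][k] for k in range(3)) for j in range(3)] for i in range(3)]


def is_positive_definite(U) -> bool:
    # Sylvester's criterion: all leading principal minors must be positive.
    m1 = U[0][0]
    m2 = U[0][0] * U[1][1] - U[0][1] * U[1][0]
    m3 = (
        U[0][0] * (U[1][1] * U[2][2] - U[1][2] * U[2][1])
        - U[0][1] * (U[1][0] * U[2][2] - U[1][2] * U[2][0])
        + U[0][2] * (U[1][0] * U[2][1] - U[1][1] * U[2][0])
    )
    return m1 > 0 and m2 > 0 and m3 > 0


# Arbitrary raw network outputs, including a negative diagonal input.
U = adp_from_raw([-2.0, 0.5, 3.0, 1.5, -0.7, 0.2])
```

Because the constraint is built into the parameterization, no post-hoc projection or rejection of invalid predictions is needed.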

artificial intelligence, deep learning, machine learning

doi: 10.1039/D4DD00352G

2501.18369

Country: Europe (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Promising Solution (0.66)

Industry:

Materials > Chemicals (0.46)
Energy > Oil & Gas (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming

Sharma, Mrinank, Tong, Meg, Mu, Jesse, Wei, Jerry, Kruthoff, Jorrit, Goodfriend, Scott, Ong, Euan, Peng, Alwin, Agarwal, Raj, Anil, Cem, Askell, Amanda, Bailey, Nathan, Benton, Joe, Bluemke, Emma, Bowman, Samuel R., Christiansen, Eric, Cunningham, Hoagy, Dau, Andy, Gopal, Anjali, Gilson, Rob, Graham, Logan, Howard, Logan, Kalra, Nimit, Lee, Taesung, Lin, Kevin, Lofgren, Peter, Mosconi, Francesco, O'Hara, Clare, Olsson, Catherine, Petrini, Linda, Rajani, Samir, Saxena, Nikhil, Silverstein, Alex, Singh, Tanya, Sumers, Theodore, Tang, Leonard, Troy, Kevin K., Weisser, Constantin, Zhong, Ruiqi, Zhou, Giulio, Leike, Jan, Kaplan, Jared, Perez, Ethan

Large language models (LLMs) are vulnerable to universal jailbreaks--prompting strategies that systematically bypass model safeguards and enable users to carry out harmful processes that require many model interactions, like manufacturing illegal substances at scale. To defend against these attacks, we introduce Constitutional Classifiers: safeguards trained on synthetic data, generated by prompting LLMs with natural language rules (i.e., a constitution) specifying permitted and restricted content. In over 3,000 estimated hours of red teaming, no red teamer found a universal jailbreak that could extract information from an early classifier-guarded LLM at a similar level of detail to an unguarded model across most target queries. On automated evaluations, enhanced classifiers demonstrated robust defense against held-out domain-specific jailbreaks. These classifiers also maintain deployment viability, with an absolute 0.38% increase in production-traffic refusals and a 23.7% inference overhead. Our work demonstrates that defending against universal jailbreaks while maintaining practical deployment viability is tractable.

classifier, large language model, machine learning

2501.18837

Genre:

Workflow (1.00)
Questionnaire & Opinion Survey (0.92)
Research Report > New Finding (0.67)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Government > Military (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)

QMe14S, A Comprehensive and Efficient Spectral Dataset for Small Organic Molecules

Yuan, Mingzhi, Zou, Zihan, Hu, Wei

Developing machine learning protocols for molecular simulations requires comprehensive and efficient datasets. Here we introduce the QMe14S dataset, comprising 186,102 small organic molecules featuring 14 elements (H, B, C, N, O, F, Al, Si, P, S, Cl, As, Se, Br) and 47 functional groups. Using density functional theory at the B3LYP/TZVP level, we optimized the geometries and calculated properties including energy, atomic charge, atomic force, dipole moment, quadrupole moment, polarizability, octupole moment, first hyperpolarizability, and Hessian. At the same level, we obtained the harmonic IR, Raman and NMR spectra. Furthermore, we conducted ab initio molecular dynamics simulations to generate dynamic configurations and extract nonequilibrium properties, including energy, forces, and Hessians. By leveraging our E(3)-equivariant message-passing neural network (DetaNet), we demonstrated that models trained on QMe14S outperform those trained on the previously developed QM9S dataset in simulating molecular spectra. The QMe14S dataset thus serves as a comprehensive benchmark for molecular simulations, offering valuable insights into structure-property relationships.

artificial intelligence, machine learning, molecule

2501.18876

Country:

North America > United States > Connecticut > New Haven County > Wallingford (0.04)
Asia > China > Anhui Province > Hefei (0.04)

Genre: Research Report (0.50)

Industry: Materials (0.47)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

Feuer, Benjamin, Hegde, Chinmay

WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training

Language model (LLM) post-training, from DPO to distillation, can refine behaviors and unlock new skills, but the open science supporting these post-training techniques is still in its infancy. One limiting factor has been the difficulty of conducting large-scale comparative analyses of synthetic data generating models and LLM judges. To close this gap, we introduce WILDCHAT-50M, the largest public chat dataset to date. We extend the existing WildChat dataset to include responses not only from GPT, but from over 50 different open-weight models, ranging in size from 0.5B to 104B parameters. We conduct an extensive comparative analysis and demonstrate the potential of this dataset by creating RE-WILD, our own public SFT mix, which outperforms the recent Tulu-3 SFT mixture from Allen AI with only 40% as many samples. Our dataset, samples and code are available at https://github.com/penfever/wildchat-50m.

large language model, machine learning, natural language

2501.18511

Country:

Asia > Russia (0.46)
Asia > Japan (0.46)
Europe > United Kingdom (0.46)

Genre: Research Report > New Finding (0.67)

Industry:

Law > Taxation Law (1.00)
Government > Tax (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

Nowacka, Anna, Schladitz, Katja, Grzesiak, Szymon, Pahn, Matthias

Segmentation of cracks in 3d images of fiber reinforced concrete using deep learning

Cracks in concrete structures are very common and are an integral part of this heterogeneous material. Characteristics of cracks induced by standardized tests yield valuable information about the tested concrete formulation and its mechanical properties. Observing cracks only on the surface of the concrete structure leaves a wealth of structural information unused. Computed tomography enables looking into the sample without interfering with or destroying the microstructure. The reconstructed tomographic images are 3d images, consisting of voxels whose gray values represent local X-ray absorption. In order to identify voxels belonging to the crack, that is, to segment the crack structure in the images, appropriate algorithms need to be developed. Convolutional neural networks are known to solve this type of task very well given sufficient and consistent training data. We adapted a 3d version of the well-known U-Net and trained it on semi-synthetic 3d images of real concrete samples equipped with simulated crack structures. Here, we explain the general approach. Moreover, we show how to teach the network to also detect real crack systems in 3d images of varying types of real concrete, in particular of fiber reinforced concrete.

artificial intelligence, fiber, machine learning

2501.18405

Country:

Europe > Germany > Rhineland-Palatinate > Kaiserslautern (0.05)
Europe > Germany > Berlin (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)

Genre: Research Report (0.40)

Industry:

Materials > Construction Materials (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.69)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence Jan-29-2025

Learning Metal Microstructural Heterogeneity through Spatial Mapping of Diffraction Latent Space Features

Calvat, Mathieu, Bean, Chris, Anjaria, Dhruv, Park, Hyoungryul, Wang, Haoren, Vecchio, Kenneth, Stinville, J. C.

To leverage advancements in machine learning for metallic materials design and property prediction, it is crucial to develop a data-reduced representation of metal microstructures that surpasses the limitations of current physics-based discrete microstructure descriptors. This need is particularly relevant for metallic materials processed through additive manufacturing, which exhibit complex hierarchical microstructures that cannot be adequately described using the conventional metrics typically applied to wrought materials. Furthermore, capturing the spatial heterogeneity of microstructures at different scales is necessary within such a framework to accurately predict their properties. To address these challenges, we propose the physical spatial mapping of metal diffraction latent space features. This approach integrates (i) point diffraction data encoding via variational autoencoders or contrastive learning and (ii) the physical mapping of the encoded values. Together, these steps offer a novel means to comprehensively describe metal microstructures. We demonstrate this approach on a wrought and additively manufactured alloy, showing that it effectively encodes microstructural information and enables direct identification of microstructural heterogeneity not directly possible by physics-based models. This data-reduced microstructure representation opens the application of machine learning models in accelerating metallic material design and accurately predicting their properties.

artificial intelligence, kikuchi pattern, machine learning

2501.18064

Country:

North America > United States > Illinois > Champaign County > Urbana (0.14)
North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > San Diego County > La Jolla (0.04)
Asia > Middle East > Iran > Tehran Province > Tehran (0.04)

Genre: Research Report (0.82)

Industry: Materials (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Pak, Peter, Farimani, Amir Barati

AdditiveLLM: Large Language Models Predict Defects in Additive Manufacturing

arXiv.org Artificial Intelligence Jan-29-2025

In this work we investigate the ability of large language models to predict additive manufacturing defect regimes given a set of process parameter inputs. For this task we utilize a process parameter defect dataset to fine-tune a collection of models, titled AdditiveLLM, for the purpose of predicting potential defect regimes including Keyholing, Lack of Fusion, and Balling. We compare different methods of input formatting in order to gauge the model's ability to correctly predict defect regimes on our sparse Baseline dataset and our natural language Prompt dataset. The model displays robust predictive capability, achieving an accuracy of 93% when asked to provide the defect regimes associated with a set of process parameters. The incorporation of natural language input further simplifies the task of process parameter selection, enabling users to identify optimal settings specific to their build.

large language model, machine learning, natural language

2501.17784

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > Texas > Travis County > Austin (0.04)
Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)

Genre: Research Report (0.53)

Industry:

Machinery > Industrial Machinery (0.73)
Materials (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)