Materials
Blast Hole Seeking and Dipping -- The Navigation and Perception Framework in a Mine Site Inspection Robot
Liu, Liyang, Mihankhah, Ehsan, Wallace, Nathan, Martinez, Javier, Hill, Andrew J.
In open-pit mining, holes are drilled into the surface of the excavation site and detonated with explosives to facilitate digging. These blast holes need to be inspected internally to assess subsurface material types and drill quality, in order to significantly reduce downstream material handling costs. Manual hole inspection is slow and expensive, and limited in its ability to capture the geometric and geological characteristics of holes. This motivated the development of our autonomous mine-site inspection robot, "DIPPeR". This paper explains the automation aspect of the project. We present a robust perception and navigation framework that provides streamlined blasthole seeking, tracking and accurate down-hole sensor positioning. To address challenges in the surface mining environment, where GPS and odometry data are noisy without RTK correction, we adopt a proximity-based adaptive navigation approach, enabling the vehicle to dynamically adjust its operations based on detected target availability and localisation accuracy. For perception, we process LiDAR data to extract the cone-shaped volume of drill-waste above ground, then project the 3D cone points into a virtual depth image to form accurate 2D segmentation of hole regions. To ensure continuous target-tracking as the robot approaches the goal, our system automatically adjusts projection parameters to preserve consistent hole image appearance. In the vicinity of the hole, we apply least squares circle fitting with non-maximum candidate suppression to achieve accurate hole detection and collision-free down-hole sensor placement. We demonstrate the effectiveness of our navigation and perception system in both high-fidelity simulation environments and on-site field trials. A demonstration video is available at https://www.youtube.com/watch?v=fRNbcBcaSqE.
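The abstract names least squares circle fitting with non-maximum candidate suppression but gives no formulation. The following is a minimal sketch, assuming an algebraic (Kasa-style) fit on 2D rim points and a greedy suppression by centre distance. The function names, the candidate tuple layout `(score, cx, cy, r)` and the `min_separation` parameter are illustrative assumptions, not the DIPPeR implementation.

```python
import math


def fit_circle(points):
    """Algebraic least-squares circle fit on (x, y) points; returns (cx, cy, r)."""
    n = len(points)
    if n < 3:
        raise ValueError("need at least three points")
    # Work in centred coordinates so the normal equations reduce to a 2x2 system.
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    suu = svv = suv = suuu = svvv = suvv = svuu = 0.0
    for x, y in points:
        u, v = x - mx, y - my
        suu += u * u
        svv += v * v
        suv += u * v
        suuu += u * u * u
        svvv += v * v * v
        suvv += u * v * v
        svuu += v * u * u
    det = suu * svv - suv * suv
    if abs(det) < 1e-12:
        raise ValueError("points are (nearly) collinear")
    rhs_u = 0.5 * (suuu + suvv)
    rhs_v = 0.5 * (svvv + svuu)
    # Cramer's rule for the centre offset (uc, vc) from the centroid.
    uc = (rhs_u * svv - rhs_v * suv) / det
    vc = (suu * rhs_v - suv * rhs_u) / det
    r = math.sqrt(uc * uc + vc * vc + (suu + svv) / n)
    return mx + uc, my + vc, r


def suppress_non_maximum(candidates, min_separation):
    """Greedy suppression: keep high-score circles, drop nearby lower-score ones.

    Each candidate is a hypothetical tuple (score, cx, cy, r).
    """
    kept = []
    for cand in sorted(candidates, key=lambda c: c[0], reverse=True):
        far_enough = all(
            math.hypot(cand[1] - k[1], cand[2] - k[2]) >= min_separation
            for k in kept
        )
        if far_enough:
            kept.append(cand)
    return kept
```

In practice the score could be derived from the fit residual or the number of supporting points; the abstract does not say which criterion is used.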
Auto-Regressive U-Net for Full-Field Prediction of Shrinkage-Induced Damage in Concrete
Gaynutdinova, Liya, Havlásek, Petr, Rokoš, Ondřej, Hendriks, Fleur, Doškář, Martin
This paper introduces a deep learning approach for predicting time-dependent full-field damage in concrete. The study uses an auto-regressive U-Net model to predict the evolution of the scalar damage field in a unit cell given microstructural geometry and evolution of an imposed shrinkage profile. By sequentially using the predicted damage output as input for subsequent predictions, the model facilitates the continuous assessment of damage progression. Complementarily, a convolutional neural network (CNN) utilises the damage estimations to forecast key mechanical properties, including observed shrinkage and residual stiffness. The proposed dual-network architecture demonstrates high computational efficiency and robust predictive performance on the synthesised datasets. The approach reduces the computational load traditionally associated with full-field damage evaluations and is used to gain insights into the relationship between aggregate properties, such as shape, size, and distribution, and the effective shrinkage and reduction in stiffness. Ultimately, this can help to optimize concrete mix designs, leading to improved durability and reduced internal damage.
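The auto-regressive use of the model described above can be illustrated with a short rollout loop. This is only a sketch: `step_model` stands in for the trained U-Net, and its signature `(geometry, damage, shrinkage)` as well as the toy model below are assumptions made for illustration.

```python
def rollout(step_model, geometry, initial_damage, shrinkage_profile):
    """Feed each predicted damage field back in as the input of the next step."""
    damage = initial_damage
    history = []
    for shrinkage in shrinkage_profile:
        # The previous prediction becomes part of the next input.
        damage = step_model(geometry, damage, shrinkage)
        history.append(damage)
    return history


def toy_step(geometry, damage, shrinkage):
    """Hypothetical stand-in for the network: damage grows where geometry is 1."""
    return [min(1.0, d + 0.1 * shrinkage * g) for d, g in zip(damage, geometry)]


fields = rollout(toy_step, [1.0, 0.0], [0.0, 0.0], [1.0, 2.0, 3.0])
```

Because every step consumes the previous output, prediction errors can accumulate over long rollouts, which is the usual caveat of this scheme.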
Grocery to General Merchandise: A Cross-Pollination Recommender using LLMs and Real-Time Cart Context
Kekuda, Akshay, Dandu, Murali Mohana Krishna, Lahiri, Rimita, Cai, Shiqin, Subramaniam, Sinduja, Korpeoglu, Evren, Achan, Kannan
Modern e-commerce platforms strive to enhance customer experience by providing timely and contextually relevant recommendations. However, recommending general merchandise to customers focused on grocery shopping -- such as pairing milk with a milk frother -- remains a critical yet under-explored challenge. This paper introduces a cross-pollination (XP) framework, a novel approach that bridges grocery and general merchandise cross-category recommendations by leveraging multi-source product associations and real-time cart context. Our solution employs a two-stage framework: (1) a candidate generation mechanism that uses co-purchase market basket analysis and an LLM-based approach to identify novel item-item associations; and (2) a transformer-based ranker that leverages the real-time sequential cart context and optimizes for engagement signals such as add-to-carts. Offline analysis and online A/B tests show a 36% increase in add-to-cart rate with LLM-based retrieval on the item page, and a 15% lift in add-to-cart using the cart context-based ranker on the cart page. Our work contributes practical techniques for cross-category recommendations and broader insights for e-commerce systems.
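The co-purchase market basket analysis in stage (1) is not specified further. One common association score is lift, sketched below on toy baskets. The choice of lift, the function name and the sample data are assumptions, not the paper's method.

```python
from collections import Counter
from itertools import combinations


def co_purchase_lift(baskets):
    """Return {(item_a, item_b): lift} for every pair bought together at least once.

    lift = P(a, b) / (P(a) * P(b)); values above 1 suggest a positive association.
    """
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # de-duplicate and fix the pair ordering
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    return {
        (a, b): (n_ab * n) / (item_counts[a] * item_counts[b])
        for (a, b), n_ab in pair_counts.items()
    }


baskets = [
    ["milk", "milk frother"],
    ["milk", "bread"],
    ["milk", "milk frother", "bread"],
    ["bread"],
]
lift = co_purchase_lift(baskets)
```

For a cross-category recommender, the pairs would additionally be filtered so that one item is grocery and the other general merchandise.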
From What to Why: A Multi-Agent System for Evidence-based Chemical Reaction Condition Reasoning
Yang, Cheng, Lu, Jiaxuan, Wan, Haiyuan, Yu, Junchi, Qin, Feiwei
Reaction condition recommendation is the task of selecting suitable condition parameters for chemical reactions, which is pivotal to accelerating chemical science. With the rapid development of large language models (LLMs), there is growing interest in leveraging their reasoning and planning capabilities for reaction condition recommendation. Despite their success, existing methods rarely explain the rationale behind the recommended reaction conditions, limiting their utility in high-stakes scientific workflows. In this work, we propose ChemMAS, a multi-agent system that reframes condition prediction as an evidence-based reasoning task. ChemMAS decomposes the task into mechanistic grounding, multi-channel recall, constraint-aware agentic debate, and rationale aggregation. Each decision is backed by interpretable justifications grounded in chemical knowledge and retrieved precedents. Experiments show that ChemMAS achieves 20-35% gains over domain-specific baselines and outperforms general-purpose LLMs by 10-15% in Top-1 accuracy, while offering falsifiable, human-trustable rationales, which establishes a new paradigm for explainable AI in scientific discovery.
Timber: Training-free Instruct Model Refining with Base via Effective Rank
Wu, Taiqiang, Yang, Runming, Liu, Tao, Wang, Jiahao, Xu, Zenan, Wong, Ngai
Post-training, which turns a pretrained Base model into the corresponding Instruct model, is widely considered to be superficial. In this work, we first reinforce this hypothesis by providing novel quantitative evidence from the weight level that the effective rank (eRank) remains negligibly changed. However, this superficiality also entails a critical trade-off: it improves exploitation capabilities at the cost of limiting exploration. To tackle this issue, we propose Timber, a simple yet effective training-free method that enhances the exploration capability of the Instruct model while preserving its exploitation. The key insight is to partially revert Instruct towards the paired Base model by subtle yet targeted refinement of the weight deltas. Extensive experiments on Llama and Qwen series demonstrate that Timber consistently improves vanilla Instruct models, particularly on Pass@k performance. Our findings offer new insights into the post-training stage at the weight level and practical strategies to refine the Instruct model without training.
Large Language Models (LLMs), such as Qwen3 (Yang et al., 2025), Llama 3 (Grattafiori et al., 2024), and Deepseek R1 (Guo et al., 2025), have achieved remarkable success in Natural Language Processing (NLP), especially in reasoning tasks (Huang & Chang, 2022). To train these LLMs, a Base model is first pretrained on huge amounts of data. After that, a post-training stage is applied to train an Instruct model, adopting supervised finetuning (SFT) and reinforcement learning (RL) to elicit alignment and reasoning ability (Yang et al., 2025). The post-training stage tends to be superficial, i.e., post-training only utilizes the patterns contained in the Base model acquired during pre-training (Yue et al., 2025; Zhou et al., 2023a; Ye et al., 2025; Muennighoff et al., 2025).
In this paper, we investigate the Base and Instruct models through the lens of effective rank (eRank; Roy & Vetterli, 2007), providing a novel weight-level perspective on the superficiality of post-training. As shown in Figure 1, the eRanks of corresponding linear layers from the Base and Instruct models are almost identical. That is, post-training induces only negligible changes to the effective dimensionality, offering new supporting evidence from the weight level for its superficiality.
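For reference, the effective rank of Roy & Vetterli (2007) is the exponential of the Shannon entropy of the singular values after normalising them to sum to one. The sketch below assumes the singular values of a weight matrix are already available, for example from `numpy.linalg.svd(W, compute_uv=False)`. The function name is illustrative.

```python
import math


def effective_rank(singular_values):
    """exp(Shannon entropy) of the L1-normalised singular value distribution."""
    total = sum(singular_values)
    if total <= 0:
        raise ValueError("need at least one positive singular value")
    entropy = 0.0
    for s in singular_values:
        p = s / total
        if p > 0:  # zero singular values contribute nothing (0 * log 0 := 0)
            entropy -= p * math.log(p)
    return math.exp(entropy)
```

A matrix with k equal non-zero singular values has effective rank k, while a spectrum dominated by one value gives a result close to 1, so comparing this quantity layer by layer between Base and Instruct weights is one way to reproduce the kind of comparison described above.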
A2D: Any-Order, Any-Step Safety Alignment for Diffusion Language Models
Jeung, Wonje, Yoon, Sangyeon, Cho, Yoonjun, Jeon, Dongjae, Shin, Sangwoo, Hong, Hyesoo, No, Albert
Diffusion large language models (dLLMs) enable any-order generation, but this flexibility enlarges the attack surface: harmful spans may appear at arbitrary positions, and template-based prefilling attacks such as DIJA bypass response-level refusals. We introduce A2D (Any-Order, Any-Step Defense), a token-level alignment method that aligns dLLMs to emit an [EOS] refusal signal whenever harmful content arises. By aligning safety directly at the token-level under randomized masking, A2D achieves robustness to both any-decoding-order and any-step prefilling attacks under various conditions. It also enables real-time monitoring: dLLMs may begin a response but automatically terminate if unsafe continuation emerges. On safety benchmarks, A2D consistently prevents the generation of harmful outputs, slashing DIJA success rates from over 80% to near-zero (1.3% on LLaDA-8B-Instruct, 0.0% on Dream-v0-Instruct-7B), and thresholded [EOS] probabilities allow early rejection, yielding up to 19.3x faster safe termination.
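The early rejection via thresholded [EOS] probabilities can be pictured as a simple monitor over decoding steps. This is a sketch under assumptions: the abstract does not state how per-position probabilities are aggregated, so taking the maximum over positions, the 0.5 threshold and the function name are all hypothetical.

```python
def first_unsafe_step(eos_probs_per_step, threshold=0.5):
    """Return the index of the first step whose largest [EOS] probability
    reaches the threshold, or None if no step does.

    eos_probs_per_step: one list per denoising step, holding the [EOS]
    probability at each (masked) position.
    """
    for step, probs in enumerate(eos_probs_per_step):
        if probs and max(probs) >= threshold:
            return step
    return None


trace = [[0.01, 0.02], [0.05, 0.70], [0.90, 0.90]]
stop_at = first_unsafe_step(trace)
```

Stopping at the first flagged step instead of running every denoising step is what would produce a speed-up of the kind reported for safe termination.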
How to Make Large Language Models Generate 100% Valid Molecules?
Tao, Wen, Tang, Jing, Chan, Alvin, Hooi, Bryan, Bi, Baolong, Peng, Nanyun, Liu, Yuansheng, Wang, Yiwei
Molecule generation is key to drug discovery and materials science, enabling the design of novel compounds with specific properties. Large language models (LLMs) can learn to perform a wide range of tasks from just a few examples. However, generating valid molecules using representations like SMILES is challenging for LLMs in few-shot settings. In this work, we explore how LLMs can generate 100% valid molecules. We evaluate whether LLMs can use SELFIES, a representation where every string corresponds to a valid molecule, for valid molecule generation but find that LLMs perform worse with SELFIES than with SMILES. We then examine LLMs' ability to correct invalid SMILES and find their capacity limited. Finally, we introduce SmiSelf, a cross-chemical language framework for invalid SMILES correction. SmiSelf converts invalid SMILES to SELFIES using grammatical rules, leveraging SELFIES' mechanisms to correct the invalid SMILES. Experiments show that SmiSelf ensures 100% validity while preserving molecular characteristics and maintaining or even enhancing performance on other metrics. SmiSelf helps expand LLMs' practical applications in biomedicine and is compatible with all SMILES-based generative models. Code is available at https://github.com/wentao228/SmiSelf.
A Comparison of Surrogate Constitutive Models for Viscoplastic Creep Simulation of HT-9 Steel
Robbe, Pieterjan, Ruybalid, Andre, Hegde, Arun, Bonneville, Christophe, Najm, Habib N, Capolungo, Laurent, Safta, Cosmin
Mechanistic microstructure-informed constitutive models for the mechanical response of polycrystals are a cornerstone of computational materials science. However, as these models become increasingly complex - often involving coupled differential equations describing the effect of specific deformation modes - their associated computational costs can become prohibitive, particularly in optimization or uncertainty quantification tasks that require numerous model evaluations. To address this challenge, surrogate constitutive models that balance accuracy and computational efficiency are highly desirable. Data-driven surrogate models, which learn the constitutive relation directly from data, have emerged as a promising solution. In this work, we develop two local surrogate models for the viscoplastic response of a steel: a piecewise response surface method and a mixture of experts model. These surrogates are designed to adapt to complex material behavior, which may vary with material parameters or operating conditions. The surrogate constitutive models are applied to creep simulations of HT-9 steel, an alloy of considerable interest to the nuclear energy sector due to its high tolerance to radiation damage, using training data generated from viscoplastic self-consistent (VPSC) simulations. We define a set of test metrics to numerically assess the accuracy of our surrogate models for predicting viscoplastic material behavior, and show that the mixture of experts model outperforms the piecewise response surface method in terms of accuracy.
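A mixture of experts model combines several local predictors through input-dependent gating weights. The following is a generic sketch with a softmax gate and scalar inputs. The experts, the gate and all names are illustrative assumptions and do not reproduce the surrogate trained on VPSC data.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_of_experts(x, experts, gate_scores):
    """Blend expert predictions with softmax gating weights computed from x."""
    weights = softmax(gate_scores(x))
    return sum(w * expert(x) for w, expert in zip(weights, experts))


# Two hypothetical local experts and a gate that weights them equally.
experts = [lambda x: 2.0 * x, lambda x: x + 1.0]
blended = mixture_of_experts(3.0, experts, lambda x: [0.0, 0.0])
```

A piecewise response surface corresponds to the limiting case where the gate puts all weight on a single expert per region, whereas the softmax gate allows smooth blending between regions.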
ReflAct: World-Grounded Decision Making in LLM Agents via Goal-State Reflection
Kim, Jeonghye, Rhee, Sojeong, Kim, Minbeom, Kim, Dohyung, Lee, Sangmook, Sung, Youngchul, Jung, Kyomin
Recent advances in LLM agents have largely built on reasoning backbones like ReAct, which interleave thought and action in complex environments. However, ReAct often produces ungrounded or incoherent reasoning steps, leading to misalignment between the agent's actual state and goal. Our analysis finds that this stems from ReAct's inability to maintain consistent internal beliefs and goal alignment, causing compounding errors and hallucinations. To address this, we introduce ReflAct, a novel backbone that shifts reasoning from merely planning next actions to continuously reflecting on the agent's state relative to its goal. By explicitly grounding decisions in states and enforcing ongoing goal alignment, ReflAct dramatically improves strategic reliability. This design delivers substantial empirical gains: ReflAct surpasses ReAct by 27.7% on average, achieving a 93.3% success rate in ALFWorld. Notably, ReflAct even outperforms ReAct with added enhancement modules (e.g., Reflexion, WKM), showing that strengthening the core reasoning backbone is key to reliable agent performance.