Goto

Collaborating Authors

 Materials


Formally Exploring Time-Series Anomaly Detection Evaluation Metrics

arXiv.org Artificial Intelligence

Undetected anomalies in time series can trigger catastrophic failures in safety-critical systems, such as chemical plant explosions or power grid outages. Although many detection methods have been proposed, their performance remains unclear because current metrics capture only narrow aspects of the task and often yield misleading results. We address this issue by introducing verifiable properties that formalize essential requirements for evaluating time-series anomaly detection. These properties enable a theoretical framework that supports principled evaluations and reliable comparisons. Analyzing 37 widely used metrics, we show that most satisfy only a few properties, and none satisfy all, explaining persistent inconsistencies in prior results. To close this gap, we propose LARM, a flexible metric that provably satisfies all properties, and extend it to ALARM, an advanced variant meeting stricter requirements.


Augmented Web Usage Mining and User Experience Optimization with CAWAL's Enriched Analytics Data

arXiv.org Artificial Intelligence

Understanding user behavior on the web is increasingly critical for optimizing user experience (UX). This study introduces Augmented Web Usage Mining (AWUM), a methodology designed to enhance web usage mining and improve UX by enriching the interaction data provided by CAWAL (Combined Application Log and Web Analytics), a framework for advanced web analytics. Over 1.2 million session records collected in one month (~8.5GB of data) were processed and transformed into enriched datasets. AWUM analyzes session structures, page requests, service interactions, and exit methods. Results show that 87.16% of sessions involved multiple pages, contributing 98.05% of total pageviews; 40% of users accessed various services and 50% opted for secure exits. Association rule mining revealed patterns of frequently accessed services, highlighting CAWAL's precision and efficiency over conventional methods. AWUM offers a comprehensive understanding of user behavior and strong potential for large-scale UX optimization.


A Standardized Benchmark for Machine-Learned Molecular Dynamics using Weighted Ensemble Sampling

arXiv.org Artificial Intelligence

The rapid evolution of molecular dynamics (MD) methods, including machine-learned dynamics, has outpaced the development of standardized tools for method validation. Objective comparison between simulation approaches is often hindered by inconsistent evaluation metrics, insufficient sampling of rare conformational states, and the absence of reproducible benchmarks. To address these challenges, we introduce a modular benchmarking framework that systematically evaluates protein MD methods using enhanced sampling analysis. Our approach uses weighted ensemble (WE) sampling via The Weighted Ensemble Simulation Toolkit with Parallelization and Analysis (WESTPA), based on progress coordinates derived from Time-lagged Independent Component Analysis (TICA), enabling fast and efficient exploration of protein conformational space. The framework includes a flexible, lightweight propagator interface that supports arbitrary simulation engines, allowing both classical force fields and machine learning-based models. Additionally, the framework offers a comprehensive evaluation suite capable of computing more than 19 different metrics and visualizations across a variety of domains. We further contribute a dataset of nine diverse proteins, ranging from 10 to 224 residues, that span a variety of folding complexities and topologies. Each protein has been extensively simulated at 300K for one million MD steps per starting point (4 ns). To demonstrate the utility of our framework, we perform validation tests using classic MD simulations with implicit solvent and compare protein conformational sampling using a fully trained versus under-trained CGSchNet model. By standardizing evaluation protocols and enabling direct, reproducible comparisons across MD approaches, our open-source platform lays the groundwork for consistent, rigorous benchmarking across the molecular simulation community.


From AI for Science to Agentic Science: A Survey on Autonomous Scientific Discovery

arXiv.org Artificial Intelligence

Artificial intelligence (AI) is reshaping scientific discovery, evolving from specialized computational tools into autonomous research partners. We position Agentic Science as a pivotal stage within the broader AI for Science paradigm, where AI systems progress from partial assistance to full scientific agency. Enabled by large language models (LLMs), multimodal systems, and integrated research platforms, agentic AI shows capabilities in hypothesis generation, experimental design, execution, analysis, and iterative refinement -- behaviors once regarded as uniquely human. This survey provides a domain-oriented review of autonomous scientific discovery across life sciences, chemistry, materials science, and physics. We unify three previously fragmented perspectives -- process-oriented, autonomy-oriented, and mechanism-oriented -- through a comprehensive framework that connects foundational capabilities, core processes, and domain-specific realizations. Building on this framework, we (i) trace the evolution of AI for Science, (ii) identify five core capabilities underpinning scientific agency, (iii) model discovery as a dynamic four-stage workflow, (iv) review applications across the above domains, and (v) synthesize key challenges and future opportunities. This work establishes a domain-oriented synthesis of autonomous scientific discovery and positions Agentic Science as a structured paradigm for advancing AI-driven research.


China's rare earth restrictions aim to beat U.S. at its own game

The Japan Times

China's rare earth restrictions aim to beat U.S. at its own game Magnetic slices made from rare earth metals. Beijing last week announced a sweeping set of rules that are set to restrict the flow of rare earths worldwide. WASHINGTON - Over the past three years, Washington has claimed broad power to impose global rules that bar companies anywhere in the world from sending cutting-edge computer chips or the tools needed to make them to China. U.S. officials have argued that approach is necessary to make sure China does not gain the upper hand in the race for advanced artificial intelligence. But a sweeping set of restrictions announced by Beijing last week showed that two can play that game.


Breaking scaling relations with inverse catalysts: a machine learning exploration of trends in $\mathrm{CO_2}$ hydrogenation energy barriers

arXiv.org Artificial Intelligence

The conversion of $\mathrm{CO_2}$ into useful products such as methanol is a key strategy for abating climate change and our dependence on fossil fuels. Developing new catalysts for this process is costly and time-consuming and can thus benefit from computational exploration of possible active sites. However, this is complicated by the complexity of the materials and reaction networks. Here, we present a workflow for exploring transition states of elementary reaction steps at inverse catalysts, which is based on the training of a neural network-based machine learning interatomic potential. We focus on the crucial formate intermediate and its formation over nanoclusters of indium oxide supported on Cu(111). The speedup compared to an approach purely based on density functional theory allows us to probe a wide variety of active sites found at nanoclusters of different sizes and stoichiometries. Analysis of the obtained set of transition state geometries reveals different structure--activity trends at the edge or interior of the nanoclusters. Furthermore, the identified geometries allow for the breaking of linear scaling relations, which could be a key underlying reason for the excellent catalytic performance of inverse catalysts observed in experiments.


Element2Vec: Build Chemical Element Representation from Text for Property Prediction

arXiv.org Artificial Intelligence

Accurate property data for chemical elements is crucial for materials design and manufacturing, but many of them are difficult to measure directly due to equipment constraints. While traditional methods use the properties of other elements or related properties for prediction via numerical analyses, they often fail to model complex relationships. After all, not all characteristics can be represented as scalars. Recent efforts have been made to explore advanced AI tools such as language models for property estimation, but they still suffer from hallucinations and a lack of interpretability. In this paper, we investigate Element2Vecto effectively represent chemical elements from natural languages to support research in the natural sciences. Given the text parsed from Wikipedia pages, we use language models to generate both a single general-purpose embedding (Global) and a set of attribute-highlighted vectors (Local). Despite the complicated relationship across elements, the computational challenges also exist because of 1) the discrepancy in text distribution between common descriptions and specialized scientific texts, and 2) the extremely limited data, i.e., with only 118 known elements, data for specific properties is often highly sparse and incomplete. Thus, we also design a test-time training method based on self-attention to mitigate the prediction error caused by Vanilla regression clearly. We hope this work could pave the way for advancing AI-driven discovery in materials science.


Learning to Capture Rocks using an Excavator: A Reinforcement Learning Approach with Guiding Reward Formulation

arXiv.org Artificial Intelligence

Rock capturing with standard excavator buckets is a challenging task typically requiring the expertise of skilled operators. Unlike soil digging, it involves manipulating large, irregular rocks in unstructured environments where complex contact interactions with granular material make model-based control impractical. Existing autonomous excavation methods focus mainly on continuous media or rely on specialized grippers, limiting their applicability to real-world construction sites. This paper introduces a fully data-driven control framework for rock capturing that eliminates the need for explicit modeling of rock or soil properties. Robustness is enhanced through extensive domain randomization of rock geometry, density, and mass, as well as the initial configurations of the bucket, rock, and goal position. To the best of our knowledge, this is the first study to develop and evaluate an RL-based controller for the rock capturing task. Experimental results show that the policy generalizes well to unseen rocks and varying soil conditions, achieving high success rates comparable to those of human participants while maintaining machine stability. Corresponding author Email address: amirmasoud.molaei@tuni.fi Keywords: Excavators, Automatic rock capturing, Reinforcement learning, High-fidelity simulation, Guiding Reward Formulation, Non-prehensile manipulation 1. Introduction Autonomous excavation holds a great promise in addressing increasing demands of the mining and construction industries, two of the largest and most essential sectors worldwide. The excavator is one of the most widely used and versatile heavy-duty mobile machines (HDMMs), which is typically operated through a hydraulic system. Excavators are utilized for a wide range of earth-moving tasks, including digging, trenching, grading, and in particular material handling. Despite their versatility, traditional manual operation of excavators can result in low efficiency, increased physical strain on operators, and exposure to hazardous environments like open-pit mines. These challenges underscore the need for automation to enhance safety and productivity. An excavator is primarily composed of three major components, the traveling body, swing body, and the front digging manipulator. The digging manipulator, includes three main parts, boom, arm, and bucket, which are actuated by hydraulic cylinders. Additionally, joints connect the swing body, boom, arm, and bucket, allowing for flexible and precise motion [1, 2, 3, 4].


Design of Paper Robot Building Kits

arXiv.org Artificial Intelligence

Building robots is an engaging activity that provides opportunities for hands-on learning. However, traditional robot-building kits are usually costly with limited functionality due to material and technology constraints. To improve the accessibility and flexibility of such kits, we take paper as the building material and extensively explore the versatility of paper-based interactions. Based on an analysis of current robot-building kits and paper-based interaction research, we propose a design space for devising paper robots. We also analyzed our building kit designs using this design space, where these kits demonstrate the potential of paper as a cost-effective material for robot building. As a starting point, our design space and building kit examples provide a guideline that inspires and informs future research and development of novel paper robot-building kits.


Coder as Editor: Code-driven Interpretable Molecular Optimization

arXiv.org Artificial Intelligence

Molecular optimization is a central task in drug discovery that requires precise structural reasoning and domain knowledge. While large language models (LLMs) have shown promise in generating high-level editing intentions in natural language, they often struggle to faithfully execute these modifications-particularly when operating on non-intuitive representations like SMILES. We introduce MECo, a framework that bridges reasoning and execution by translating editing actions into executable code. MECo reformulates molecular optimization for LLMs as a cascaded framework: generating human-interpretable editing intentions from a molecule and property goal, followed by translating those intentions into executable structural edits via code generation. Our approach achieves over 98% accuracy in reproducing held-out realistic edits derived from chemical reactions and target-specific compound pairs. On downstream optimization benchmarks spanning physicochemical properties and target activities, MECo substantially improves consistency by 38-86 percentage points to 90%+ and achieves higher success rates over SMILES-based baselines while preserving structural similarity. By aligning intention with execution, MECo enables consistent, controllable and interpretable molecular design, laying the foundation for high-fidelity feedback loops and collaborative human-AI workflows in drug discovery.