 Materials


Why 23andMe's Genetic Data Could Be a 'Gold Mine' for AI Companies

TIME - Tech

But any AI-related company attempting to acquire 23andMe would run significant reputational risks. Many people are horrified by the thought that they surrendered their genetic data to trace their ancestry, only for it to now be potentially used in ways they never consented to. "Anybody touching this data is running a risk," Kumar, who is the director of Fox's Center for Business Analytics and Disruptive Technologies, says. "But at the same time, not touching it, they might be losing on something big as well."


Decoupled Dynamics Framework with Neural Fields for 3D Spatio-temporal Prediction of Vehicle Collisions

arXiv.org Artificial Intelligence

This study proposes a neural framework that predicts 3D vehicle collision dynamics by independently modeling global rigid-body motion and local structural deformation. Unlike approaches directly predicting absolute displacement, this method explicitly separates the vehicle's overall translation and rotation from its structural deformation. Two specialized networks form the core of the framework: a quaternion-based Rigid Net for rigid motion and a coordinate-based Deformation Net for local deformation. By independently handling fundamentally distinct physical phenomena, the proposed architecture achieves accurate predictions without requiring separate supervision for each component. The model, trained on only 10% of available simulation data, significantly outperforms baseline models, including single multi-layer perceptron (MLP) and deep operator networks (DeepONet), with prediction errors reduced by up to 83%. Extensive validation demonstrates strong generalization to collision conditions outside the training range, accurately predicting responses even under severe impacts involving extreme velocities and large impact angles. Furthermore, the framework successfully reconstructs high-resolution deformation details from low-resolution inputs without increased computational effort. Consequently, the proposed approach provides an effective, computationally efficient method for rapid and reliable assessment of vehicle safety across complex collision scenarios, substantially reducing the required simulation data and time while preserving prediction fidelity.
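
The decoupling described above can be illustrated with a minimal sketch. It assumes the final position of a material point is the rigid transform of its reference coordinate plus a local deformation vector added in the world frame; that composition, the function names, and the (w, x, y, z) quaternion convention are assumptions for illustration, not details taken from the paper. In the framework, the quaternion and translation would come from the Rigid Net and the deformation from the Deformation Net; here they are plain inputs.

```python
import math


def quat_rotate(q, v):
    """Rotate 3-vector v by quaternion q = (w, x, y, z), normalizing q first."""
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n
    # Standard rotation matrix of a unit quaternion.
    r = [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]
    return [sum(r[i][j] * v[j] for j in range(3)) for i in range(3)]


def predict_position(x0, q, t, d):
    """Compose global rigid motion (q, t) with local deformation d.

    x0: reference coordinate of the point (hypothetical input)
    q:  rigid rotation as a quaternion, t: rigid translation
    d:  local deformation, assumed here to be expressed in the world frame
    """
    rotated = quat_rotate(q, x0)
    return [rotated[i] + t[i] + d[i] for i in range(3)]


# Usage: a 90-degree rotation about z, a unit shift in x, a small dent in z.
half = math.pi / 4
pos = predict_position(
    [1.0, 0.0, 0.0],
    (math.cos(half), 0.0, 0.0, math.sin(half)),
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 0.1],
)
```

Keeping the rotation as a quaternion means a network output only needs normalizing to be a valid rotation, which is one plausible reason for that choice.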


The Coralscapes Dataset: Semantic Scene Understanding in Coral Reefs

arXiv.org Artificial Intelligence

Coral reefs are declining worldwide due to climate change and local stressors. To inform effective conservation or restoration, monitoring at the highest possible spatial and temporal resolution is necessary. Conventional coral reef surveying methods are limited in scalability due to their reliance on expert labor time, motivating the use of computer vision tools to automate the identification and abundance estimation of live corals from images. However, the design and evaluation of such tools have been impeded by the lack of large, high-quality datasets. We release the Coralscapes dataset, the first general-purpose dense semantic segmentation dataset for coral reefs, covering 2075 images, 39 benthic classes, and 174k segmentation masks annotated by experts. Coralscapes has a similar scope and the same structure as the widely used Cityscapes dataset for urban scene segmentation, allowing benchmarking of semantic segmentation models in a new challenging domain which requires expert knowledge to annotate. We benchmark a wide range of semantic segmentation models, and find that transfer learning from Coralscapes to existing smaller datasets consistently leads to state-of-the-art performance. Coralscapes will catalyze research on efficient, scalable, and standardized coral reef surveying methods based on computer vision, and holds the potential to streamline the development of underwater ecological robotics.


Mining-Gym: A Configurable RL Benchmarking Environment for Truck Dispatch Scheduling

arXiv.org Artificial Intelligence

Mining process optimization, particularly truck dispatch scheduling, is a critical factor in enhancing the efficiency of open-pit mining operations. However, the dynamic and stochastic nature of mining environments, characterized by uncertainties such as equipment failures, truck maintenance, and variable haul cycle times, poses significant challenges for traditional optimization methods. While Reinforcement Learning (RL) has demonstrated promise in adaptive decision-making for mining logistics, its practical deployment requires rigorous evaluation in realistic and customizable simulation environments. To address this challenge, we introduce Mining-Gym, a configurable, open-source benchmarking environment designed for training, testing, and comparing RL algorithms in mining process optimization. Built on Discrete Event Simulation (DES) and seamlessly integrated with the OpenAI Gym interface, Mining-Gym offers a structured testbed that enables the direct application of advanced RL algorithms from Stable Baselines. The framework models key mining-specific uncertainties, such as equipment failures, queue congestion, and stochasticity of mining processes, ensuring a realistic and adaptive learning environment. Additionally, a graphical user interface (GUI) for mine-site parameter configuration, a comprehensive data logging system, a built-in KPI dashboard, and a real-time representative visualization of the mine site enable in-depth performance analysis, facilitating standardized, reproducible evaluation across multiple RL strategies and baseline heuristics.
Mining process optimization aims to enhance efficiency and productivity by improving resource allocation, equipment scheduling, and material handling. However, these operations are highly complex, influenced by dynamic factors such as equipment failures, fluctuating ore quality, and unpredictable environmental conditions. Traditional optimization methods, such as linear programming and heuristics, struggle to adapt in real time, leading to inefficiencies and increased costs.


Equivariant Machine Learning Interatomic Potentials with Global Charge Redistribution

arXiv.org Artificial Intelligence

Machine learning interatomic potentials (MLIPs) provide a computationally efficient alternative to quantum mechanical simulations for predicting material properties. Message-passing graph neural networks, commonly used in these MLIPs, rely on local descriptor-based symmetry functions to model atomic interactions. However, such local descriptor-based approaches struggle with systems exhibiting long-range interactions, charge transfer, and compositional heterogeneity. In this work, we develop a new equivariant MLIP incorporating long-range Coulomb interactions through explicit treatment of electronic degrees of freedom, specifically global charge distribution within the system. This is achieved using a charge equilibration scheme based on predicted atomic electronegativities. We systematically evaluate our model across a range of benchmark periodic and non-periodic datasets, demonstrating that it outperforms both short-range equivariant and long-range invariant MLIPs in energy and force predictions. Our approach enables more accurate and efficient simulations of systems with long-range interactions and charge heterogeneity, expanding the applicability of MLIPs in computational materials science.
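
For readers unfamiliar with charge equilibration, the following is a minimal sketch of the classical idea, not the paper's implementation. It assumes the energy E(q) = sum_i chi_i q_i + (1/2) sum_i J_i q_i^2 + sum_{i<j} q_i q_j / r_ij in atomic units, minimized subject to a fixed total charge, which leads to a small linear system with one Lagrange multiplier. The hardness values J_i, the bare 1/r Coulomb kernel, and all function names are illustrative assumptions; practical schemes typically use smeared charges and Ewald summation for periodic systems.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def equilibrate_charges(chi, hardness, positions, total_charge=0.0):
    """Return charges minimizing the assumed energy at fixed total charge.

    chi:       per-atom electronegativities (in an MLIP these would be predicted)
    hardness:  per-atom hardness J_i (hypothetical inputs here)
    positions: per-atom 3D coordinates
    """
    n = len(chi)
    a = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(n):
            if i == j:
                a[i][j] = hardness[i]
            else:
                a[i][j] = 1.0 / math.dist(positions[i], positions[j])
        # Last column and row enforce sum(q) = total_charge via a multiplier.
        a[i][n] = 1.0
        a[n][i] = 1.0
    b = [-c for c in chi] + [total_charge]
    return solve(a, b)[:n]


# Usage: two atoms 2 bohr apart; the more electronegative atom gains electrons.
q = equilibrate_charges([0.5, -0.5], [1.0, 1.0], [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
```

Because every charge appears in every equation, a change in one atom's electronegativity shifts charge across the whole system, which is the global redistribution that purely local descriptors cannot capture.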


Staying Alive: Online Neural Network Maintenance and Systemic Drift

arXiv.org Artificial Intelligence

We present the Subset Extended Kalman Filter (SEKF) as a method to update previously trained model weights online rather than retraining or finetuning them when the system a model represents drifts away from the conditions under which it was trained. We identify the parameters to be updated using the gradient of the loss function and use the SEKF to update only these parameters. We compare finetuning and SEKF for online model maintenance in the presence of systemic drift through four dynamic regression case studies and find that the SEKF is able to maintain model accuracy as well as, if not better than, finetuning while requiring significantly less time per iteration and less hyperparameter tuning.
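
The idea of updating only a gradient-selected subset of parameters with a Kalman-style step can be sketched as follows. This is an illustrative toy, not the authors' algorithm: it assumes a scalar-output linear model (so the Jacobian with respect to the weights is just the input), a top-k-by-gradient-magnitude selection rule, and a fixed measurement noise `r`. All of these, and the name `sekf_step`, are assumptions.

```python
def sekf_step(w, P, x, y, k, r=1e-2):
    """One EKF-style update restricted to the k weights with largest |gradient|.

    w: weights, P: weight covariance (list of lists), x: input, y: target.
    Returns updated copies of w and P plus the selected indices.
    """
    w = list(w)
    P = [row[:] for row in P]
    y_hat = sum(wi * xi for wi, xi in zip(w, x))
    err = y - y_hat
    # Gradient of the loss 0.5 * err**2 with respect to each weight.
    grad = [-err * xi for xi in x]
    idx = sorted(range(len(w)), key=lambda i: abs(grad[i]), reverse=True)[:k]
    # Jacobian of the model output w.r.t. the selected weights.
    h = [x[i] for i in idx]
    # P_sub @ h, innovation variance s, and Kalman gain on the subset.
    Ph = [sum(P[i][j] * h[b] for b, j in enumerate(idx)) for i in idx]
    s = sum(ha * pa for ha, pa in zip(h, Ph)) + r
    gain = [pa / s for pa in Ph]
    for a, i in enumerate(idx):
        w[i] += gain[a] * err
        for b, j in enumerate(idx):
            # Covariance update P <- P - K H P on the selected block only.
            P[i][j] -= gain[a] * Ph[b]
    return w, P, idx


# Usage: one observation arrives after drift; only two of three weights move.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
w_new, P_new, chosen = sekf_step([0.0, 0.0, 0.0], identity, [1.0, 2.0, 0.1], 3.0, k=2)
```

Restricting the covariance to the selected block keeps the per-step cost tied to the subset size rather than the full parameter count, which is consistent with the abstract's claim of less time per iteration.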


Building Resource-Constrained Language Agents: A Korean Case Study on Chemical Toxicity Information

arXiv.org Artificial Intelligence

Language agents powered by large language models (LLMs) face significant deployment challenges in resource-constrained environments, particularly for specialized domains and less-common languages. This paper presents Tox-chat, a Korean chemical toxicity information agent devised within these limitations. We propose two key innovations: a context-efficient architecture that reduces token consumption through hierarchical section search, and a scenario-based dialogue generation methodology that effectively distills tool-using capabilities from larger models. Experimental evaluations demonstrate that our fine-tuned 8B parameter model substantially outperforms both untuned models and baseline approaches in terms of DB faithfulness and preference. Our work offers valuable insights for researchers developing domain-specific language agents under practical constraints.


Predicting performance-related properties of refrigerant based on tailored small-molecule functional group contribution

arXiv.org Artificial Intelligence

As current group contribution (GC) methods are mostly proposed for a wide size-range of molecules, applying them to property prediction of small refrigerant molecules could lead to unacceptable errors. In this sense, for the design of novel refrigerants and refrigeration systems, tailoring GC-based models specifically fitted to refrigerant molecules is of great interest. In this work, databases of potential refrigerant molecules are first collected, focusing on five key properties related to the operational efficiency of refrigeration systems, namely normal boiling point, critical temperature, critical pressure, enthalpy of vaporization, and acentric factor. Based on tailored small-molecule groups, the GC method is combined with machine learning (ML) to model these performance-related properties. Following the development of GC-ML models, their performance is analyzed to highlight the potential group-to-property contributions. Additionally, the refrigerant property databases are extended internally and externally, based on which examples are presented to highlight the significance of the developed models.


Accelerating and enhancing thermodynamic simulations of electrochemical interfaces

arXiv.org Artificial Intelligence

Electrochemical interfaces are crucial in catalysis, energy storage, and corrosion, where their stability and reactivity depend on complex interactions between the electrode, adsorbates, and electrolyte. Predicting stable surface structures remains challenging, as traditional surface Pourbaix diagrams tend to either rely on expert knowledge or costly $\textit{ab initio}$ sampling, and neglect thermodynamic equilibration with the environment. Machine learning (ML) potentials can accelerate static modeling but often overlook dynamic surface transformations. Here, we extend the Virtual Surface Site Relaxation-Monte Carlo (VSSR-MC) method to autonomously sample surface reconstructions modeled under aqueous electrochemical conditions. Through fine-tuning foundational ML force fields, we accurately and efficiently predict surface energetics, recovering known Pt(111) phases and revealing new LaMnO$_\mathrm{3}$(001) surface reconstructions. By explicitly accounting for bulk-electrolyte equilibria, our framework enhances electrochemical stability predictions, offering a scalable approach to understanding and designing materials for electrochemical applications.


Feature Selection Based on Reinforcement Learning and Hazard State Classification for Magnetic Adhesion Wall-Climbing Robots

arXiv.org Artificial Intelligence

Abstract: Magnetic adhesion tracked wall-climbing robots face potential risks of overturning during high-altitude operations, making their stability crucial for ensuring safety. This study presents a dynamic feature selection method based on Proximal Policy Optimization (PPO) reinforcement learning, combined with typical machine learning models, aimed at improving the classification accuracy of hazardous states under complex operating conditions. Firstly, this work innovatively employs a fiber rod-based MEMS attitude sensor to collect vibration data from the robot and extract high-dimensional feature vectors in both time and frequency domains. Then, a reinforcement learning model is used to dynamically select the optimal feature subset, reducing feature redundancy and enhancing classification accuracy. Finally, a CNN-LSTM deep learning model is employed for classification and recognition. Experimental results demonstrate that the proposed method significantly improves the robot's ability to assess hazardous states across various operational scenarios, providing reliable technical support for robotic safety monitoring.

Keywords: Magnetic Adhesion Wall-Climbing Robot, MEMS Sensor, Hazard State Evaluation, Reinforcement Learning, Feature Selection, Deep Learning

1. Introduction

Magnetic adhesion tracked wall-climbing robots are designed specifically for vertical or inclined surfaces, enabling them to effectively counteract gravity and perform a variety of tasks [1], such as inspection, welding, and cleaning in high-altitude environments [2-5]. These robots have broad application prospects, particularly in dangerous high-altitude operations, where they can significantly improve work efficiency and ensure the safety of operators [6]. However, as the robot moves along the wall, the overturning torque generated by its weight and load may cause it to flip backward, affecting its stability and posing potential safety risks [7].