Materials
The Loss of Control Playbook: Degrees, Dynamics, and Preparedness
Stix, Charlotte, Hallensleben, Annika, Ortega, Alejandro, Pistillo, Matteo
This research report addresses the absence of an actionable definition for Loss of Control (LoC) in AI systems by developing a novel taxonomy and preparedness framework. Despite increasing policy and research attention, existing LoC definitions vary significantly in scope and timeline, hindering effective LoC assessment and mitigation. To address this issue, we draw from an extensive literature review and propose a graded LoC taxonomy, based on the metrics of severity and persistence, that distinguishes between Deviation, Bounded LoC, and Strict LoC. We model pathways toward a societal state of vulnerability in which sufficiently advanced AI systems have acquired or could acquire the means to cause Bounded or Strict LoC once a catalyst, either misalignment or pure malfunction, materializes. We argue that this state becomes increasingly likely over time, absent strategic intervention, and propose a strategy to avoid reaching a state of vulnerability. Rather than focusing solely on intervening on AI capabilities and propensities potentially relevant for LoC or on preventing potential catalysts, we introduce a complementary framework that emphasizes three extrinsic factors: Deployment context, Affordances, and Permissions (the DAP framework). Compared to work on intrinsic factors and catalysts, this framework has the unfair advantage of being actionable today. Finally, we put forward a plan to maintain preparedness and prevent the occurrence of LoC outcomes should a state of societal vulnerability be reached, focusing on governance measures (threat modeling, deployment policies, emergency response) and technical controls (pre-deployment testing, control measures, monitoring) that could maintain a condition of perennial suspension.
A self-driving lab for solution-processed electrochromic thin films
Dahms, Selma, Torresi, Luca, Bandesha, Shahbaz Tareq, Hansmann, Jan, Röhm, Holger, Colsmann, Alexander, Schott, Marco, Friederich, Pascal
Solution-processed electrochromic materials offer high potential for energy-efficient smart windows and displays. Their performance varies with material choice and processing conditions. Electrochromic thin film electrodes require a smooth, defect-free coating for optimal contrast between bleached and colored states. The complexity of optimizing the spin-coated electrochromic thin layer poses challenges for rapid development. This study demonstrates the use of self-driving laboratories to accelerate the development of electrochromic coatings by coupling automation with machine learning. Our system combines automated data acquisition, image processing, spectral analysis, and Bayesian optimization to explore processing parameters efficiently. This approach not only increases throughput but also enables a pointed search for optimal processing parameters. The approach can be applied to various solution-processed materials, highlighting the potential of self-driving labs in enhancing materials discovery and process optimization.
Accelerating Materials Discovery: Learning a Universal Representation of Chemical Processes for Cross-Domain Property Prediction
Tsitsvero, Mikhail, Nakao, Atsuyuki, Ikebata, Hisaki
Experimental validation of chemical processes is slow and costly, limiting exploration in materials discovery. Machine learning can prioritize promising candidates, but existing data in patents and literature is heterogeneous and difficult to use. We introduce a universal directed-tree process-graph representation that unifies unstructured text, molecular structures, and numeric measurements into a single machine-readable format. To learn from this structured data, we developed a multi-modal graph neural network with a property-conditioned attention mechanism. Trained on approximately 700,000 process graphs from nearly 9,000 diverse documents, our model learns semantically rich embeddings that generalize across domains. When fine-tuned on compact, domain-specific datasets, the pretrained model achieves strong performance, demonstrating that universal process representations learned at scale transfer effectively to specialized prediction tasks with minimal additional data.
Swiss startup turns urine into plant fertilizer
The space-inspired wastewater treatment uses the nutrients and loses the odor. When most people need to go number one, they find the nearest bathroom and don't give half a thought to what happens to their pee once it disappears down the toilet or urinal. It turns out that the nitrogen in human urine can be used in fertilizer. However, humanity's use of nitrogen is anything but efficient, according to a pair of siblings who founded the Swiss start-up company VunaNexus.
Machine-learning-enabled interpretation of tribological deformation patterns in large-scale MD data
Ehrich, Hendrik J., May, Marvin C., Eder, Stefan J.
Conventional Data Processing Workflow. Conventional MD analysis, which has been used in previous data evaluation [2, 32, 33] and can serve labeling and validation purposes for ML model construction and preparation, employs a multi-tiered data distillation process to derive robust trends, see Figure 1. In the left column of this figure, we show representative examples of computational tomographs through the 3D MD model, with the atoms colored by (a) grain orientation in electron backscatter diffraction (EBSD) standard, (b) lattice type, grain boundaries, and defects, (c) advection (drift) velocity to visualize shearing, and (d) local stresses. As a first step in the data distillation process, these 3D data, which are stored for each atom, are averaged across the lateral system dimensions, revealing depth-resolved, time-dependent quantities of interest, as visualized in the heat map at the top of the middle column (e). Further elimination of the sample depth and time dimensions leads to time-resolved global quantities (f) and contact-pressure-dependent trends (g), which can be fitted with characteristic pressures that mark the transition between deformation patterns (h). As an outlook to the utility of such highly distilled data, we propose their incorporation into Ashby-style charts, as schematically shown in Figure 1 (i), which link material properties with tribological properties. This conventional approach accommodates the complexities of polycrystalline materials under tribological loading conditions and is guided by the underlying physics, which makes the procedure time-consuming. Thus, substituting this approach with a well-trained ML model is highly relevant. The conventional approach can serve as the ground truth for training this ML model or to refine and validate said model based on newly generated MD data.
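The first two distillation steps described above (lateral averaging to a depth-resolved, time-dependent map, then collapsing depth to a global time series) can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' pipeline: the `(x, y, z, value)` tuple layout, the uniform depth bins, and the function names `depth_profile`, `heat_map`, and `global_series` are all hypothetical.

```python
from collections import defaultdict


def depth_profile(atoms, bin_width):
    """Average a per-atom quantity over the lateral (x, y) dimensions.

    atoms: iterable of (x, y, z, value) tuples for one time frame
           (hypothetical layout, chosen for this sketch).
    Returns {depth_bin_index: mean value}, a depth-resolved profile.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for _x, _y, z, value in atoms:
        b = int(z // bin_width)  # x and y are ignored: that is the lateral averaging
        sums[b] += value
        counts[b] += 1
    return {b: sums[b] / counts[b] for b in sorted(sums)}


def heat_map(frames, bin_width):
    """One depth profile per time frame: rows are time, columns are depth bins."""
    return [depth_profile(atoms, bin_width) for atoms in frames]


def global_series(frames):
    """Collapse depth as well, leaving one atom-averaged value per time frame."""
    return [sum(atom[3] for atom in atoms) / len(atoms) for atoms in frames]


# Two toy time frames with three atoms each (made-up numbers).
frames = [
    [(0.0, 0.0, 0.5, 1.0), (1.0, 2.0, 0.7, 3.0), (0.5, 0.5, 1.5, 5.0)],
    [(0.0, 0.0, 0.5, 2.0), (1.0, 2.0, 1.2, 4.0), (0.5, 0.5, 1.5, 6.0)],
]
print(heat_map(frames, 1.0))   # depth-resolved, time-dependent quantity
print(global_series(frames))   # time-resolved global quantity
```

The pressure-dependent trends and the fit of characteristic pressures would sit on top of these series and are not sketched here.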
Teaching Language Models Mechanistic Explainability Through Arrow-Pushing
Neukomm, Théo A., Jončev, Zlatko, Schwaller, Philippe
Chemical reaction mechanisms provide crucial insight into synthesizability, yet current Computer-Assisted Synthesis Planning (CASP) systems lack mechanistic grounding. We introduce a computational framework for teaching language models to predict chemical reaction mechanisms through the arrow-pushing formalism, a century-old notation that tracks electron flow while respecting conservation laws. We developed MechSMILES, a compact textual format encoding molecular structure and electron flow, and trained language models on four mechanism prediction tasks of increasing complexity using mechanistic reaction datasets, such as mech-USPTO-31k and FlowER. Our models achieve more than 95% top-3 accuracy on elementary step prediction and, on our hardest task, the retrieval of complete reaction mechanisms, scores that surpass 73% on mech-USPTO-31k and 93% on the FlowER dataset. This mechanistic understanding enables three key applications. First, our models serve as post-hoc validators for CASP systems, filtering chemically implausible transformations. Second, they enable holistic atom-to-atom mapping that tracks all atoms, including hydrogens. Third, they extract catalyst-aware reaction templates that distinguish recycled catalysts from spectator species. By grounding predictions in physically meaningful electron moves that ensure conservation of mass and charge, this work provides a pathway toward more explainable and chemically valid computational synthesis planning, while providing an architecture-agnostic framework for the benchmarking of mechanism prediction.
Retro-Expert: Collaborative Reasoning for Interpretable Retrosynthesis
Li, Xinyi, Wang, Sai, Lin, Yutian, Wu, Yu, Yang, Yi
Retrosynthesis prediction aims to infer the reactant molecules from a given product molecule, which is a fundamental task in chemical synthesis. However, existing models rely on a static pattern-matching paradigm, which limits their ability to perform effective logical decision-making and leaves their decisions a black box. To address this, we propose Retro-Expert, an interpretable retrosynthesis framework that performs collaborative reasoning by combining the complementary reasoning strengths of Large Language Models and specialized models via reinforcement learning. It outputs natural language explanations grounded in chemical logic through three components: (1) specialized models analyze the product to construct a high-quality chemical decision space, (2) LLM-driven critical reasoning generates predictions and a corresponding interpretable reasoning path, and (3) reinforcement learning optimizes the interpretable decision policy. Experiments show that Retro-Expert not only surpasses both LLM-based and specialized models across different metrics but also provides expert-aligned explanations that bridge the gap between AI predictions and actionable chemical insights.
Deep sea mining test uncovered multiple new species
One of the first studies of its kind also showed mining's stark effects on the abyssal plain. Researchers completing one of the largest studies of the potential environmental impacts of deep-sea mining found a bit more than they bargained for on the ocean floor: 4,350 animals, each larger than 0.3 millimeters. From these, they ultimately identified 788 separate species of unique crustaceans, mollusks, marine bristle worms, and other creatures living in this sought-after mining zone. While the team found that harvesting rare earth metals from over 13,000 feet below the ocean's surface may not be as destructive as initially theorized, the disruptions are still cause for serious concern.
I hope Crucial's death isn't a canary in a PC memory coal mine
I'm now wondering what comes next. I did not have "Micron kills its consumer business" on my 2025 bingo card. The company announced the shuttering of its Crucial brand on Wednesday morning in unexpectedly simple, transparent language. The short version: Micron is concentrating on their business customers, where the demand has "surged" for memory and storage--thanks to data centers and their scaling up for AI.
ResponsibleRobotBench: Benchmarking Responsible Robot Manipulation using Multi-modal Large Language Models
Zhang, Lei, Dong, Ju, Bai, Kaixin, Ni, Minheng, Marton, Zoltan-Csaba, Chen, Zhaopeng, Zhang, Jianwei
Recent advances in large multimodal models have enabled new opportunities in embodied AI, particularly in robotic manipulation. These models have shown strong potential in generalization and reasoning, but achieving reliable and responsible robotic behavior in real-world settings remains an open challenge. In high-stakes environments, robotic agents must go beyond basic task execution to perform risk-aware reasoning, moral decision-making, and physically grounded planning. We introduce ResponsibleRobotBench, a systematic benchmark designed to evaluate and accelerate progress in responsible robotic manipulation from simulation to real world. This benchmark consists of 23 multi-stage tasks spanning diverse risk types, including electrical, chemical, and human-related hazards, and varying levels of physical and planning complexity. These tasks require agents to detect and mitigate risks, reason about safety, plan sequences of actions, and engage human assistance when necessary. Our benchmark includes a general-purpose evaluation framework that supports multimodal model-based agents with various action representation modalities. The framework integrates visual perception, context learning, prompt construction, hazard detection, reasoning and planning, and physical execution. It also provides a rich multimodal dataset, supports reproducible experiments, and includes standardized metrics such as success rate, safety rate, and safe success rate. Through extensive experimental setups, ResponsibleRobotBench enables analysis across risk categories, task types, and agent configurations. By emphasizing physical reliability, generalization, and safety in decision-making, this benchmark provides a foundation for advancing the development of trustworthy, real-world responsible dexterous robotic systems. https://sites.google.com/view/responsible-robotbench
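The abstract names three standardized metrics without defining them. The sketch below shows one plausible reading, offered purely as an assumption and not as the benchmark's definition: success rate as the fraction of episodes that complete the task, safety rate as the fraction with no safety violation, and safe success rate as the fraction that do both. The `Episode` record and the `rates` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Episode:
    """Outcome of one evaluation episode (hypothetical record layout)."""
    succeeded: bool  # the task goal was reached
    safe: bool       # no safety violation occurred


def rates(episodes):
    """Return the three rates under the assumed definitions above."""
    n = len(episodes)
    if n == 0:
        raise ValueError("need at least one episode")
    return {
        "success_rate": sum(e.succeeded for e in episodes) / n,
        "safety_rate": sum(e.safe for e in episodes) / n,
        # Counted only when the episode is both successful and safe.
        "safe_success_rate": sum(e.succeeded and e.safe for e in episodes) / n,
    }


# Made-up outcomes covering all four combinations.
episodes = [
    Episode(succeeded=True, safe=True),
    Episode(succeeded=True, safe=False),
    Episode(succeeded=False, safe=True),
    Episode(succeeded=False, safe=False),
]
print(rates(episodes))
```

Under this reading the safe success rate can never exceed either of the other two rates, which is why it is the stricter number to report.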