 Materials


A journey through the hyper-political world of microchips

The Guardian

A small town in the Netherlands hosts the only factory that produces the only chip-making machines that generate a type of light found nowhere naturally on Earth: extreme ultraviolet, a light emitted by young stars in outer space. This light, known as EUV, is the only way to make one of the world's most valuable and important technologies at scale: cutting-edge semiconductor chips. The factory is forbidden from selling its EUV machines to China. Below we explain how the chips are made, why they have become the focus of the US-China trade wars, how Taiwan was drawn into the maelstrom, and what could come next. The answers take us from deep underground to outer space, from the dirtiest places in the world to the cleanest, from the hottest temperatures to the coldest, from man-made structures smaller than a virus to equipment so large it takes three planes to move, and finally, to a state in physics that is two opposites at the same time.


DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking

arXiv.org Artificial Intelligence

Designing solutions for complex engineering challenges is crucial in human production activities. However, previous research in the retrieval-augmented generation (RAG) field has not sufficiently addressed tasks related to the design of complex engineering solutions. To fill this gap, we introduce a new benchmark, SolutionBench, to evaluate a system's ability to generate complete and feasible solutions for engineering problems with multiple complex constraints. To further advance the design of complex engineering solutions, we propose a novel system, SolutionRAG, that leverages the tree-based exploration and bi-point thinking mechanism to generate reliable solutions. Extensive experimental results demonstrate that SolutionRAG achieves state-of-the-art (SOTA) performance on the SolutionBench, highlighting its potential to enhance the automation and reliability of complex engineering solution design in real-world applications.


Beyond Demographics: Fine-tuning Large Language Models to Predict Individuals' Subjective Text Perceptions

arXiv.org Artificial Intelligence

People naturally vary in their annotations for subjective questions and some of this variation is thought to be due to the person's sociodemographic characteristics. LLMs have also been used to label data, but recent work has shown that models perform poorly when prompted with sociodemographic attributes, suggesting limited inherent sociodemographic knowledge. Here, we ask whether LLMs can be trained to be accurate sociodemographic models of annotator variation. Using a curated dataset of five tasks with standardized sociodemographics, we show that models do improve in sociodemographic prompting when trained but that this performance gain is largely due to models learning annotator-specific behaviour rather than sociodemographic patterns. Across all tasks, our results suggest that models learn little meaningful connection between sociodemographics and annotation, raising doubts about the current use of LLMs for simulating sociodemographic variation and behaviour.


Forecasting Monthly Residential Natural Gas Demand Using Just-In-Time-Learning Modeling

arXiv.org Machine Learning

ABSTRACT Natural gas (NG) is a relatively clean source of energy, particularly compared to other fossil fuels, and worldwide consumption of NG has been increasing almost linearly over the last two decades. A similar trend can be seen in Turkey, which also depends heavily on imports for a continuous NG supply. It is crucial to accurately forecast future NG demand (NGD) in Turkey, especially for import contracts; in this respect, forecasts of monthly NGD for the following year are of utmost importance. In the current study, historical monthly NG consumption data between 2014 and 2024, provided by SOCAR, the local residential NG distribution company for two cities in Turkey, Bursa and Kayseri, were used to produce out-of-sample monthly NGD forecasts for a period of one year and nine months using various time series models, including SARIMA and ETS models, and a novel proposed machine learning method. The proposed method, named Just-in-Time-Learning-Gaussian Process Regression (JITL-GPR), uses a novel feature representation for the past NG demand values; instead of using past demand values as column-wise separate features, they are placed on a two-dimensional (2-D) grid of year-month values. For each test point, a kernel function, tailored for the NGD predictions, is used in GPR to predict the query point. Since a model is constructed separately for each test point, the proposed method is, indeed, an example of JITL. The JITL-GPR method is easy to use and optimize, and offers a reduction in forecast errors compared to traditional time series methods and a state-of-the-art combination model; therefore, it is a promising tool for NGD forecasting in similar settings.

INTRODUCTION In the last few decades, there has been a shift in energy sources from fossil fuels to cleaner energy sources, such as wind and solar energy, mainly due to environmental concerns and related government regulations. However, these latter sources are dependent on weather conditions and require integration with grid technologies for continuous power generation. Natural gas (NG) typically consists of up to ~95% methane and 2-2.5% ethane-hexane+, with the remainder consisting of nitrogen, CO [...] NG power plants are easy to build and highly reliable, making them invaluable for "clean" energy production. On the other hand, most countries depend on imports to maintain their NG supplies, and there is a delicate balance between imports and domestic demand. Storing excess imported gas above actual demand is difficult and would result in economic losses, while importing less than actual demand could result in a nationwide shortage.
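The grid-based feature representation and per-query model fitting described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the kernel (an RBF over years multiplied by a periodic kernel over months), the `window` rule for selecting recent training points, the function names, and the synthetic demand series.

```python
import numpy as np

def grid_kernel(a, b, year_scale=3.0, month_scale=1.0):
    """Assumed kernel on (year, month) points: RBF in years times periodic in months."""
    dy = a[:, None, 0] - b[None, :, 0]
    dm = a[:, None, 1] - b[None, :, 1]
    k_year = np.exp(-0.5 * (dy / year_scale) ** 2)
    k_month = np.exp(-2.0 * np.sin(np.pi * dm / 12.0) ** 2 / month_scale ** 2)
    return k_year * k_month

def jitl_gpr_predict(grid, demand, query, window=5, noise=1e-2):
    """Fit a fresh GP for one query point (just-in-time) and return its posterior mean."""
    grid = np.asarray(grid, dtype=float)
    demand = np.asarray(demand, dtype=float)
    query = np.asarray(query, dtype=float)
    # Hypothetical relevance rule: keep only the last `window` years before the query.
    recent = grid[:, 0] >= query[0] - window
    x, y = grid[recent], demand[recent]
    offset = y.mean()  # constant prior mean
    K = grid_kernel(x, x) + noise * np.eye(len(x))
    k_star = grid_kernel(query[None, :], x)
    weights = np.linalg.solve(K, y - offset)
    return float(offset + (k_star @ weights)[0])

# Synthetic seasonal series on a year-month grid (illustrative data only).
grid = [(yr, mo) for yr in range(2014, 2023) for mo in range(1, 13)]
demand = [100 + 50 * np.cos(2 * np.pi * (mo - 1) / 12) + 2 * (yr - 2014)
          for yr, mo in grid]

january = jitl_gpr_predict(grid, demand, (2023, 1))
july = jitl_gpr_predict(grid, demand, (2023, 7))
print(january, july)
```

Because each observation is indexed by its (year, month) position rather than by a fixed set of lagged columns, the kernel can relate a query month both to the same month in earlier years and to neighbouring months.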


Reinforcement Learning with Curriculum-inspired Adaptive Direct Policy Guidance for Truck Dispatching

arXiv.org Artificial Intelligence

Efficient truck dispatching via Reinforcement Learning (RL) in open-pit mining is often hindered by reliance on complex reward engineering and value-based methods. This paper introduces Curriculum-inspired Adaptive Direct Policy Guidance, a novel curriculum learning strategy for policy-based RL to address these issues. We adapt Proximal Policy Optimization (PPO) for mine dispatching's uneven decision intervals using time deltas in Temporal Difference and Generalized Advantage Estimation, and employ a Shortest Processing Time teacher policy for guided exploration via policy regularization and adaptive guidance. Evaluations in OpenMines demonstrate our approach yields a 10% performance gain and faster convergence over standard PPO across sparse and dense reward settings, showcasing improved robustness to reward design. This direct policy guidance method provides a general and effective curriculum learning technique for RL-based truck dispatching, enabling future work on advanced architectures.
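One plausible reading of "time deltas in Temporal Difference and Generalized Advantage Estimation" is to raise the discount factors to the elapsed time between decisions. The sketch below follows that reading; it is an assumption for illustration, not the paper's stated formulation, and the function name and arguments are hypothetical.

```python
def gae_with_time_deltas(rewards, values, dts, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation with per-step elapsed times.

    rewards: reward received after each decision
    values:  value estimates, one longer than rewards (last entry bootstraps)
    dts:     elapsed time between decision t and decision t+1
    Assumed rule: discount by gamma**dt instead of gamma per step.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        # TD error with a time-aware discount on the next state's value.
        delta = rewards[t] + gamma ** dts[t] * values[t + 1] - values[t]
        # Accumulate future TD errors, decayed according to the elapsed time.
        running = delta + (gamma * lam) ** dts[t] * running
        advantages[t] = running
    return advantages

# With unit intervals this reduces to standard GAE.
print(gae_with_time_deltas([1.0, 1.0], [0.0, 0.0, 0.0], [1, 1], gamma=0.5, lam=1.0))
# A longer gap before the second decision discounts its contribution more.
print(gae_with_time_deltas([1.0, 1.0], [0.0, 0.0, 0.0], [2, 1], gamma=0.5, lam=1.0))
```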


NANOGPT: A Query-Driven Large Language Model Retrieval-Augmented Generation System for Nanotechnology Research

arXiv.org Artificial Intelligence

This paper presents the development and application of a Large Language Model Retrieval-Augmented Generation (LLM-RAG) system tailored for nanotechnology research. The system leverages the capabilities of a sophisticated language model to serve as an intelligent research assistant, enhancing the efficiency and comprehensiveness of literature reviews in the nanotechnology domain. Central to this LLM-RAG system is its advanced query backend retrieval mechanism, which integrates data from multiple reputable sources. The system retrieves relevant literature by utilizing Google Scholar's advanced search, and scraping open-access papers from Elsevier, Springer Nature, and ACS Publications. This multifaceted approach ensures a broad and diverse collection of up-to-date scholarly articles and papers. The proposed system demonstrates significant potential in aiding researchers by providing a streamlined, accurate, and exhaustive literature retrieval process, thereby accelerating research advancements in nanotechnology. The effectiveness of the LLM-RAG system is validated through rigorous testing, illustrating its capability to significantly reduce the time and effort required for comprehensive literature reviews, while maintaining high accuracy and query relevance and outperforming standard, publicly available LLMs.


Map Space Belief Prediction for Manipulation-Enhanced Mapping

arXiv.org Artificial Intelligence

Searching for objects in cluttered environments requires selecting efficient viewpoints and manipulation actions to remove occlusions and reduce uncertainty in object locations, shapes, and categories. In this work, we address the problem of manipulation-enhanced semantic mapping, where a robot has to efficiently identify all objects in a cluttered shelf. Although Partially Observable Markov Decision Processes (POMDPs) are standard for decision-making under uncertainty, representing unstructured interactive worlds remains challenging in this formalism. To tackle this, we define a POMDP whose belief is summarized by a metric-semantic grid map and propose a novel framework that uses neural networks to perform map-space belief updates to reason efficiently and simultaneously about object geometries, locations, categories, occlusions, and manipulation physics. Further, to enable accurate information gain analysis, the learned belief updates should maintain calibrated estimates of uncertainty. Therefore, we propose Calibrated Neural-Accelerated Belief Updates (CNABUs) to learn a belief propagation model that generalizes to novel scenarios and provides confidence-calibrated predictions for unknown areas. Our experiments show that our novel POMDP planner improves map completeness and accuracy over existing methods in challenging simulations and successfully transfers to real-world cluttered shelves in zero-shot fashion.


Data-Driven and Theory-Guided Pseudo-Spectral Seismic Imaging Using Deep Neural Network Architectures

arXiv.org Artificial Intelligence

Full Waveform Inversion (FWI) reconstructs high-resolution subsurface models via multi-variate optimization but faces challenges with solver selection and data availability. Deep Learning (DL) offers a promising alternative, bridging data-driven and physics-based methods. While FWI in DL has been explored in the time domain, the pseudo-spectral approach remains underutilized, despite its success in classical FWI. This thesis integrates pseudo-spectral FWI into DL, formulating both data-driven and theory-guided approaches using Deep Neural Networks (DNNs) and Recurrent Neural Networks (RNNs). These methods were theoretically derived, tested on synthetic and Marmousi datasets, and compared with deterministic and time-domain approaches. Results show that data-driven pseudo-spectral DNNs outperform classical FWI in deeper and over-thrust regions due to their global approximation capability. Theory-guided RNNs yield greater accuracy, with lower error and better fault identification. While DNNs excel in velocity contrast recovery, RNNs provide superior edge definition and stability in shallow and deep sections. Beyond enhancing FWI performance, this research identifies broader applications of DL-based inversion and outlines future directions for these frameworks.


Graph Neural Networks embedded into Margules model for vapor-liquid equilibria prediction

arXiv.org Artificial Intelligence

Edgar Ivan Sanchez Medina (a; corresponding author, email address: sanchez@mpi-magdeburg.mpg.de), Kai Sundmacher (a, b)

(a) Process Systems Engineering, Max Planck Institute for Dynamics of Complex Technical Systems, Sandtorstraße 1, Magdeburg, 39106, Saxony-Anhalt, Germany
(b) Chair for Process Systems Engineering, Otto-von-Guericke University, Universitätsplatz 2, Magdeburg, 39106, Saxony-Anhalt, Germany

Abstract Predictive thermodynamic models are crucial for the early stages of product and process design. In this paper the performance of Graph Neural Networks (GNNs) embedded into a relatively simple excess Gibbs energy model, the extended Margules model, for predicting vapor-liquid equilibrium is analyzed. By comparing its performance against the established UNIFAC-Dortmund model it has been shown that GNNs embedded in Margules achieve an overall lower accuracy. However, higher accuracy is observed in the case of various types of binary mixtures. Moreover, since group contribution methods, like UNIFAC, are limited due to feasibility of molecular fragmentation or availability of parameters, the GNN in Margules model offers an alternative for VLE estimation. The findings establish a baseline for the predictive accuracy that simple excess Gibbs energy models combined with GNNs trained solely on infinite dilution data can achieve.

Keywords: graph neural networks, vapor-liquid equilibria, Margules, activity coefficients

1. Introduction Modeling vapor-liquid equilibria is essential for the development of most chemical processes. This is because many chemical processes operate under conditions where vapor and liquid phases interact. Although vapor-liquid
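To make the link between infinite dilution data and vapor-liquid equilibrium concrete, here is a minimal sketch using the standard two-parameter Margules equations together with modified Raoult's law. The paper's extended Margules model may differ from this textbook form. The parameter values and saturation pressures below are hypothetical placeholders; in the paper's setting the parameters would come from a GNN rather than being hard-coded.

```python
import math

def margules_gammas(x1, a12, a21):
    """Activity coefficients of a binary mixture from two-parameter Margules.

    a12 and a21 equal the log activity coefficients at infinite dilution
    of component 1 and component 2, respectively.
    """
    x2 = 1.0 - x1
    ln_g1 = x2 ** 2 * (a12 + 2.0 * (a21 - a12) * x1)
    ln_g2 = x1 ** 2 * (a21 + 2.0 * (a12 - a21) * x2)
    return math.exp(ln_g1), math.exp(ln_g2)

def bubble_point(x1, a12, a21, p1_sat, p2_sat):
    """Bubble pressure and vapor mole fraction via modified Raoult's law."""
    g1, g2 = margules_gammas(x1, a12, a21)
    partial_1 = x1 * g1 * p1_sat
    partial_2 = (1.0 - x1) * g2 * p2_sat
    pressure = partial_1 + partial_2
    return pressure, partial_1 / pressure

# Hypothetical parameters and saturation pressures (kPa), for illustration only.
pressure, y1 = bubble_point(x1=0.4, a12=0.6, a21=0.9, p1_sat=80.0, p2_sat=40.0)
print(pressure, y1)
```

In this form the two parameters are fixed entirely by the infinite-dilution limits, which is why a model trained only on infinite dilution activity coefficients can still produce predictions over the full composition range.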


AutoML for Multi-Class Anomaly Compensation of Sensor Drift

arXiv.org Artificial Intelligence

Addressing sensor drift is essential in industrial measurement systems, where precise data output is necessary for maintaining accuracy and reliability in monitoring processes, because drift progressively degrades the performance of machine learning models over time. Our findings indicate that the standard cross-validation method used in existing model training overestimates performance by inadequately accounting for drift. This is primarily because typical cross-validation techniques allow data instances to appear in both training and testing sets, thereby distorting the accuracy of the predictive evaluation. As a result, these models are unable to precisely predict future drift effects, compromising their ability to generalize and adapt to evolving data conditions. This paper presents two solutions: (1) a novel sensor drift compensation learning paradigm for validating models, and (2) automated machine learning (AutoML) techniques to enhance classification performance and compensate sensor drift. By employing strategies such as data balancing, meta-learning, automated ensemble learning, hyperparameter optimization, feature selection, and boosting, our AutoML-DC (Drift Compensation) model significantly improves classification performance against sensor drift. AutoML-DC further adapts effectively to varying drift severities.
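The validation issue raised above can be illustrated with a time-ordered split that never lets later measurements inform the evaluation of earlier ones. This is a generic forward-chaining scheme shown as an assumption; the abstract does not specify the paper's own validation paradigm, and the function name and `batch_ids` input are hypothetical.

```python
def forward_chaining_splits(batch_ids):
    """Yield (train_indices, test_indices) pairs that respect batch order.

    batch_ids: one chronological batch label per sample (e.g. collection month).
    Each split trains on every batch up to k and tests on the next batch only,
    so the test data is always collected later than the training data.
    """
    ordered = sorted(set(batch_ids))
    for k in range(1, len(ordered)):
        train_batches = set(ordered[:k])
        test_batch = ordered[k]
        train_idx = [i for i, b in enumerate(batch_ids) if b in train_batches]
        test_idx = [i for i, b in enumerate(batch_ids) if b == test_batch]
        yield train_idx, test_idx

# Five samples collected in three consecutive batches.
for train_idx, test_idx in forward_chaining_splits([1, 1, 2, 2, 3]):
    print(train_idx, test_idx)
```

Unlike shuffled k-fold cross-validation, every test batch here lies strictly after its training batches, so the reported score reflects how a model copes with drift it has not yet seen.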