Goto

Collaborating Authors

 Materials


Deep Learning to Automate Parameter Extraction and Model Fitting of Two-Dimensional Transistors

arXiv.org Artificial Intelligence

We present a deep learning approach to extract physical parameters (e.g., mobility, Schottky contact barrier height, defect profiles) of two-dimensional (2D) transistors from electrical measurements, enabling automated parameter extraction and technology computer-aided design (TCAD) fitting. To facilitate this task, we implement a simple data augmentation and pre-training approach by training a secondary neural network to approximate a physics-based device simulator. This method enables high-quality fits after training the neural network on electrical data generated from physics-based simulations of ~500 devices, a factor >40$\times$ fewer than other recent efforts. Consequently, fitting can be achieved by training on physically rigorous TCAD models, including complex geometry, self-consistent transport, and electrostatic effects, and is not limited to computationally inexpensive compact models. We apply our approach to reverse-engineer key parameters from experimental monolayer WS$_2$ transistors, achieving a median coefficient of determination ($R^2$) = 0.99 when fitting measured electrical data. We also demonstrate that this approach generalizes and scales well by reverse-engineering electrical data on high-electron-mobility transistors while fitting 35 parameters simultaneously. To facilitate future research on deep learning approaches for inverse transistor design, we have published our code and sample data sets online.


Modeling Membrane Degradation in PEM Electrolyzers with Physics-Informed Neural Networks

arXiv.org Artificial Intelligence

Proton exchange membrane (PEM) electrolyzers are pivotal for sustainable hydrogen production, yet their long-term performance is hindered by membrane degradation, which poses reliability and safety challenges. Therefore, accurate modeling of this degradation is essential for optimizing durability and performance. To address these concerns, traditional physics-based models have been developed, offering interpretability but requiring numerous parameters that are often difficult to measure and calibrate. Conversely, data-driven approaches, such as machine learning, offer flexibility but may lack physical consistency and generalizability. To address these limitations, this study presents the first application of Physics-Informed Neural Networks (PINNs) to model membrane degradation in PEM electrolyzers. The proposed PINN framework couples two ordinary differential equations, one modeling membrane thinning via a first-order degradation law and another governing the time evolution of the cell voltage under membrane degradation. Results demonstrate that the PINN accurately captures the long-term system's degradation dynamics while preserving physical interpretability with limited noisy data. Consequently, this work introduces a novel hybrid modeling approach for estimating and understanding membrane degradation mechanisms in PEM electrolyzers, offering a foundation for more robust predictive tools in electrochemical system diagnostics.


A Representation Engineering Perspective on the Effectiveness of Multi-Turn Jailbreaks

arXiv.org Artificial Intelligence

Recent research has demonstrated that state-of-the-art LLMs and defenses remain susceptible to multi-turn jailbreak attacks. These attacks require only closed-box model access and are often easy to perform manually, posing a significant threat to the safe and secure deployment of LLM-based systems. We study the effectiveness of the Crescendo multi-turn jailbreak at the level of intermediate model representations and find that safety-aligned LMs often represent Crescendo responses as more benign than harmful, especially as the number of conversation turns increases. Our analysis indicates that at each turn, Crescendo prompts tend to keep model outputs in a "benign" region of representation space, effectively tricking the model into fulfilling harmful requests. Further, our results help explain why single-turn jailbreak defenses like circuit breakers are generally ineffective against multi-turn attacks, motivating the development of mitigations that address this generalization gap.


Quantum-enhanced supercomputers are starting to do chemistry

New Scientist

A quantum computer and conventional supercomputer that work together could become an invaluable tool for understanding chemicals. A collaboration between IBM and the Japanese scientific institute RIKEN has now established one path to getting there. Predicting what a molecule will do within a reaction – for instance, as part of a medical treatment or an industrial catalyst – often hinges on understanding its electrons' quantum states. Quantum computers could accelerate the process of computing these states, but in their current form, they are still prone to errors. Conventional supercomputers can catch those mistakes before they become a problem. In a joint statement to New Scientist, Seiji Yunoki and Mitsuhisa Sato at RIKEN said quantum computers can push traditional computers to new capabilities.


A Scalable and Quantum-Accurate Foundation Model for Biomolecular Force Field via Linearly Tensorized Quadrangle Attention

arXiv.org Artificial Intelligence

Accurate atomistic biomolecular simulations are vital for disease mechanism understanding, drug discovery, and biomaterial design, but existing simulation methods exhibit significant limitations. Classical force fields are efficient but lack accuracy for transition states and fine conformational details critical in many chemical and biological processes. Quantum Mechanics (QM) methods are highly accurate but computationally infeasible for large-scale or long-time simulations. AI-based force fields (AIFFs) aim to achieve QM-level accuracy with efficiency but struggle to balance many-body modeling complexity, accuracy, and speed, often constrained by limited training data and insufficient validation for generalizability. To overcome these challenges, we introduce LiTEN, a novel equivariant neural network with Tensorized Quadrangle Attention (TQA). TQA efficiently models three- and four-body interactions with linear complexity by reparameterizing high-order tensor features via vector operations, avoiding costly spherical harmonics. Building on LiTEN, LiTEN-FF is a robust AIFF foundation model, pre-trained on the extensive nablaDFT dataset for broad chemical generalization and fine-tuned on SPICE for accurate solvated system simulations. LiTEN achieves state-of-the-art (SOTA) performance across most evaluation subsets of rMD17, MD22, and Chignolin, outperforming leading models such as MACE, NequIP, and EquiFormer. LiTEN-FF enables the most comprehensive suite of downstream biomolecular modeling tasks to date, including QM-level conformer searches, geometry optimization, and free energy surface construction, while offering 10x faster inference than MACE-OFF for large biomolecules (~1000 atoms). In summary, we present a physically grounded, highly efficient framework that advances complex biomolecular modeling, providing a versatile foundation for drug discovery and related applications.


Modular Soft Wearable Glove for Real-Time Gesture Recognition and Dynamic 3D Shape Reconstruction

arXiv.org Artificial Intelligence

With the increasing demand for human-computer interaction (HCI), flexible wearable gloves have emerged as a promising solution in virtual reality, medical rehabilitation, and industrial automation. However, the current technology still has problems like insufficient sensitivity and limited durability, which hinder its wide application. This paper presents a highly sensitive, modular, and flexible capacitive sensor based on line-shaped electrodes and liquid metal (EGaIn), integrated into a sensor module tailored to the human hand's anatomy. The proposed system independently captures bending information from each finger joint, while additional measurements between adjacent fingers enable the recording of subtle variations in inter-finger spacing. This design enables accurate gesture recognition and dynamic hand morphological reconstruction of complex movements using point clouds. Experimental results demonstrate that our classifier based on Convolution Neural Network (CNN) and Multilayer Perceptron (MLP) achieves an accuracy of 99.15% across 30 gestures. Meanwhile, a transformer-based Deep Neural Network (DNN) accurately reconstructs dynamic hand shapes with an Average Distance (AD) of 2.076\pm3.231 mm, with the reconstruction accuracy at individual key points surpassing SOTA benchmarks by 9.7% to 64.9%. The proposed glove shows excellent accuracy, robustness and scalability in gesture recognition and hand reconstruction, making it a promising solution for next-generation HCI systems.


Evaluating Pavement Deterioration Rates Due to Flooding Events Using Explainable AI

arXiv.org Artificial Intelligence

Flooding can damage pavement infrastructure significantly, causing both immediate and long-term structural and functional issues. This research investigates how flooding events affect pavement deterioration, specifically focusing on measuring pavement roughness by the International Roughness Index (IRI). To quantify these effects, we utilized 20 years of pavement condition data from TxDOT's PMIS database, which is integrated with flood event data, including duration and spatial extent. Statistical analyses were performed to compare IRI values before and after flooding and to calculate the deterioration rates influenced by flood exposure. Moreover, we applied Explainable Artificial Intelligence (XAI) techniques, such as SHapley Additive exPlanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME), to assess the impact of flooding on pavement performance. The results demonstrate that flood-affected pavements experience a more rapid increase in roughness compared to non-flooded sections. These findings emphasize the need for proactive flood mitigation strategies, including improved drainage systems, flood-resistant materials, and preventative maintenance, to enhance pavement resilience in vulnerable regions.


Stereotype Detection as a Catalyst for Enhanced Bias Detection: A Multi-Task Learning Approach

arXiv.org Artificial Intelligence

Bias and stereotypes in language models can cause harm, especially in sensitive areas like content moderation and decision-making. This paper addresses bias and stereotype detection by exploring how jointly learning these tasks enhances model performance. We introduce StereoBias, a unique dataset labeled for bias and stereotype detection across five categories: religion, gender, socio-economic status, race, profession, and others, enabling a deeper study of their relationship. Our experiments compare encoder-only models and fine-tuned decoder-only models using QLoRA. While encoder-only models perform well, decoder-only models also show competitive results. Crucially, joint training on bias and stereotype detection significantly improves bias detection compared to training them separately. Additional experiments with sentiment analysis confirm that the improvements stem from the connection between bias and stereotypes, not multi-task learning alone. These findings highlight the value of leveraging stereotype information to build fairer and more effective AI systems.


Generative flow-based warm start of the variational quantum eigensolver

arXiv.org Machine Learning

Hybrid quantum-classical algorithms like the variational quantum eigensolver (VQE) show promise for quantum simulations on near-term quantum devices, but are often limited by complex objective functions and expensive optimization procedures. Here, we propose Flow-VQE, a generative framework leveraging conditional normalizing flows with parameterized quantum circuits to efficiently generate high-quality variational parameters. By embedding a generative model into the VQE optimization loop through preference-based training, Flow-VQE enables quantum gradient-free optimization and offers a systematic approach for parameter transfer, accelerating convergence across related problems through warm-started optimization. We compare Flow-VQE to a number of standard benchmarks through numerical simulations on molecular systems, including hydrogen chains, water, ammonia, and benzene. We find that Flow-VQE outperforms baseline optimization algorithms, achieving computational accuracy with fewer circuit evaluations (improvements range from modest to more than two orders of magnitude) and, when used to warm-start the optimization of new systems, accelerates subsequent fine-tuning by up to 50-fold compared with Hartree--Fock initialization. Therefore, we believe Flow-VQE can become a pragmatic and versatile paradigm for leveraging generative modeling to reduce the costs of variational quantum algorithms.


State and Memory is All You Need for Robust and Reliable AI Agents

arXiv.org Artificial Intelligence

Large language models (LLMs) have enabled powerful advances in natural language understanding and generation. Yet their application to complex, real-world scientific workflows remain limited by challenges in memory, planning, and tool integration. Here, we introduce SciBORG (Scientific Bespoke Artificial Intelligence Agents Optimized for Research Goals), a modular agentic framework that allows LLM-based agents to autonomously plan, reason, and achieve robust and reliable domain-specific task execution. Agents are constructed dynamically from source code documentation and augmented with finite-state automata (FSA) memory, enabling persistent state tracking and context-aware decision-making. This approach eliminates the need for manual prompt engineering and allows for robust, scalable deployment across diverse applications via maintaining context across extended workflows and to recover from tool or execution failures. We validate SciBORG through integration with both physical and virtual hardware, such as microwave synthesizers for executing user-specified reactions, with context-aware decision making and demonstrate its use in autonomous multi-step bioassay retrieval from the PubChem database utilizing multi-step planning, reasoning, agent-to-agent communication and coordination for execution of exploratory tasks. Systematic benchmarking shows that SciBORG agents achieve reliable execution, adaptive planning, and interpretable state transitions. Our results show that memory and state awareness are critical enablers of agentic planning and reliability, offering a generalizable foundation for deploying AI agents in complex environments.