Goto

Collaborating Authors

 Energy


The Download: aging clocks, and repairing the internet

MIT Technology Review

Plus: California's AI safety bill has passed into law Wrinkles and gray hairs aside, it can be difficult to know how well--or poorly--someone's body is truly aging. A person who develops age-related diseases earlier in life, or has other biological changes associated with aging, might be considered "biologically older" than a similar-age person who doesn't have those changes. Some 80-year-olds will be weak and frail, while others are fit and active. Over the past decade, scientists have been uncovering new methods of looking at the hidden ways our bodies are aging. And what they've found is changing our understanding of aging itself. Can we repair the internet?


Generative artificial intelligence improves projections of climate extremes

arXiv.org Artificial Intelligence

Climate change is amplifying extreme weather and climate events worldwide [1]. Anthropogenic greenhouse gas emissions have disrupted the Earth's climate system, driving more frequent and severe heatwaves [2], cold spells [3], heavy precipitation [4], agricultural droughts [5], and tropical cyclones (TCs) [6]. Between 2016 and 2024, daily land temperature records show that extreme heat events occurred over four times more often than expected, while cold records declined by half [7]. These unprecedented shifts threaten human health [8, 9], infrastructure [10, 11], food security [12], biodiversity [13], and global economies [14, 15]. Therefore, reliable climate projections are essential for effective mitigation and adaptation strategies [16-18]. The Coupled Model Intercomparison Project (CMIP) [19] provides a foundation for global climate projections. Since its launch in 1995, CMIP has coordinated systematic evaluation of coupled general circulation models (GCMs). CMIP5 introduced Representative Concentration Pathways (RCPs), while CMIP6 extended this framework by incorporating Shared Socioeconomic Pathways (SSPs) through ScenarioMIP, enabling consistent simulations of emissions and socioeconomic trajectories to 2100 and facilitating integrated assessment of climate risks [20]. These advances have greatly enhanced the scientific and policy relevance of climate projections.


Load Balancing Mixture of Experts with Similarity Preserving Routers

arXiv.org Artificial Intelligence

Sparse Mixture of Experts (MoE) models offer a scalable and efficient architecture for training large neural networks by activating only a subset of parameters ("experts") for each input. A learned router computes a distribution over these experts, and assigns input tokens to a small subset. However, without auxiliary balancing mechanisms, routers often converge to using only a few experts, severely limiting model capacity and degrading performance. Most current load balancing mechanisms encourage a distribution over experts that resembles a roughly uniform distribution of experts per token. During training, this can result in inconsistent routing behavior, resulting in the model spending its capacity to learn redundant knowledge. We address this by introducing a novel load balancing loss that preserves token-wise relational structure, encouraging consistent expert choices for similar inputs during training. Our experimental results show that applying our loss to the router results in 36% faster convergence and lower redundancy compared to a popular load balancing loss.


mCLM: A Modular Chemical Language Model that Generates Functional and Makeable Molecules

arXiv.org Artificial Intelligence

Despite their ability to understand chemical knowledge, large language models (LLMs) remain limited in their capacity to propose novel molecules with desired functions (e.g., drug-like properties). In addition, the molecules that LLMs propose can often be challenging to make, and are almost never compatible with automated synthesis approaches. To better enable the discovery of functional small molecules, LLMs need to learn a new molecular language that is more effective in predicting properties and inherently synced with automated synthesis technology. Current molecule LLMs are limited by representing molecules based on atoms. In this paper, we argue that just like tokenizing texts into meaning-bearing (sub-)word tokens instead of characters, molecules should be tokenized at the level of functional building blocks, i.e., parts of molecules that bring unique functions and serve as effective building blocks for real-world automated laboratory synthesis. This motivates us to propose mCLM, a modular Chemical-Language Model that comprises a bilingual language model that understands both natural language descriptions of functions and molecular blocks. mCLM front-loads synthesizability considerations while improving the predicted functions of molecules in a principled manner. mCLM, with only 3B parameters, achieves improvements in synthetic accessibility relative to 7 other leading generative AI methods including GPT-5. When tested on 122 out-of-distribution medicines using only building blocks/tokens that are compatible with automated modular synthesis, mCLM outperforms all baselines in property scores and synthetic accessibility. mCLM can also reason on multiple functions and iteratively self-improve to rescue drug candidates that failed late in clinical trials ("fallen angels").


QAMA: Scalable Quantum Annealing Multi-Head Attention Operator for Deep Learning

arXiv.org Artificial Intelligence

Attention mechanisms underpin modern deep learning, while the quadratic time and space complexity limit scalability for long sequences. To address this, Quantum Annealing Multi-Head Attention (QAMA) is proposed, a novel drop-in operator that reformulates attention as an energy-based Hamiltonian optimization problem. In this framework, token interactions are encoded into binary quadratic terms, and quantum annealing is employed to search for low-energy configurations that correspond to effective attention patterns. Unlike classical sparse or approximate attention methods that rely on hand-crafted heuristics, QAMA allows sparsity structures to emerge naturally from the optimization process. Theoretically, computational complexity is analysed through single-spin flip dynamics, providing time to solution runtime bounds that depend on the spectral properties of the annealing Hamiltonian. Empirically, evaluation on both natural language and vision benchmarks shows that, across tasks, accuracy deviates by at most 2.7 points from standard multi-head attention, while requiring only linear qubits in sequence length. Visualizations further reveal that the Hamiltonian penalty terms induce meaningful and interpretable sparsity across heads. Finally, deployment on a coherent Ising machine validates the feasibility of running QAMA on real quantum hardware, showing tangible inference-time reductions compared with classical implementations. These results highlight QAMA as a pioneering and scalable step toward integrating quantum optimization devices into deep neural architectures, providing a seamlessly integrable and hardware-compatible alternative to conventional attention mechanisms. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.


DiffStyleTS: Diffusion Model for Style Transfer in Time Series

arXiv.org Artificial Intelligence

Style transfer combines the content of one signal with the style of another. It supports applications such as data augmentation and scenario simulation, helping machine learning models generalize in data-scarce domains. While well developed in vision and language, style transfer methods for time series data remain limited. We introduce DiffTSST, a diffusion-based framework that disentangles a time series into content and style representations via convolutional encoders and recombines them through a self-supervised attention-based diffusion process. At inference, encoders extract content and style from two distinct series, enabling conditional generation of novel samples to achieve style transfer. We demonstrate both qualitatively and quantitatively that DiffTSST achieves effective style transfer. We further validate its real-world utility by showing that data augmentation with DiffTSST improves anomaly detection in data-scarce regimes.


Adap-RPF: Adaptive Trajectory Sampling for Robot Person Following in Dynamic Crowded Environments

arXiv.org Artificial Intelligence

Abstract-- Robot person following (RPF) is a core capability in human-robot interaction, enabling robots to assist users in daily activities, collaborative work, and other service scenarios. However, achieving practical RPF remains challenging due to frequent occlusions, particularly in dynamic and crowded environments. Existing approaches often rely on fixed-point following or sparse candidate-point selection with oversimplified heuristics, which cannot adequately handle complex occlusions caused by moving obstacles such as pedestrians. T o address these limitations, we propose an adaptive trajectory sampling method that generates dense candidate points within socially aware zones and evaluates them using a multi-objective cost function. Based on the optimal point, a person-following trajectory is estimated relative to the predicted motion of the target. We further design a prediction-aware model predictive path integral (MPPI) controller that simultaneously tracks this trajectory and proactively avoids collisions using predicted pedestrian motions. Extensive experiments show that our method outperforms state-of-the-art baselines in smoothness, safety, robustness, and human comfort, with its effectiveness further demonstrated on a mobile robot in real-world scenarios. I. INTRODUCTION Robot person following (RPF) is a fundamental capability in human-robot interaction (HRI), enabling a wide range of applications [1], [2], [3].


Gym-TORAX: Open-source software for integrating RL with plasma control simulators

arXiv.org Artificial Intelligence

This paper presents Gym-TORAX, a Python package enabling the implementation of Reinforcement Learning (RL) environments for simulating plasma dynamics and control in tokamaks. Users define succinctly a set of control actions and observations, and a control objective from which Gym-TORAX creates a Gymnasium environment that wraps TORAX for simulating the plasma dynamics. The objective is formulated through rewards depending on the simulated state of the plasma and control action to optimize specific characteristics of the plasma, such as performance and stability. The resulting environment instance is then compatible with a wide range of RL algorithms and libraries and will facilitate RL research in plasma control. In its current version, one environment is readily available, based on a ramp-up scenario of the International Thermonuclear Experimental Reactor (ITER).


DemoHLM: From One Demonstration to Generalizable Humanoid Loco-Manipulation

arXiv.org Artificial Intelligence

Loco-manipulation is a fundamental challenge for humanoid robots to achieve versatile interactions in human environments. Although recent studies have made significant progress in humanoid whole-body control, loco-manipulation remains underexplored and often relies on hard-coded task definitions or costly real-world data collection, which limits autonomy and generalization. We present DemoHLM, a framework for humanoid loco-manipulation that enables generalizable loco-manipulation on a real humanoid robot from a single demonstration in simulation. DemoHLM adopts a hierarchy that integrates a low-level universal whole-body controller with high-level manipulation policies for multiple tasks. The whole-body controller maps whole-body motion commands to joint torques and provides omnidirectional mobility for the humanoid robot. The manipulation policies, learned in simulation via our data generation and imitation learning pipeline, command the whole-body controller with closed-loop visual feedback to execute challenging loco-manipulation tasks. Experiments show a positive correlation between the amount of synthetic data and policy performance, underscoring the effectiveness of our data generation pipeline and the data efficiency of our approach. Real-world experiments on a Unitree G1 robot equipped with an RGB-D camera validate the sim-to-real transferability of DemoHLM, demonstrating robust performance under spatial variations across ten loco-manipulation tasks.


Enforcing convex constraints in Graph Neural Networks

arXiv.org Artificial Intelligence

Many machine learning applications require outputs that satisfy complex, dynamic constraints. This task is particularly challenging in Graph Neural Network models due to the variable output sizes of graph-structured data. In this paper, we introduce ProjNet, a Graph Neural Network framework which satisfies input-dependant constraints. ProjNet combines a sparse vector clipping method with the Component-Averaged Dykstra (CAD) algorithm, an iterative scheme for solving the best-approximation problem. We establish a convergence result for CAD and develop a GPU-accelerated implementation capable of handling large-scale inputs efficiently. To enable end-to-end training, we introduce a surrogate gradient for CAD that is both computationally efficient and better suited for optimization than the exact gradient. We validate ProjNet on four classes of constrained optimisation problems: linear programming, two classes of non-convex quadratic programs, and radio transmit power optimization, demonstrating its effectiveness across diverse problem settings.