Energy
FlowEO: Generative Unsupervised Domain Adaptation for Earth Observation
Bellier, Georges Le, Audebert, Nicolas
The increasing availability of Earth observation data offers unprecedented opportunities for large-scale environmental monitoring and analysis. However, these datasets are inherently heterogeneous, stemming from diverse sensors, geographical regions, acquisition times, and atmospheric conditions. Distribution shifts between training and deployment domains severely limit the generalization of pretrained remote sensing models, making unsupervised domain adaptation (UDA) crucial for real-world applications. We introduce FlowEO, a novel framework that leverages generative models for image-space UDA in Earth observation. We leverage flow matching to learn a semantically preserving mapping that transports from the source to the target image distribution. This allows us to tackle challenging domain adaptation configurations for classification and semantic segmentation of Earth observation images. We conduct extensive experiments across four datasets covering adaptation scenarios such as SAR to optical translation and temporal and semantic shifts caused by natural disasters. Experimental results demonstrate that FlowEO outperforms existing image translation approaches for domain adaptation while achieving on-par or better perceptual image quality, highlighting the potential of flow-matching-based UDA for remote sensing.
AREA3D: Active Reconstruction Agent with Unified Feed-Forward 3D Perception and Vision-Language Guidance
Xu, Tianling, Gan, Shengzhe, Gu, Leslie, Li, Yuelei, Zhan, Fangneng, Pfister, Hanspeter
Active 3D reconstruction enables an agent to autonomously select viewpoints to build accurate and complete scene geometry efficiently, rather than passively reconstructing scenes from pre-collected images. Existing active reconstruction methods often rely on geometric heuristics, which may result in redundant observations without improving reconstruction quality. T o address this, we propose AREA3D, an active reconstruction agent for 3D reconstruction by leveraging feed-forward 3D models and vision-language guidance. The framework decouples view uncertainty modeling from feed-forward reconstruction, enabling precise uncertainty estimation without online optimization. Moreover, the integrated Vision-Language Model provides high-level semantic guidance that guides exploration beyond purely geometric cues. Extensive experiments on both scene-level and object-level benchmarks demonstrate that AREA3D achieves state-of-the-art reconstruction accuracy, especially in sparse views.
BEP: A Binary Error Propagation Algorithm for Binary Neural Networks Training
Colombo, Luca, Pittorino, Fabrizio, Zambon, Daniele, Baldassi, Carlo, Roveri, Manuel, Alippi, Cesare
Binary Neural Networks (BNNs), which constrain both weights and activations to binary values, offer substantial reductions in computational complexity, memory footprint, and energy consumption. These advantages make them particularly well suited for deployment on resource-constrained devices. However, training BNNs via gradient-based optimization remains challenging due to the discrete nature of their variables. The dominant approach, quantization-aware training, circumvents this issue by employing surrogate gradients. Yet, this method requires maintaining latent full-precision parameters and performing the backward pass with floating-point arithmetic, thereby forfeiting the efficiency of binary operations during training. While alternative approaches based on local learning rules exist, they are unsuitable for global credit assignment and for back-propagating errors in multi-layer architectures. This paper introduces Binary Error Propagation (BEP), the first learning algorithm to establish a principled, discrete analog of the backpropagation chain rule. This mechanism enables error signals, represented as binary vectors, to be propagated backward through multiple layers of a neural network. BEP operates entirely on binary variables, with all forward and backward computations performed using only bitwise operations. Crucially, this makes BEP the first solution to enable end-to-end binary training for recurrent neural network architectures. We validate the effectiveness of BEP on both multi-layer perceptrons and recurrent neural networks, demonstrating gains of up to +6.89% and +10.57% in test accuracy, respectively. The proposed algorithm is released as an open-source repository.
Wasserstein Distributionally Robust Nash Equilibrium Seeking with Heterogeneous Data: A Lagrangian Approach
Wang, Zifan, Pantazis, Georgios, Grammatico, Sergio, Zavlanos, Michael M., Johansson, Karl H.
We study a class of distributionally robust games where agents are allowed to heterogeneously choose their risk aversion with respect to distributional shifts of the uncertainty. In our formulation, heterogeneous Wasserstein ball constraints on each distribution are enforced through a penalty function leveraging a Lagrangian formulation. We then formulate the distributionally robust game as a variational inequality problem, and show that under certain assumptions the original seemingly infinite-dimensional Nash equilibrium problem is equivalent to a multi-agent but finite-dimensional variational inequality problem with a strongly monotone mapping. Due to the inner maximization problem, it is however still challenging to calculate a distributionally robust Nash equilibrium. To this end, we design an approximate Nash equilibrium seeking algorithm and prove convergence of the average regret to a quantity that diminishes with the number of iterations, thus learning the desired equilibrium up to an a priori specified accuracy. Numerical simulations corroborate our theoretical findings.
MOSS: Efficient and Accurate FP8 LLM Training with Microscaling and Automatic Scaling
Zhang, Yu, Zhen, Hui-Ling, Yuan, Mingxuan, Yu, Bei
Training large language models with FP8 formats offers significant efficiency gains. However, the reduced numerical precision of FP8 poses challenges for stable and accurate training. Current frameworks preserve training performance using mixed-granularity quantization, i.e., applying per-group quantization for activations and per-tensor/block quantization for weights. While effective, per-group quantization requires scaling along the inner dimension of matrix multiplication, introducing additional dequantization overhead. Moreover, these frameworks often rely on just-in-time scaling to dynamically adjust scaling factors based on the current data distribution. However, this online quantization is inefficient for FP8 training, as it involves multiple memory reads and writes that negate the performance benefits of FP8. To overcome these limitations, we propose MOSS, a novel FP8 training framework that ensures both efficiency and numerical stability. MOSS introduces two key innovations: (1) a two-level microscaling strategy for quantizing sensitive activations, which balances precision and dequantization cost by combining a high-precision global scale with compact, power-of-two local scales; and (2) automatic scaling for weights in linear layers, which eliminates the need for costly max-reduction operations by predicting and adjusting scaling factors during training. Leveraging these techniques, MOSS enables efficient FP8 training of a 7B parameter model, achieving performance comparable to the BF16 baseline while achieving up to 34% higher training throughput. Large language models (LLMs) have demonstrated remarkable capabilities across diverse tasks, including reasoning, language understanding, and generation (Achiam et al., 2023; Grattafiori et al., 2024; Liu et al., 2024; Adler et al., 2024).
SustainDiffusion: Optimising the Social and Environmental Sustainability of Stable Diffusion Models
d'Aloisio, Giordano, Fadahunsi, Tosin, Choy, Jay, Moussa, Rebecca, Sarro, Federica
Background: Text-to-image generation models are widely used across numerous domains. Among these models, Stable Diffusion (SD) - an open-source text-to-image generation model - has become the most popular, producing over 12 billion images annually. However, the widespread use of these models raises concerns regarding their social and environmental sustainability. Aims: To reduce the harm that SD models may have on society and the environment, we introduce SustainDiffusion, a search-based approach designed to enhance the social and environmental sustainability of SD models. Method: SustainDiffusion searches the optimal combination of hyperparameters and prompt structures that can reduce gender and ethnic bias in generated images while also lowering the energy consumption required for image generation. Importantly, SustainDiffusion maintains image quality comparable to that of the original SD model. Results: We conduct a comprehensive empirical evaluation of SustainDiffusion, testing it against six different baselines using 56 different prompts. Our results demonstrate that SustainDiffusion can reduce gender bias in SD3 by 68%, ethnic bias by 59%, and energy consumption (calculated as the sum of CPU and GPU energy) by 48%. Additionally, the outcomes produced by SustainDiffusion are consistent across multiple runs and can be generalised to various prompts. Conclusions: With SustainDiffusion, we demonstrate how enhancing the social and environmental sustainability of text-to-image generation models is possible without fine-tuning or changing the model's architecture.
Real-Time Execution of Action Chunking Flow Policies
Black, Kevin, Galliker, Manuel Y., Levine, Sergey
Modern AI systems, especially those interacting with the physical world, increasingly require real-time performance. However, the high latency of state-of-the-art generalist models, including recent vision-language action models (VLAs), poses a significant challenge. While action chunking has enabled temporal consistency in high-frequency control tasks, it does not fully address the latency problem, leading to pauses or out-of-distribution jerky movements at chunk boundaries. This paper presents a novel inference-time algorithm that enables smooth asynchronous execution of action chunking policies. Our method, real-time chunking (RTC), is applicable to any diffusion- or flow-based VLA out of the box with no re-training. It generates the next action chunk while executing the current one, "freezing" actions guaranteed to execute and "inpainting" the rest. To test RTC, we introduce a new benchmark of 12 highly dynamic tasks in the Kinetix simulator, as well as evaluate 6 challenging real-world bimanual manipulation tasks. Results demonstrate that RTC is fast, performant, and uniquely robust to inference delay, significantly improving task throughput and enabling high success rates in precise tasks $\unicode{x2013}$ such as lighting a match $\unicode{x2013}$ even in the presence of significant latency. See https://pi.website/research/real_time_chunking for videos.
GEX: Democratizing Dexterity with Fully-Actuated Dexterous Hand and Exoskeleton Glove
Dong, Yunlong, Liu, Xing, Wan, Jun, Deng, Zelin
Abstract--This paper introduces GEX, an innovative low-cost dexterous manipulation system that combines the GX11 tri-finger anthropomorphic hand (11 DoF) with the EX12 tri-finger exoskeleton glove (12 DoF), forming a closed-loop teleopera-tion framework through kinematic retargeting for high-fidelity control. Both components employ modular 3D-printed finger designs, achieving ultra-low manufacturing costs while maintaining full actuation capabilities. This full-actuation architecture enables precise bidirectional kinematic calculations, substantially enhancing kinematic retargeting fidelity between the exoskeleton and robotic hand. The proposed system bridges the cost-performance gap in dexterous manipulation research, providing an accessible platform for acquiring high-quality demonstration data to advance embodied AI and dexterous robotic skill transfer learning. Hand dexterity is fundamental to human cognition, enabling active manipulation, tool use, and the way we learn from our environment.
A Trustworthy By Design Classification Model for Building Energy Retrofit Decision Support
Rempi, Panagiota, Pelekis, Sotiris, Tzortzis, Alexandros Menelaos, Spiliotis, Evangelos, Karakolis, Evangelos, Ntanos, Christos, Askounis, Dimitris
Improving energy efficiency in residential buildings is critical to combating climate change and reducing greenhouse gas emissions. Retrofitting existing buildings, which contribute a significant share of energy use, is therefore a key priority, especially in regions with outdated building stock. Artificial Intelligence (AI) and Machine Learning (ML) can automate retrofit decision-making and find retrofit strategies. However, their use faces challenges of data availability, model transparency, and compliance with national and EU AI regulations including the AI act, ethics guidelines and the ALTAI. This paper presents a trustworthy-by-design ML-based decision support framework that recommends energy efficiency strategies for residential buildings using minimal user-accessible inputs. The framework merges Conditional Tabular Generative Adversarial Networks (CTGAN) to augment limited and imbalanced data with a neural network-based multi-label classifier that predicts potential combinations of retrofit actions. To support explanation and trustworthiness, an Explainable AI (XAI) layer using SHapley Additive exPlanations (SHAP) clarifies the rationale behind recommendations and guides feature engineering. Two case studies validate performance and generalization: the first leveraging a well-established, large EPC dataset for England and Wales; the second using a small, imbalanced post-retrofit dataset from Latvia (RETROFIT-LAT). Results show that the framework can handle diverse data conditions and improve performance up to 53% compared to the baseline. Overall, the proposed framework provides a feasible, interpretable, and trustworthy AI system for building retrofit decision support through assured performance, usability, and transparency to aid stakeholders in prioritizing effective energy investments and support regulation-compliant, data-driven innovation in sustainable energy transition.
Chernobyl radiation shield 'lost safety function' after drone strike, UN watchdog says
Chernobyl radiation shield'lost safety function' after drone strike, UN watchdog says A protective shield covering the Chernobyl nuclear reactor in Ukraine can no longer provide its main containment function following a drone strike earlier this year, according to a UN watchdog. International Atomic Energy Agency (IAEA) inspectors found that the massive structure, built over the site of the 1986 nuclear disaster, had lost its primary safety functions including the confinement capability. In February, Ukraine accused Russia of targeting the power plant - a claim the Kremlin denied. The IAEA said repairs were essential to prevent further degradation of the nuclear shelter. However environmental expert Jim Smith told the BBC: It is not something to panic about.