 Energy


How Millie Dresselhaus paid it forward

MIT Technology Review

Encouraged early on by Nobel laureate Enrico Fermi, the "Queen of Carbon" laid the foundation for countless advances in nanotechnology--and mentored countless young scientists along the way. At MIT, Mildred Dresselhaus became a beloved professor who pushed her students to be their very best and provided support in ways big and small. Institute Professor Mildred "Millie" Dresselhaus forever altered our understanding of matter--the physical stuff of the universe that has mass and takes up space. Over 57 years at MIT, Dresselhaus also played a significant role in inspiring people to use this new knowledge to tackle some of the world's greatest challenges, from producing clean energy to curing cancer. Although she became an emerita professor in 2007, Dresselhaus, who taught electrical engineering and physics, remained actively involved in research and all other aspects of MIT life until her death in 2017. She would have been 95 this November.


Ukrainian city in total blackout after 'massive' Russian assault

BBC News

The Ukrainian city of Chernihiv is in total blackout following what the authorities describe as a massive assault by Russian missiles and drones, with hundreds of thousands of people affected. Across the wider Chernihiv region, four people are reported to have been killed as residential neighbourhoods were struck in the town of Novhorod-Siverskyi. Ten others were injured, including a 10-year-old girl. The country's most northerly region is the latest to be hit in an intensifying series of attacks on civilian infrastructure as Russia targets energy supplies, the rail network, homes and businesses in its full-scale invasion of Ukraine. "I personally heard the drones flying overhead," 55-year-old Oleksandr Babich said.


How Russia's new tactics pose new winter threat to Ukraine

Al Jazeera

The Russian drone strike was surgically precise and destroyed a giant transformer at a key power station in the Ukrainian capital. "There's nothing left to repair," Mykola Svyrydenko, who lives close to Thermal Station 5, a sprawling, Soviet-era structure with two giant steam pipes that provides electricity and heat to hundreds of thousands of Kyiv's residents, told Al Jazeera.


Frozen in Time: Parameter-Efficient Time Series Transformers via Reservoir-Induced Feature Expansion and Fixed Random Dynamics

arXiv.org Artificial Intelligence

Transformers are the de facto choice for sequence modelling, yet their quadratic self-attention and weak temporal bias can make long-range forecasting both expensive and brittle. We introduce FreezeTST, a lightweight hybrid that interleaves frozen random-feature (reservoir) blocks with standard trainable Transformer layers. The frozen blocks endow the network with rich nonlinear memory at no optimisation cost; the trainable layers learn to query this memory through self-attention. The design cuts trainable parameters and also lowers wall-clock training time, while leaving inference complexity unchanged. On seven standard long-term forecasting benchmarks, FreezeTST consistently matches or surpasses specialised variants such as Informer, Autoformer, and PatchTST, with substantially lower compute. Our results show that embedding reservoir principles within Transformers offers a simple, principled route to efficient long-term time-series prediction.
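As a rough illustration of what a frozen random-feature (reservoir) block could look like, the sketch below uses an echo-state-style recurrence with fixed random weights. The class name `FrozenReservoir`, the weight scaling, and the choice to concatenate the input with the reservoir state are assumptions made for illustration; the abstract does not specify the FreezeTST architecture at this level of detail.

```python
import math
import random


class FrozenReservoir:
    """Fixed random recurrent block: h_t = tanh(W_in x_t + W_rec h_{t-1}).

    Hypothetical sketch, not the FreezeTST implementation.
    """

    def __init__(self, input_dim, hidden_dim, scale=0.5, seed=0):
        rng = random.Random(seed)
        # Weights are drawn once and never updated (no trainable parameters).
        self.w_in = [[rng.uniform(-scale, scale) for _ in range(input_dim)]
                     for _ in range(hidden_dim)]
        # Dividing by hidden_dim keeps each row's absolute sum at most `scale` < 1,
        # a simple (assumed) way to keep the recurrence contractive.
        self.w_rec = [[rng.uniform(-scale, scale) / hidden_dim
                       for _ in range(hidden_dim)]
                      for _ in range(hidden_dim)]
        self.hidden_dim = hidden_dim

    def __call__(self, sequence):
        h = [0.0] * self.hidden_dim
        features = []
        for x in sequence:
            h = [math.tanh(sum(wi * xi for wi, xi in zip(self.w_in[j], x))
                           + sum(wr * hk for wr, hk in zip(self.w_rec[j], h)))
                 for j in range(self.hidden_dim)]
            # Feature expansion: keep the raw input and append the reservoir state.
            features.append(list(x) + h)
        return features


reservoir = FrozenReservoir(input_dim=2, hidden_dim=4)
expanded = reservoir([[0.1, 0.2], [0.3, -0.1], [0.0, 0.5]])
```

In a hybrid of the kind the abstract describes, features like `expanded` would be passed to trainable attention layers, and only those layers would receive gradient updates.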


Decision-focused Sensing and Forecasting for Adaptive and Rapid Flood Response: An Implicit Learning Approach

arXiv.org Artificial Intelligence

Timely and reliable decision-making is vital for flood emergency response, yet it remains severely hindered by limited and imprecise situational awareness due to various budget and data accessibility constraints. Traditional flood management systems often rely on in-situ sensors to calibrate remote sensing-based large-scale flood depth forecasting models, and further take flood depth estimates to optimize flood response decisions. However, these approaches often take fixed, decision task-agnostic strategies to decide where to put in-situ sensors (e.g., maximize overall information gain) and train flood forecasting models (e.g., minimize average forecasting errors), but overlook that systems with the same sensing gain and average forecasting errors may lead to distinct decisions. To address this, we introduce a novel decision-focused framework that strategically selects locations for in-situ sensor placement and optimizes spatio-temporal flood forecasting models to minimize downstream flood response decision regret. Our end-to-end pipeline integrates four components: a contextual scoring network, a differentiable sensor selection module under hard budget constraints, a spatio-temporal flood reconstruction and forecasting model, and a differentiable decision layer tailored to task-specific objectives. Central to our approach is the incorporation of Implicit Maximum Likelihood Estimation (I-MLE) to enable gradient-based learning over discrete sensor configurations, and probabilistic decision heads to enable differentiable approximation to various constrained disaster response tasks.
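The claim that forecasts with the same average error can lead to different decisions can be made concrete with a toy example. The decision rule (send one response team to the site with the deepest forecast flooding), the regret definition, and all numbers below are illustrative assumptions, not taken from the paper.

```python
def mean_abs_error(forecast, truth):
    """Average absolute forecasting error across sites."""
    return sum(abs(f - t) for f, t in zip(forecast, truth)) / len(truth)


def choose_site(depths):
    """Toy decision: send the single response team to the deepest site."""
    return max(range(len(depths)), key=lambda i: depths[i])


def decision_regret(forecast, truth):
    """True depth served by the ideal choice minus that served by our choice."""
    return truth[choose_site(truth)] - truth[choose_site(forecast)]


# Hypothetical flood depths in metres at three sites.
truth = [2.0, 1.0, 0.5]
forecast_a = [2.5, 1.5, 1.0]   # biased everywhere, but ranks sites correctly
forecast_b = [1.0, 1.5, 0.5]   # same average error, but ranks sites wrongly

for name, forecast in (("A", forecast_a), ("B", forecast_b)):
    print(name, mean_abs_error(forecast, truth), decision_regret(forecast, truth))
```

Both forecasts have a mean absolute error of 0.5, yet only forecast B sends the team to the wrong site. A decision-focused objective penalises that difference, while an average-error objective cannot see it.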


Accelerating Frontier MoE Training with 3D Integrated Optics

arXiv.org Artificial Intelligence

The unabated growth in AI workload demands is driving the need for concerted advances in compute, memory, and interconnect performance. As traditional semiconductor scaling slows, high-speed interconnects have emerged as the new scaling engine, enabling the creation of larger logical GPUs by linking many GPUs into a single, low-latency, high-bandwidth compute domain. While initial scale-up fabrics leveraged copper interconnects for their power and cost advantages, the maximum reach of passive electrical interconnects (approximately 1 meter) effectively limits the scale-up domain to within a single rack. The advent of 3D-stacked optics and logic offers a transformative, power-efficient scale-up solution for connecting hundreds of GPU packages (thousands of GPUs) across multiple data center racks. This work explores the design tradeoffs of scale-up technologies and demonstrates how frontier LLMs necessitate novel photonic solutions to achieve aggressive power and performance targets. We model the benefits of 3D CPO (Passage) enabled GPUs and switches within the scale-up domain when training Frontier Mixture of Experts (MoE) models exceeding one trillion parameters. Our results show that the substantial increases in bandwidth and radix enabled by 3D CPO allow for an 8X increase in scale-up capability.
The race to build larger, more sophisticated AI models is pushing the limits of existing infrastructure. At the chip and package level, GPUs are constrained by shoreline, yields and power. These challenges have led to the development of large high-bandwidth, low-latency scale-up pods. These pods effectively combine hundreds of GPUs into a single logical GPU to facilitate a variety of parallelism strategies. Approaches like Mixture of Experts (MoE) [1] have pushed scale-up networks to their limits due to copper reach (1 meter), which constrains the number of GPUs that can be connected within a single network hop. With MoEs, an ensemble of specialized sub-networks works together through sparse activations to increase model capacity without significantly increasing computational requirements. The outputs of the selected experts are combined to create the final result.
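The sparse activation and output combination described above can be sketched with a toy top-k router. The function names, the scalar "experts", and the choice to renormalise gate weights with a softmax over the selected experts are assumptions for illustration; the text does not describe the routing scheme of the models it studies.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x, router_logits, experts, k=2):
    """Route x to the k highest-scoring experts and mix their outputs."""
    ranked = sorted(range(len(experts)),
                    key=lambda i: router_logits[i], reverse=True)
    selected = ranked[:k]
    gates = softmax([router_logits[i] for i in selected])
    # Only the selected experts are evaluated (sparse activation).
    output = sum(g * experts[i](x) for g, i in zip(gates, selected))
    return output, selected


# Four hypothetical scalar experts standing in for sub-networks.
experts = [lambda v: v + 1.0, lambda v: 2.0 * v, lambda v: -v, lambda v: v * v]
y, used = moe_forward(3.0, [0.0, 2.0, 2.0, -1.0], experts, k=2)
```

In a distributed setting each expert typically lives on a different GPU, so every routed token implies cross-device traffic, which is the reason the bandwidth and reach of the scale-up network matter for MoE training.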


CBF-RL: Safety Filtering Reinforcement Learning in Training with Control Barrier Functions

arXiv.org Artificial Intelligence

Reinforcement learning (RL), while powerful and expressive, can often prioritize performance at the expense of safety. Yet safety violations can lead to catastrophic outcomes in real-world deployments. Control Barrier Functions (CBFs) offer a principled method to enforce dynamic safety -- traditionally deployed online via safety filters. While the result is safe behavior, the fact that the RL policy does not have knowledge of the CBF can lead to conservative behaviors. This paper proposes CBF-RL, a framework for generating safe behaviors with RL by enforcing CBFs in training. CBF-RL has two key attributes: (1) minimally modifying a nominal RL policy to encode safety constraints via a CBF term, and (2) safety filtering of the policy rollouts in training. Theoretically, we prove that continuous-time safety filters can be deployed via closed-form expressions on discrete-time roll-outs. Practically, we demonstrate that CBF-RL internalizes the safety constraints in the learned policy -- both enforcing safer actions and biasing towards safer rewards -- enabling safe deployment without the need for an online safety filter. We validate our framework through ablation studies on navigation tasks and on the Unitree G1 humanoid robot, where CBF-RL enables safer exploration, faster convergence, and robust performance under uncertainty, enabling the humanoid robot to avoid obstacles and climb stairs safely in real-world settings without a runtime safety filter.
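To give a sense of what a closed-form safety filter looks like, the sketch below minimally modifies a nominal action so that a single CBF constraint holds. It assumes single-integrator dynamics (the state derivative equals the action) and a circular obstacle with barrier h(x) = ||x - c||^2 - r^2; these modelling choices and the function name `cbf_filter` are illustrative assumptions, not the paper's formulation.

```python
def cbf_filter(x, u_nom, center, radius, alpha=1.0):
    """Closest action to u_nom satisfying grad_h(x) . u + alpha * h(x) >= 0.

    Assumes x_dot = u and that x is not exactly at the obstacle center.
    """
    h = sum((xi - ci) ** 2 for xi, ci in zip(x, center)) - radius ** 2
    grad = [2.0 * (xi - ci) for xi, ci in zip(x, center)]
    slack = sum(g * u for g, u in zip(grad, u_nom)) + alpha * h
    if slack >= 0.0:
        # The nominal action already satisfies the constraint: leave it alone.
        return list(u_nom)
    # Otherwise project onto the constraint boundary along grad_h.
    norm_sq = sum(g * g for g in grad)
    return [u - slack / norm_sq * g for u, g in zip(u_nom, grad)]


state = [2.0, 0.0]
toward_obstacle = cbf_filter(state, [-1.0, 0.0], center=[0.0, 0.0], radius=1.0)
away_from_obstacle = cbf_filter(state, [1.0, 0.0], center=[0.0, 0.0], radius=1.0)
```

The projection is the standard solution of a quadratic program with one affine constraint, so no numerical solver is needed. Here the action heading toward the obstacle is slowed while the one heading away passes through unchanged.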


Guided Multi-Fidelity Bayesian Optimization for Data-driven Controller Tuning with Digital Twins

arXiv.org Artificial Intelligence

We propose a guided multi-fidelity Bayesian optimization framework for data-efficient controller tuning that integrates corrected digital twin simulations with real-world measurements. The method targets closed-loop systems with limited-fidelity simulations or inexpensive approximations. To address model mismatch, we build a multi-fidelity surrogate with a learned correction model that refines digital twin estimates using real data. An adaptive cost-aware acquisition function balances expected improvement, fidelity, and sampling cost. Our method ensures adaptability as new measurements arrive. The digital twin accuracy is re-estimated, dynamically adapting both cross-source correlations and the acquisition function. This ensures that accurate simulations are used more frequently, while inaccurate simulation data are appropriately downweighted. Experiments on robotic drive hardware and supporting numerical studies demonstrate that our method enhances tuning efficiency compared to standard Bayesian optimization and multi-fidelity methods.
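One generic way to balance expected improvement, fidelity, and sampling cost is to scale the standard expected-improvement formula by a trust weight and divide by the evaluation cost. The sketch below does exactly that; the `trust` weight, the scoring rule, and all numbers are assumptions chosen for illustration and should not be read as the acquisition function proposed in the paper.

```python
import math


def expected_improvement(mean, std, best):
    """Expected improvement for minimisation under a Gaussian prediction."""
    if std <= 0.0:
        return max(best - mean, 0.0)
    z = (best - mean) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best - mean) * cdf + std * pdf


def cost_aware_score(mean, std, best, trust, cost):
    """Hypothetical acquisition: improvement, weighted by trust, per unit cost."""
    return trust * expected_improvement(mean, std, best) / cost


# Same surrogate prediction, evaluated on two sources (made-up values).
prediction = dict(mean=1.0, std=1.0, best=1.0)
accurate_twin = cost_aware_score(**prediction, trust=0.5, cost=1.0)
poor_twin = cost_aware_score(**prediction, trust=0.1, cost=1.0)
hardware = cost_aware_score(**prediction, trust=1.0, cost=5.0)
```

With these numbers an accurate twin outscores the hardware experiment, while a poorly trusted twin does not, which mirrors the behaviour the abstract describes: accurate simulations are sampled more often and inaccurate ones are downweighted.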


SATA-BENCH: Select All That Apply Benchmark for Multiple Choice Questions

arXiv.org Artificial Intelligence

Large language models (LLMs) are increasingly evaluated on single-answer multiple-choice tasks, yet many real-world problems require identifying all correct answers from a set of options. This capability remains underexplored. We introduce SATA-BENCH, the first dedicated benchmark for evaluating LLMs on Select All That Apply (SATA) questions across diverse domains, including reading comprehension, law, and biomedicine. Our evaluation of 27 open-source and proprietary models reveals a significant gap: even the strongest model achieves only 41.8% exact match, exposing LLMs' inability to reliably identify all correct answers. We find that this weakness stems from two core challenges: selection bias - models favor certain choices regardless of content, and count bias - models fail to predict the correct number of answers. To address these issues, we propose Choice Funnel, a decoding strategy that combines token debiasing with adaptive thresholding to guide models toward complete and accurate selections. Choice Funnel achieves up to 29% higher exact match than competitive baselines while reducing inference cost by over 64%. Our findings expose fundamental limitations in current LLMs and introduce a new framework for diagnosing and improving multi-answer reasoning. We release SATA-BENCH and Choice Funnel to promote LLM development for robust decision-making in realistic, multi-answer applications.
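Exact match is a strict metric for select-all-that-apply questions: a prediction scores only if it contains every correct option and nothing else. The sketch below contrasts it with Jaccard overlap, which gives partial credit; the helper names and the example answers are hypothetical, and Jaccard is included only for comparison, not as a metric the abstract reports.

```python
def exact_match(predicted, gold):
    """True only when the predicted option set equals the gold set."""
    return set(predicted) == set(gold)


def jaccard(predicted, gold):
    """Overlap between predicted and gold sets, giving partial credit."""
    predicted, gold = set(predicted), set(gold)
    if not predicted and not gold:
        return 1.0
    return len(predicted & gold) / len(predicted | gold)


gold = {"A", "C", "D"}
too_few = {"A", "C"}         # correct picks, but one answer missing (count error)
too_many = {"A", "B", "C", "D"}

for guess in (too_few, too_many, gold):
    print(sorted(guess), exact_match(guess, gold), round(jaccard(guess, gold), 2))
```

Both imperfect predictions fail exact match even though most of their selections are correct, which is why under-selecting or over-selecting options is so costly under this metric.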


A Multi-Threading Kernel for Enabling Neuromorphic Edge Applications

arXiv.org Artificial Intelligence

Spiking Neural Networks (SNNs) offer sparse, event-driven processing that neuromorphic applications can leverage. In this work, we introduce a multi-threading kernel that enables neuromorphic applications running at the edge, meaning they process sensory input directly and without any up-link to or dependency on a cloud service. The kernel shows speed-up gains over single-thread processing by a factor of four on moderately sized SNNs and 1.7X on a Synfire network. Furthermore, it load-balances all cores available on multi-core processors, such as ARM, which run today's mobile devices, and is up to 70% more energy efficient than static core assignment. The present work can enable the development of edge applications that have low Size, Weight, and Power (SWaP), and can prototype the integration of neuromorphic chips.