Bootstrapped Model Predictive Control
Wang, Yuhang, Guo, Hanwei, Wang, Sizhe, Qian, Long, Lan, Xuguang
Model Predictive Control (MPC) has been demonstrated to be effective in continuous control tasks. When a world model and a value function are available, planning a sequence of actions ahead of time leads to a better policy. Existing methods typically obtain the value function and the corresponding policy in a model-free manner. However, we find that such an approach struggles with complex tasks, resulting in poor policy learning and inaccurate value estimation. To address this problem, we leverage the strengths of MPC itself. In this work, we introduce Bootstrapped Model Predictive Control (BMPC), a novel algorithm that performs policy learning in a bootstrapped manner. BMPC learns a network policy by imitating an MPC expert, and in turn, uses this policy to guide the MPC process. Combined with model-based TD-learning, our policy learning yields better value estimation and further boosts the efficiency of MPC. We also introduce a lazy reanalyze mechanism, which enables computationally efficient imitation learning. Our method achieves superior performance over prior works on diverse continuous control tasks. In particular, on challenging high-dimensional locomotion tasks, BMPC significantly improves data efficiency while also enhancing asymptotic performance and training stability, with comparable training time and smaller network sizes. Code is available at https://github.com/wertyuilife2/bmpc.
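The bootstrapped interplay between the network policy and the MPC expert can be illustrated with a toy sketch (not the authors' implementation; the one-dimensional dynamics, random-shooting planner, and linear policy are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def world_model(s, a):
    # toy known dynamics: a 1-D point mass; reward penalizes distance to origin
    s_next = s + 0.1 * a
    return s_next, -s_next ** 2

def mpc_expert(s, policy_gain, horizon=5, n_samples=64, noise=0.5):
    # random-shooting MPC: sample action sequences around the network
    # policy's proposal (the policy "guides the MPC process") and
    # return the first action of the best-scoring sequence
    best_first, best_ret = 0.0, -np.inf
    for _ in range(n_samples):
        si, ret, first = s, 0.0, None
        for t in range(horizon):
            a = policy_gain * si + noise * rng.standard_normal()
            if t == 0:
                first = a
            si, r = world_model(si, a)
            ret += r
        if ret > best_ret:
            best_ret, best_first = ret, first
    return best_first

# bootstrapped loop: the policy imitates the MPC expert, and the
# (improved) policy in turn guides the next round of planning
policy_gain = 0.0
for _ in range(20):
    states = rng.uniform(-1.0, 1.0, size=32)
    actions = np.array([mpc_expert(s, policy_gain) for s in states])
    # behavior cloning for a linear policy a = k * s: least-squares fit
    policy_gain = float(states @ actions / (states @ states))
```

After a few rounds the imitated policy approaches a stabilizing (negative) gain, which in turn concentrates the planner's samples near good actions.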
Optimal Invariant Bases for Atomistic Machine Learning
Allen, Alice E. A., Shinkle, Emily, Bujack, Roxana, Lubbers, Nicholas
The representation of atomic configurations for machine learning models has led to the development of numerous descriptors, often to describe the local environment of atoms. However, many of these representations are incomplete and/or functionally dependent. Incomplete descriptor sets are unable to represent all meaningful changes in the atomic environment. Complete constructions of atomic environment descriptors, on the other hand, often suffer from a high degree of functional dependence, where some descriptors can be written as functions of the others. These redundant descriptors do not provide additional power to discriminate between different atomic environments and increase the computational burden. By applying techniques from the pattern recognition literature to existing atomistic representations, we remove descriptors that are functions of other descriptors to produce the smallest possible set that satisfies completeness. We apply this in two ways: first, we refine an existing description, the Atomistic Cluster Expansion, and show that this yields a more efficient subset of descriptors. Second, we augment an incomplete construction based on a scalar neural network, yielding a new message-passing network architecture that can recognize up to 5-body patterns in each neuron by taking advantage of an optimal set of Cartesian tensor invariants. This architecture shows strong accuracy on state-of-the-art benchmarks while retaining low computational cost. Our results not only yield improved models, but point the way to classes of invariant bases that minimize cost while maximizing expressivity for a host of applications.
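As a hedged illustration of the general idea, rather than the paper's actual procedure, the sketch below removes functionally dependent descriptors by checking whether each descriptor's gradient stays inside the span of the gradients of descriptors already kept (the descriptor set, sample points, and tolerances are invented for the example):

```python
import numpy as np

rng = np.random.default_rng(1)

def descriptors(x):
    # hypothetical descriptor set over three inputs with a planted
    # redundancy: the third entry, d0 * d1, is a function of the others
    d0 = x[0] + x[1] + x[2]
    d1 = x[0] * x[1] + x[1] * x[2] + x[0] * x[2]
    d3 = x[0] * x[1] * x[2]
    return np.array([d0, d1, d0 * d1, d3])

def jacobian(f, x, eps=1e-5):
    # central-difference Jacobian of f at x
    J = np.zeros((len(f(x)), len(x)))
    for j in range(len(x)):
        xp, xm = x.copy(), x.copy()
        xp[j] += eps
        xm[j] -= eps
        J[:, j] = (f(xp) - f(xm)) / (2 * eps)
    return J

def independent_subset(f, dim, n_desc, n_points=10, tol=1e-4):
    # keep a descriptor only if, at some sample point, its gradient lies
    # outside the span of the gradients of descriptors already kept,
    # i.e. it is not (locally) a function of them
    Js = [jacobian(f, p) for p in rng.standard_normal((n_points, dim))]
    keep = []
    for i in range(n_desc):
        for J in Js:
            rows = np.vstack([J[keep], J[i:i + 1]])
            if np.linalg.matrix_rank(rows, tol) > len(keep):
                keep.append(i)
                break
    return keep

kept = independent_subset(descriptors, dim=3, n_desc=4)
# kept == [0, 1, 3]: the redundant d0 * d1 descriptor is eliminated
```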
Why risk matters for protein binder design
Cotet, Tudor-Stefan, Krawczuk, Igor
Bayesian optimization (BO) has recently become more prevalent in protein engineering applications and hence has become a fruitful target of benchmarks. However, current BO comparisons often overlook real-world considerations like risk and cost constraints. In this work, we compare 72 model combinations of encodings, surrogate models, and acquisition functions on 11 protein binder fitness landscapes specifically from this perspective. Drawing from the portfolio optimization literature, we adopt metrics to quantify cold-start performance relative to a random baseline, to assess the risk of an optimization campaign, and to calculate the overall budget required to reach a fitness threshold. Our results suggest the existence of Pareto-optimal models on the risk-performance axis, a shift in this trade-off depending on the landscape explored, and a robust correlation between landscape properties such as epistasis and both average and worst-case model performance. They also highlight that rigorous model selection requires substantial computational and statistical effort.
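A minimal sketch of the kind of portfolio-style metrics described above, computed on hypothetical campaign curves (the function names and the CVaR-style tail fraction are our own choices, not the paper's definitions):

```python
import numpy as np

def campaign_metrics(best_so_far, threshold, alpha=0.25):
    # best_so_far: (n_seeds, n_rounds) running-best fitness per campaign
    finals = np.sort(best_so_far[:, -1])
    # risk: mean of the worst alpha-fraction of final fitnesses, a
    # CVaR-style tail average borrowed from portfolio optimization
    k = max(1, int(alpha * len(finals)))
    worst_case = float(finals[:k].mean())
    # budget: first round at which the running best reaches the
    # threshold (np.inf if a seed never gets there), averaged over seeds
    hits = [float(np.argmax(row >= threshold)) if row[-1] >= threshold
            else np.inf for row in best_so_far]
    return float(finals.mean()), worst_case, float(np.mean(hits))

curves = np.array([[0.1, 0.5, 0.9],
                   [0.2, 0.3, 0.4],
                   [0.1, 0.8, 0.8],
                   [0.0, 0.2, 0.95]])
mean_f, worst, budget = campaign_metrics(curves, threshold=0.4)
# mean final fitness ≈ 0.76, worst-case 0.4, average budget 1.5 rounds
```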
Sustainable broadcasting in Blockchain Networks with Reinforcement Learning
Valko, Danila, Kudenko, Daniel
Recent estimates put the carbon footprint of Bitcoin and Ethereum at an average of 64 and 26 million tonnes of CO2 per year, respectively. To address this growing problem, several approaches have been proposed in the literature: creating alternative blockchain consensus mechanisms, applying redundancy reduction techniques, utilizing renewable energy sources, and employing energy-efficient devices. In this paper, we follow the second avenue and propose an efficient reinforcement-learning-based approach that improves the block broadcasting scheme in blockchain networks. Our analysis and experimental results confirm that the proposed improvement to the block propagation scheme handles network dynamics intelligently and achieves better results than the default approach. Additionally, our integration of the simulator with the developed RL environment can be used as a complete solution for further study of new schemes and protocols that use RL or other ML techniques.
Proper scoring rules for estimation and forecast evaluation
Waghmare, Kartik, Ziegel, Johanna
In recent years, proper scoring rules have emerged as a powerful general approach for estimating probability distributions. In addition to significantly expanding the range of modeling techniques that can be applied in practice, this has also substantially broadened the conceptual understanding of estimation methods. Originally, proper scoring rules were conceived in meteorology as summary statistics for describing the performance of probabilistic forecasts (Murphy and Winkler, 1984), but they also play an important role in economics as tools for belief elicitation (Schotter and Trevino, 2014). A probabilistic forecast is a probability distribution over the space of the possible outcomes of the future event that is stated by the forecaster. The simplest and most popular case of probabilistic forecasts arises when the outcome is binary, so the probabilistic forecast reduces to issuing a predictive probability of success. Brier (1950) was the first to consider the problem of devising a scoring rule which could not be "played" by a dishonest forecasting agent. He introduced the quadratic scoring rule and showed that it incentivizes a forecasting agent to state his most accurate probability estimate when faced with uncertainty.
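The propriety of Brier's quadratic score can be checked numerically: a forecaster whose true belief is q minimizes the expected score by reporting p = q.

```python
# Brier's quadratic score for a binary outcome y in {0, 1} given stated
# probability p: S(p, y) = (p - y)^2, lower is better
def brier(p, y):
    return (p - y) ** 2

def expected_score(p, q):
    # expected score when the forecaster's true belief is q
    return q * brier(p, 1) + (1 - q) * brier(p, 0)

q = 0.7  # the forecaster's honest belief
best_p = min((p / 100 for p in range(101)), key=lambda p: expected_score(p, q))
# best_p == 0.7: honesty minimizes the expected score, so the rule is proper
```

Algebraically, the expected score is (p - q)^2 + q(1 - q), which makes the unique minimum at p = q immediate.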
A Unified Approach to Analysis and Design of Denoising Markov Models
Ren, Yinuo, Rotskoff, Grant M., Ying, Lexing
Probabilistic generative models based on measure transport, such as diffusion and flow-based models, are often formulated in the language of Markovian stochastic dynamics, where the choice of the underlying process impacts both algorithmic design choices and theoretical analysis. In this paper, we aim to establish a rigorous mathematical foundation for denoising Markov models, a broad class of generative models that postulate a forward process transitioning from the target distribution to a simple, easy-to-sample distribution, alongside a backward process particularly constructed to enable efficient sampling in the reverse direction. Leveraging deep connections with nonequilibrium statistical mechanics and generalized Doob's $h$-transform, we propose a minimal set of assumptions that ensure: (1) explicit construction of the backward generator, (2) a unified variational objective directly minimizing the measure transport discrepancy, and (3) adaptations of the classical score-matching approach across diverse dynamics. Our framework unifies existing formulations of continuous and discrete diffusion models, identifies the most general form of denoising Markov models under certain regularity assumptions on forward generators, and provides a systematic recipe for designing denoising Markov models driven by arbitrary L\'evy-type processes. We illustrate the versatility and practical effectiveness of our approach through novel denoising Markov models employing geometric Brownian motion and jump processes as forward dynamics, highlighting the framework's potential flexibility and capability in modeling complex distributions.
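In the familiar continuous-diffusion special case, the backward process is the classical reverse-time SDE, with the score supplying the generalized Doob h-transform-type correction to the forward drift (a standard result, shown here only to anchor the notation):

```latex
% Forward SDE and its reverse-time counterpart (Anderson, 1982)
\begin{align}
  \mathrm{d}X_t &= b(X_t, t)\,\mathrm{d}t + \sigma(t)\,\mathrm{d}W_t, \\
  \mathrm{d}\bar{X}_t &= \bigl[\, b(\bar{X}_t, t)
      - \sigma(t)^2\, \nabla \log p_t(\bar{X}_t) \,\bigr]\,\mathrm{d}t
      + \sigma(t)\,\mathrm{d}\bar{W}_t .
\end{align}
```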
Last Chance: 109 Best Amazon Spring Sale Deals for March 2025
Prime Day is months away. Black Friday is nearly a year off. Amazon has spied a gap in the calendar and plans to cram it full of deals. Amazon's Big Spring Sale kicked off on March 25 and ends today, March 31. With no other big sale events in view, this could be a good time to snag that mesh router, set of headphones, or robot vac you've had your eye on. As usual, Amazon has discounts on all sorts of stuff, but many deals are exclusive to Amazon Prime members. We're not suggesting you harvest this spring deal crop indiscriminately; we're here to help you sort the wheat from the chaff. The WIRED Gear team has run its many eyes over the list to tease out gadgets worth owning and actual deals. Everything we highlight here has been hand-tested by one of us and deemed worthy of a spot in your home. Updated March 31: We added a few fresh deals, including a portable power station, USB flash drive, and fitness tracker, removed expired deals, and checked the prices. The Eero Pro 6E (7/10, WIRED Recommends) mesh system is one of the easiest to set up and will deliver speedy, stable Wi-Fi across your home. Amazon's Eero makes some of our favorite mesh systems, ideal for busy families seeking a set-and-forget mesh. The Pro 6E is a tri-band system with a 6-GHz band for fast Wi-Fi at close range, and with the jump to Wi-Fi 7 systems still costly, this system is worth considering right now. But you need an Eero Plus subscription at $10 per month or $100 per year to unlock the best features, including parental controls, advanced security, and ad blocking. There are discounts on other Eero systems, so check our Eero buying guide to decide which is best for your home.
DJI's debut portable power station can put out 2,200 watts steadily (2,600 watts surge), has two USB-C PD 3.1 ports (140 watts), and boasts DJI's proprietary SDC ports for fast-charging drone batteries. It can juice up phones, run microwaves or small tools, and meet most of your portable power needs, but it's an especially great choice for folks with DJI drones because it can fast-charge most models. It gets a little noisy with several gadgets charging, and cable and bag accessories cost extra, but it still claims a place in our best portable power stations guide. EcoFlow's River 2 Pro is one of the best portable power stations for camping or road trips because it's a manageable size, with a LiFePO4 battery inside that's good for 768 watt-hours.
Entropy-guided sequence weighting for efficient exploration in RL-based LLM fine-tuning
We introduce Entropy-Guided Sequence Weighting (EGSW), a novel approach that enhances the exploration-exploitation tradeoff in Reinforcement Learning-based Large Language Model fine-tuning by dynamically assigning weights to generated outputs based on their advantage and entropy. EGSW integrates entropy regularization with advantage-based weighting to balance policy updates, enabling efficient exploration in high-dimensional state spaces. By employing temperature-scaled softmax weighting over sequences, EGSW prioritizes high-reward, high-uncertainty steps while maintaining training stability. Although originally developed to improve Group Relative Policy Optimization (GRPO) during large language model (LLM) fine-tuning, EGSW is generalizable to other reinforcement learning (RL) algorithms and can be implemented in both step-wise and trajectory-wise settings. Empirical evaluations demonstrate that EGSW enhances GRPO's reasoning ability, yielding improvements in sample efficiency. Future work will explore the application of EGSW to advanced RL methodologies.
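A minimal sketch of temperature-scaled softmax weighting over sequences; the exact way advantage and entropy are combined here is our own assumption, not the paper's formula:

```python
import math

def egsw_weights(advantages, entropies, alpha=1.0, temperature=1.0):
    # hypothetical combination rule: add an entropy bonus to each
    # sequence's advantage, then take a temperature-scaled softmax so
    # high-reward, high-uncertainty sequences get larger update weights
    scores = [(a + alpha * h) / temperature
              for a, h in zip(advantages, entropies)]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

w = egsw_weights([1.0, 1.0, -0.5], [0.2, 0.9, 0.9])
# with equal advantage, the higher-entropy second sequence is weighted highest
```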
Move fast, kill things: the tech startups trying to reinvent defence with Silicon Valley values
Visit tech startup Skydio's headquarters on the San Francisco peninsula in California and you're likely to find flying robots buzzing on the roof overhead. Docking stations with motorised covers open to allow small drones that resemble the TIE fighters from the Star Wars films to take off; the covers close again when each drone lands. The drones can fly completely autonomously and without GPS, taking in data from onboard cameras and using AI to execute programmed missions and avoid obstacles. Skydio, with more than $740m in venture capital funding and a valuation of about $2.5bn, makes drones for the military along with civilian organisations such as police forces and utility companies. The company moved away from the consumer market in 2020 and is now the largest US drone maker.
Concorde: Fast and Accurate CPU Performance Modeling with Compositional Analytical-ML Fusion
Nasr-Esfahany, Arash, Alizadeh, Mohammad, Lee, Victor, Alam, Hanna, Coon, Brett W., Culler, David, Dadu, Vidushi, Dixon, Martin, Levy, Henry M., Pandey, Santosh, Ranganathan, Parthasarathy, Yazdanbakhsh, Amir
Cycle-level simulators such as gem5 are widely used in microarchitecture design, but they are prohibitively slow for large-scale design space explorations. We present Concorde, a new methodology for learning fast and accurate performance models of microarchitectures. Unlike existing simulators and learning approaches that emulate each instruction, Concorde predicts the behavior of a program based on compact performance distributions that capture the impact of different microarchitectural components. It derives these performance distributions using simple analytical models that estimate bounds on performance induced by each microarchitectural component, providing a simple yet rich representation of a program's performance characteristics across a large space of microarchitectural parameters. Experiments show that Concorde is more than five orders of magnitude faster than a reference cycle-level simulator, with about 2% average Cycles-Per-Instruction (CPI) prediction error across a range of SPEC, open-source, and proprietary benchmarks. This enables rapid design-space exploration and performance sensitivity analyses that are currently infeasible, e.g., in about an hour, we conducted a first-of-its-kind fine-grained performance attribution to different microarchitectural components across a diverse set of programs, requiring nearly 150 million CPI evaluations.
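A toy sketch of the compositional idea, with invented per-component bounds and a linear stand-in for the learned fusion model (Concorde's actual analytical models and ML architecture are far more elaborate):

```python
def analytical_bounds(width, mem_frac, mem_lat, mlp):
    # per-component analytical bounds on CPI (simplified, invented forms):
    # issue-width bound: at best `width` instructions retire per cycle
    issue_bound = 1.0 / width
    # memory bound: fraction of memory instructions times average miss
    # latency, amortized over memory-level parallelism (mlp)
    memory_bound = mem_frac * mem_lat / mlp
    return issue_bound, memory_bound

def predict_cpi(features, weights, bias):
    # stand-in for the learned fusion model: a simple linear map from
    # the compact analytical features to a CPI prediction
    return bias + sum(w * f for w, f in zip(weights, features))

bounds = analytical_bounds(width=4, mem_frac=0.3, mem_lat=100.0, mlp=6.0)
cpi = predict_cpi(bounds, weights=(1.0, 0.8), bias=0.1)
```

The point of the composition is that the ML model never sees individual instructions, only cheap per-component summaries, which is what makes millions of CPI evaluations tractable.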