 Energy


Using Non-Expert Data to Robustify Imitation Learning via Offline Reinforcement Learning

arXiv.org Artificial Intelligence

Imitation learning has proven effective for training robots to perform complex tasks from expert human demonstrations. However, it remains limited by its reliance on high-quality, task-specific data, restricting adaptability to the diverse range of real-world object configurations and scenarios. In contrast, non-expert data -- such as play data, suboptimal demonstrations, partial task completions, or rollouts from suboptimal policies -- can offer broader coverage and lower collection costs. However, conventional imitation learning approaches fail to utilize this data effectively. To address these challenges, we posit that with the right design decisions, offline reinforcement learning can be used as a tool to harness non-expert data to enhance the performance of imitation learning policies. We show that while standard offline RL approaches can be ineffective at actually leveraging non-expert data under the sparse data coverage settings typically encountered in the real world, simple algorithmic modifications can allow for the utilization of this data, without significant additional assumptions. Our approach shows that broadening the support of the policy distribution can allow imitation algorithms augmented by offline RL to solve tasks robustly, showing considerably enhanced recovery and generalization behavior. In manipulation tasks, these innovations significantly increase the range of initial conditions where learned policies are successful when non-expert data is incorporated. Moreover, we show that these methods are able to leverage all collected data, including partial or suboptimal demonstrations, to bolster task-directed policy performance. This underscores the importance of algorithmic techniques for using non-expert data for robust policy learning in robotics. Website: https://uwrobotlearning.github.io/RISE-offline/


BUILDA: A Thermal Building Data Generation Framework for Transfer Learning

arXiv.org Artificial Intelligence

Transfer learning (TL) can improve data-driven modeling of building thermal dynamics. Therefore, many new TL research areas are emerging in the field, such as selecting the right source model for TL. However, these research directions require massive amounts of thermal building data, which are currently lacking. Neither public datasets nor existing data generators meet the needs of TL research in terms of data quality and quantity. Moreover, existing data generation approaches typically require expert knowledge in building simulation. We present BuilDa, a thermal building data generation framework for producing synthetic data of adequate quality and quantity for TL research. The framework does not require profound building simulation knowledge to generate large volumes of data. BuilDa uses a single-zone Modelica model that is exported as a Functional Mock-up Unit (FMU) and simulated in Python. We demonstrate BuilDa by generating data and utilizing it for pretraining and fine-tuning TL models.
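To give a rough feel for what generating synthetic single-zone thermal data involves, here is a minimal sketch. It is not BuilDa's Modelica model or its FMU workflow: it assumes a much simpler 1R1C (one resistance, one capacitance) zone model integrated with explicit Euler steps, and the function names, parameter ranges, and weather and heating profiles are all hypothetical.

```python
import math
import random


def simulate_zone(t_out, q_heat, r=0.005, c=2.0e7, t0=20.0, dt=3600.0):
    """Explicit-Euler simulation of an assumed 1R1C single-zone model.

    C * dT/dt = (T_out - T) / R + Q_heat
    r: thermal resistance in K/W, c: thermal capacitance in J/K,
    t0: initial indoor temperature in deg C, dt: step size in seconds.
    Returns the indoor temperature trajectory (one more entry than inputs).
    """
    temps = [t0]
    for outdoor, heat in zip(t_out, q_heat):
        t = temps[-1]
        temps.append(t + dt / c * ((outdoor - t) / r + heat))
    return temps


def generate_dataset(n_buildings, n_steps, seed=0):
    """Sample hypothetical building parameters and simulate each building."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_buildings):
        r = rng.uniform(0.002, 0.01)   # assumed range, K/W
        c = rng.uniform(1.0e7, 5.0e7)  # assumed range, J/K
        # Synthetic daily outdoor temperature cycle (hourly steps).
        t_out = [5.0 + 5.0 * math.sin(2.0 * math.pi * k / 24.0)
                 for k in range(n_steps)]
        # Random on/off heating schedule with a 2 kW heater.
        q_heat = [rng.choice([0.0, 2000.0]) for _ in range(n_steps)]
        data.append({"r": r, "c": c,
                     "t_in": simulate_zone(t_out, q_heat, r, c)})
    return data
```

Varying `r` and `c` across simulated buildings is what would make such data usable as source and target domains in a transfer-learning experiment; a real generator would also vary weather, occupancy, and control.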


Cultivating Pluralism In Algorithmic Monoculture: The Community Alignment Dataset

arXiv.org Artificial Intelligence

How can large language models (LLMs) serve users with varying preferences that may conflict across cultural, political, or other dimensions? To address this challenge, this paper establishes four key results. First, we demonstrate, through a large-scale multilingual human study with representative samples from five countries (N=15,000), that humans exhibit significantly more variation in preferences than the responses of 21 state-of-the-art LLMs. Second, we show that existing methods for preference dataset collection are insufficient for learning the diversity of human preferences even along two of the most salient dimensions of variability in global values, due to the underlying homogeneity of candidate responses. Third, we argue that this motivates the need for negatively-correlated sampling when generating candidate sets, and we show that simple prompt-based techniques for doing so significantly enhance the performance of alignment methods in learning heterogeneous preferences. Fourth, based on this novel candidate sampling approach, we collect and open-source Community Alignment, the largest and most representative multilingual and multi-turn preference dataset to date, featuring almost 200,000 comparisons from annotators spanning five countries. We hope that the Community Alignment dataset will be a valuable resource for improving the effectiveness of LLMs for a diverse global population.


US Dept of Energy partners with AMD to build two supercomputers: Report

Al Jazeera

The United States has formed a $1bn partnership with Advanced Micro Devices (AMD) to construct two supercomputers that will tackle large scientific problems ranging from nuclear power to cancer treatments to national security. The Reuters news agency first reported the new partnership, citing Energy Secretary Chris Wright and AMD CEO Lisa Su. The machines can accelerate the process of making scientific discoveries in areas the US is focused on. Energy Secretary Wright said the systems would "supercharge" advances in nuclear power and fusion energy, technologies for defence and national security, and the development of drugs. Scientists and companies are trying to replicate fusion, the reaction that fuels the sun, by jamming light atoms in a plasma gas under intense heat and pressure to release massive amounts of energy.


Baseus Security S2 Outdoor Camera 4K review: It sees the light

PCWorld

A solar panel on a security cam is nothing new, but the panel on this one tracks the sun, rotating to gain maximum exposure. If you can mount it where it can harvest a steady supply of sunlight, the Baseus Security S2 Outdoor Camera 4K's tracking solar panel makes it one of the few outdoor cameras that can run truly unattended, capturing crisp 4K-resolution video as a bonus. For many households, outdoor cameras are the front line of home security. The devices watch over driveways, porches, and backyards, catching activity that doorbell cameras often miss.


The great wildebeest migration, seen from space: satellites and AI are helping count Africa's wildlife

AIHub

The Great Wildebeest Migration is one of the most remarkable natural spectacles on Earth. Each year, immense herds of wildebeest, joined by zebras and gazelles, travel 800-1,000km between Tanzania and Kenya in search of fresh grazing after the rains. This vast, circular journey is the engine of the Serengeti-Mara ecosystem. The migration feeds predators such as lions and crocodiles, fertilises the land and sustains the grasslands. Countless other species, and human livelihoods tied to rangelands and tourism, depend on it.


Inside the Data Centers That Train A.I. and Drain the Electrical Grid

The New Yorker

A data center, which can use as much electricity as Philadelphia, is the new American factory, creating the future and propping up the economy. "I do guess that a lot of the world gets covered in data centers," Sam Altman, the C.E.O. of OpenAI, has said. Drive in almost any direction from almost any American city, and soon enough you'll arrive at a data center--a giant white box rising from graded earth, flanked by generators and fenced like a prison yard. Data centers for artificial intelligence are the new American factory. Packed with computing equipment, they absorb information and emit A.I. Since the launch of ChatGPT, in 2022, they have begun to multiply at an astonishing rate. "I do guess that a lot of the world gets covered in data centers over time," Sam Altman, the C.E.O. of OpenAI, recently said. The leading independent operator of A.I. data centers in the United States is CoreWeave, which was founded eight years ago, as a casual experiment. In 2017, traders at a middling New York hedge fund decided to begin mining cryptocurrency, which they used as the entry fee for their fantasy-football league. To mine the crypto, they bought a graphics-processing unit, a powerful microchip made by the company Nvidia. The G.P.U. was marketed to video gamers, but Nvidia offered software that turned it into a low-budget supercomputer. "It was so successful, from a return-of-capital perspective, that we started scaling it," Brian Venturo, one of CoreWeave's co-founders, told me. "If you make your money back in, like, five days, you want to do that a lot." Within a year, the traders had quit the hedge-fund business and bought several thousand G.P.U.s, which they ran from Venturo's grandfather's garage, in New Jersey.


In Russia's 'blitz' of Ukraine, the question of appeasement is back

BBC News

Following another week of intensive and lethal Russian bombardment of Ukraine's cities, a composite image has been doing the rounds on Ukrainian social media. Underneath an old, black-and-white photo of Londoners queuing at a fruit and vegetable stall surrounded by the bombed-out rubble of the Blitz, a second image - this time in colour - creates a striking juxtaposition. Taken on Saturday, it shows shoppers thronging to similar stalls in a northern suburb of the Ukrainian capital, Kyiv, while a column of black smoke rises ominously in the background. Bombs can't stop markets, reads the caption linking the two images. The night before, as the city's sleep was interrupted once again by the now all-too-familiar booms of missile and drone strikes, two people were killed and nine others injured.


Cost Minimization for Space-Air-Ground Integrated Multi-Access Edge Computing Systems

arXiv.org Artificial Intelligence

Space-air-ground integrated multi-access edge computing (SAGIN-MEC) provides a promising solution for the rapidly developing low-altitude economy (LAE) to deliver flexible and wide-area computing services. However, fully realizing the potential of SAGIN-MEC in the LAE presents significant challenges, including coordinating decisions across heterogeneous nodes with different roles, modeling complex factors such as mobility and network variability, and handling real-time decision-making in a partially observable environment with hybrid variables. To address these challenges, we first present a hierarchical SAGIN-MEC architecture that enables coordination between user devices (UDs), uncrewed aerial vehicles (UAVs), and satellites. Then, we formulate a UD cost minimization optimization problem (UCMOP) to minimize the UD cost by jointly optimizing the task offloading ratio, UAV trajectory planning, computing resource allocation, and UD association. We show that the UCMOP is an NP-hard problem. To overcome this challenge, we propose a multi-agent deep deterministic policy gradient (MADDPG)-convex optimization and coalitional game (MADDPG-COCG) algorithm. Specifically, we employ the MADDPG algorithm to optimize the continuous temporal decisions for heterogeneous nodes in the partially observable SAGIN-MEC system. Moreover, we propose a convex optimization and coalitional game (COCG) method to enhance the conventional MADDPG by deterministically handling the hybrid and varying-dimensional decisions. Simulation results demonstrate that the proposed MADDPG-COCG algorithm significantly enhances user-centric performance in terms of the aggregated UD cost, task completion delay, and UD energy consumption, with a slight increase in UAV energy consumption, compared to the benchmark algorithms. Moreover, the MADDPG-COCG algorithm shows superior convergence stability and scalability.


Multivariate Latent Recalibration for Conditional Normalizing Flows

arXiv.org Artificial Intelligence

Reliably characterizing the full conditional distribution of a multivariate response variable given a set of covariates is crucial for trustworthy decision-making. However, misspecified or miscalibrated multivariate models may yield a poor approximation of the joint distribution of the response variables, leading to unreliable predictions and suboptimal decisions. Furthermore, standard recalibration methods are primarily limited to univariate settings, while conformal prediction techniques, despite generating multivariate prediction regions with coverage guarantees, do not provide a full probability density function. We address this gap by first introducing a novel notion of latent calibration, which assesses probabilistic calibration in the latent space of a conditional normalizing flow. Second, we propose latent recalibration (LR), a novel post-hoc model recalibration method that learns a transformation of the latent space with finite-sample bounds on latent calibration. Unlike existing methods, LR produces a recalibrated distribution with an explicit multivariate density function while remaining computationally efficient. Extensive experiments on both tabular and image datasets show that LR consistently improves latent calibration error and the negative log-likelihood of the recalibrated models.
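The abstract does not spell out how latent calibration is measured, so the following is only a sketch of one plausible diagnostic, under stated assumptions. If a conditional normalizing flow is well specified, the latent codes of observed responses should follow a standard normal distribution, so the squared latent norm should follow a chi-square distribution and its probability integral transform (PIT) should be uniform. The example assumes a 2-dimensional latent space (where the chi-square CDF has a closed form) and uses sampled Gaussians in place of a real flow; all function names are hypothetical.

```python
import math
import random


def latent_pit(z):
    """PIT of the squared norm of a 2-D latent code under N(0, I).

    For d = 2, ||z||^2 is chi-square with 2 degrees of freedom,
    whose CDF has the closed form 1 - exp(-s / 2).
    """
    s = z[0] ** 2 + z[1] ** 2
    return 1.0 - math.exp(-s / 2.0)


def calibration_error(pits, n_grid=99):
    """Mean absolute gap between the empirical CDF of the PITs
    and the uniform CDF, evaluated on an evenly spaced grid."""
    n = len(pits)
    total = 0.0
    for k in range(1, n_grid + 1):
        alpha = k / (n_grid + 1)
        empirical = sum(p <= alpha for p in pits) / n
        total += abs(empirical - alpha)
    return total / n_grid


def sample_latents(n, scale, seed):
    """Stand-in for flow latents: scale = 1 mimics a calibrated model,
    scale != 1 mimics a model whose latents are over- or under-dispersed."""
    rng = random.Random(seed)
    return [(scale * rng.gauss(0.0, 1.0), scale * rng.gauss(0.0, 1.0))
            for _ in range(n)]
```

With these helpers, latents drawn at `scale=1.0` should give an error near zero, while latents drawn at `scale=2.0` should give a clearly larger one. A post-hoc recalibration step would then learn a transformation of the latent space that pulls the PIT distribution back toward uniform.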