 vpp



VPP: Efficient Conditional 3D Generation via Voxel-Point Progressive Representation

Neural Information Processing Systems

Conditional 3D generation is advancing rapidly, enabling the free creation of 3D content from inputs such as text or 2D images. However, previous approaches have suffered from low inference efficiency, limited generation categories, and restricted downstream applications. In this work, we revisit the impact of different 3D representations on generation quality and efficiency. We propose a progressive generation method through Voxel-Point Progressive Representation (VPP). VPP leverages structured voxel representation in the proposed Voxel Semantic Generator and the sparsity of unstructured point representation in the Point Upsampler, enabling efficient generation of multi-category objects. VPP can generate high-quality 8K point clouds within 0.2 seconds. Additionally, the masked generation Transformer allows for various 3D downstream tasks, such as generation, editing, completion, and pre-training. Extensive experiments demonstrate that VPP efficiently generates high-fidelity and diverse 3D shapes across different categories, while also exhibiting excellent representation transfer performance. Code will be released at https://github.com/qizekun/VPP.
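
To make the contrast between the two representations concrete, the following is a minimal sketch of plain voxelization, not the VPP method itself. It assumes points lie in the unit cube, and the helper names `voxelize` and `voxel_centers` are hypothetical. A structured occupancy grid is built from an unstructured point set, and a coarse point cloud is read back from the grid.

```python
from typing import Iterable

Point = tuple[float, float, float]
Voxel = tuple[int, int, int]


def voxelize(points: Iterable[Point], resolution: int) -> set[Voxel]:
    """Map points in the unit cube [0, 1]^3 to occupied voxel indices."""
    occupied: set[Voxel] = set()
    for x, y, z in points:
        # Clamp so that a coordinate of exactly 1.0 falls in the last cell.
        idx = tuple(min(int(c * resolution), resolution - 1) for c in (x, y, z))
        occupied.add(idx)
    return occupied


def voxel_centers(voxels: Iterable[Voxel], resolution: int) -> list[Point]:
    """Return the center of each occupied voxel as a coarse point cloud."""
    size = 1.0 / resolution
    return [tuple((i + 0.5) * size for i in v) for v in sorted(voxels)]


# Two nearby points share one voxel; the third lands in the opposite corner.
points = [(0.10, 0.20, 0.30), (0.12, 0.22, 0.31), (0.90, 0.90, 0.90)]
grid = voxelize(points, resolution=4)
coarse = voxel_centers(grid, resolution=4)
```

The grid is regular and cheap to index, while the point list stores only occupied locations, which is one reading of why a coarse-to-fine pipeline might start with voxels and finish with points.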



Visual Position Prompt for MLLM based Visual Grounding

arXiv.org Artificial Intelligence

Although Multimodal Large Language Models (MLLMs) excel at various image-related tasks, they encounter challenges in precisely aligning coordinates with spatial information within images, particularly in position-aware tasks such as visual grounding. This limitation arises from two key factors. First, MLLMs lack explicit spatial references, making it difficult to associate textual descriptions with precise image locations. Second, their feature extraction processes prioritize global context over fine-grained spatial details, leading to weak localization capability. To address this issue, we introduce VPP-LLaVA, an MLLM equipped with Visual Position Prompt (VPP) to improve its grounding capability. VPP-LLaVA integrates two complementary mechanisms. The global VPP overlays learnable, axis-like embeddings onto the input image to provide structured spatial cues. The local VPP focuses on fine-grained localization by incorporating position-aware queries, which suggest probable object locations. We also introduce a VPP-SFT dataset with 0.6M samples, consolidating high-quality visual grounding data into a compact format for efficient model training. Training on this dataset with VPP enhances the model's performance, achieving state-of-the-art results on standard grounding benchmarks despite using fewer training samples than other MLLMs such as MiniGPT-v2, which rely on much larger datasets ($\sim$21M samples). The code and VPP-SFT dataset will be available at https://github.com/WayneTomas/VPP-LLaVA upon acceptance.


The Download: Google's Gemini plans, and virtual power plants

MIT Technology Review

The news: In the biggest mass-market AI launch yet, Google is rolling out Gemini, its family of large language models, across almost all its products, from Android to the iOS Google app to Gmail to Docs and more. A new subscription plan will also give users access to Gemini Ultra, the most powerful version of the model, for the first time. Why it matters: ChatGPT, released by Microsoft-backed OpenAI just 14 months ago, changed people's expectations of what computers could do. Google has been racing to catch up ever since and unveiled its Gemini family of models in December. By baking Gemini into its ubiquitous tools, it will be hoping to make up any lost ground, and even overtake its rival.


A Framework for Partially Observed Reward-States in RLHF

arXiv.org Artificial Intelligence

The study of reinforcement learning from human feedback (RLHF) has gained prominence in recent years due to its role in the development of LLMs. Neuroscience research shows that human responses to stimuli depend on partially observed "internal states." Unfortunately, current models of RLHF do not take this into consideration. Moreover, most RLHF models do not account for intermediate feedback, which is gaining importance in empirical work and can help improve both sample complexity and alignment. To address these limitations, we model RLHF as reinforcement learning with partially observed reward-states (PORRL). We show reductions from the two dominant forms of human feedback in RLHF, cardinal and dueling feedback, to PORRL. For cardinal feedback, we develop generic statistically efficient algorithms and instantiate them to present POR-UCRL and POR-UCBVI. For dueling feedback, we show that a naive reduction to cardinal feedback fails to achieve sublinear dueling regret. We then present the first explicit reduction that converts guarantees for cardinal regret to dueling regret. We show that our models and guarantees in both settings generalize and extend existing ones. Finally, we identify a recursive structure on our model that could improve the statistical and computational tractability of PORRL, giving examples from past work on RLHF as well as learning perfect reward machines, which PORRL subsumes.
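
As a rough illustration of the two feedback types named in this abstract, the sketch below contrasts a noisy scalar score (cardinal) with a binary preference between two options (dueling). The logistic (Bradley-Terry) link and all function names are assumptions made for illustration; they are not taken from the paper, and the partially observed internal states are not modeled here.

```python
import math
import random


def cardinal_feedback(reward: float, noise_std: float, rng: random.Random) -> float:
    """Cardinal feedback: a noisy scalar score for a single trajectory."""
    return reward + rng.gauss(0.0, noise_std)


def dueling_probability(reward_a: float, reward_b: float) -> float:
    """Probability that A is preferred over B under an assumed logistic link."""
    return 1.0 / (1.0 + math.exp(-(reward_a - reward_b)))


def dueling_feedback(reward_a: float, reward_b: float, rng: random.Random) -> int:
    """Dueling feedback: 1 if A is preferred over B in this comparison, else 0."""
    return 1 if rng.random() < dueling_probability(reward_a, reward_b) else 0


rng = random.Random(0)
score = cardinal_feedback(reward=1.0, noise_std=0.1, rng=rng)
preference = dueling_feedback(reward_a=2.0, reward_b=0.0, rng=rng)
```

Under this assumed link, dueling feedback reveals only the reward difference between the two options, which gives some intuition for why converting between the two feedback types may need care.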


Machine Learning Infused Distributed Optimization for Coordinating Virtual Power Plant Assets

arXiv.org Artificial Intelligence

Amid the increasing interest in the deployment of Distributed Energy Resources (DERs), the Virtual Power Plant (VPP) has emerged as a pivotal tool for aggregating diverse DERs and facilitating their participation in wholesale energy markets. These VPP deployments have been fueled by the Federal Energy Regulatory Commission's Order 2222, which makes DERs and VPPs competitive across market segments. However, the diversity and decentralized nature of DERs present significant challenges to the scalable coordination of VPP assets. To address efficiency and speed bottlenecks, this paper presents a novel machine learning-assisted distributed optimization method to coordinate VPP assets. Our method, named LOOP-MAC (Learning to Optimize the Optimization Process for Multi-agent Coordination), adopts a multi-agent coordination perspective where each VPP agent manages multiple DERs and utilizes neural network approximators to expedite the solution search. The LOOP-MAC method employs a gauge map to guarantee strict compliance with local constraints, effectively reducing the need for additional post-processing steps. Our results highlight the advantages of LOOP-MAC, showcasing accelerated solution times per iteration and significantly reduced convergence times. The LOOP-MAC method outperforms conventional centralized and distributed optimization methods in optimization tasks that require repetitive and sequential execution.


Safe Reinforcement Learning for Strategic Bidding of Virtual Power Plants in Day-Ahead Markets

arXiv.org Artificial Intelligence

Growing environmental concerns and advancements in communication and monitoring technologies have led to the increased deployment of Distributed Energy Resources (DERs) in power networks [1], comprising renewable energy sources and prosumers. The market integration of these units is facilitated by their large-scale aggregation under financial entities, commonly known as Virtual Power Plants (VPPs), which have the capacity for trading in wholesale electricity markets [2], [3]. As a self-interested market participant, a VPP aims at maximizing its own profit generated by its market participation and the fulfillment of contractual obligations towards its [...] For this reason, their applicability in practice is limited. The above-mentioned scalability issues can be addressed by employing deep RL methods like the Deep Deterministic Policy Gradient (DDPG) algorithm [10], which utilizes neural networks to extend the Q-learning capabilities to continuous state and action spaces. The authors in [11]-[13] propose deep RL methods for the economic dispatch and market participation of DERs aggregated in a VPP. The main limitation of these works is that they fail to account for the complex internal physical constraints of large-scale VPPs, such as power generation limits and power flow constraints, in order to ensure a safe operation.


Aggregating Electric Cars to Sustainable Virtual Power Plants: The Value of Flexibility in Future Electricity Markets

AAAI Conferences

Electric vehicles will play a crucial role in balancing the future electrical grid, which is complicated by many intermittent renewable energy sources. We developed an algorithm that determines, for a fleet of electric vehicles, which EV to commit to the operating reserve market, and at what price and location, to either absorb excess capacity or provide electricity during shortages (vehicle-2-grid). The algorithm takes the value of immobility into account by using carsharing fees as a reference point. A virtual power plant autonomously replaces cars that are committed to the operating reserves and are then rented out with other idle cars, pooling the risks of uncertainty. We validate our model with data from a free-float carsharing fleet of 500 electric vehicles. An analysis of expected future developments (2015, 2018, and 2022) in operating reserve demand and battery costs shows that gross profits for a carsharing operator increase by 7-12% with a negligible decrease in car availability (<0.01%).
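
The idea of using carsharing fees as a reference point for the value of immobility can be sketched as a simple threshold rule. This is a hypothetical illustration, not the authors' algorithm: the `Vehicle` fields, the rental probabilities, and the single flat `reserve_payment` are all assumptions. A car is committed to the reserve market only when the reserve payment exceeds the rental revenue it would be expected to forgo.

```python
from dataclasses import dataclass


@dataclass
class Vehicle:
    vehicle_id: str
    rental_probability: float  # assumed chance the car would be rented in the interval
    rental_fee: float          # assumed carsharing fee earned if it is rented


def expected_rental_revenue(vehicle: Vehicle) -> float:
    """Expected carsharing revenue forgone if the car is held immobile."""
    return vehicle.rental_probability * vehicle.rental_fee


def vehicles_to_commit(fleet: list[Vehicle], reserve_payment: float) -> list[str]:
    """Commit a car only if the reserve payment beats its value of immobility."""
    return [
        v.vehicle_id
        for v in fleet
        if reserve_payment > expected_rental_revenue(v)
    ]


fleet = [
    Vehicle("ev-1", rental_probability=0.5, rental_fee=8.0),  # forgone revenue 4.0
    Vehicle("ev-2", rental_probability=0.5, rental_fee=2.0),  # forgone revenue 1.0
    Vehicle("ev-3", rental_probability=0.0, rental_fee=8.0),  # forgone revenue 0.0
]
committed = vehicles_to_commit(fleet, reserve_payment=3.0)
```

With a reserve payment of 3.0, only the two cars whose expected rental revenue is below that value are committed; the car likely to earn more from rentals stays available to customers.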