Plotting

 North America



UltraEdit: Instruction-based Fine-Grained Image Editing at Scale

Neural Information Processing Systems

Our key idea is to address the drawbacks in existing image editing datasets like InstructPix2Pix [10] and MagicBrush [71], and provide a systematic approach to producing massive and highquality image editing samples.


Mutual Information Estimation via Normalizing Flows Tolmachev A. D. Skoltech, Sirius

Neural Information Processing Systems

We propose a novel approach to the problem of mutual information (MI) estimation via introducing a family of estimators based on normalizing flows. The estimator maps original data to the target distribution, for which MI is easier to estimate. We additionally explore the target distributions with known closed-form expressions for MI. Theoretical guarantees are provided to demonstrate that our approach yields MI estimates for the original data. Experiments with high-dimensional data are conducted to highlight the practical advantages of the proposed method.


Meta-Reinforcement Learning with Universal Policy Adaptation: Provable Near-Optimality under All-task Optimum Comparator

Neural Information Processing Systems

Meta-reinforcement learning (Meta-RL) has attracted attention due to its capability to enhance reinforcement learning (RL) algorithms, in terms of data efficiency and generalizability. In this paper, we develop a bilevel optimization framework for meta-RL (BO-MRL) to learn the meta-prior for task-specific policy adaptation, which implements multiple-step policy optimization on one-time data collection. Beyond existing meta-RL analyses, we provide upper bounds of the expected optimality gap over the task distribution. This metric measures the distance of the policy adaptation from the learned meta-prior to the task-specific optimum, and quantifies the model's generalizability to the task distribution. We empirically validate the correctness of the derived upper bounds and demonstrate the superior effectiveness of the proposed algorithm over benchmarks.



BoNBoN Alignment for Large Language Models and the Sweetness of Best-of-n Sampling Lin Gui, Cristina Gârbacea 2, and Victor Veitch Department of Statistics, University of Chicago

Neural Information Processing Systems

This paper concerns the problem of aligning samples from large language models to human preferences using best-of-n sampling, where we draw n samples, rank them, and return the best one. We consider two fundamental problems. First: what is the relationship between best-of-n and approaches to alignment that train LLMs to output samples with a high expected reward (e.g., RLHF or DPO)? To answer this, we embed both the best-of-n distribution and the sampling distributions learned by alignment procedures in a common class of tiltings of the base LLM distribution. We then show that, within this class, best-of-n is essentially optimal in terms of the trade-off between win-rate against the base model vs KL distance from the base model.


Flatten Anything: Unsupervised Neural Surface Parameterization

Neural Information Processing Systems

Surface parameterization plays an essential role in numerous computer graphics and geometry processing applications. Traditional parameterization approaches are designed for high-quality meshes laboriously created by specialized 3D modelers, thus unable to meet the processing demand for the current explosion of ordinary 3D data. Moreover, their working mechanisms are typically restricted to certain simple topologies, thus relying on cumbersome manual efforts (e.g., surface cutting, part segmentation) for pre-processing. In this paper, we introduce the Flatten Anything Model (FAM), an unsupervised neural architecture to achieve global free-boundary surface parameterization via learning point-wise mappings between 3D points on the target geometric surface and adaptively-deformed UV coordinates within the 2D parameter domain. To mimic the actual physical procedures, we ingeniously construct geometrically-interpretable sub-networks with specific functionalities of surface cutting, UV deforming, unwrapping, and wrapping, which are assembled into a bi-directional cycle mapping framework.


The Surprising Effectiveness of SP Voting with Partial Preferences

Neural Information Processing Systems

We consider the problem of recovering the ground truth ordering (ranking, top-k, or others) over a large number of alternatives. The wisdom of crowd is a heuristic approach based on Condorcet's Jury theorem to address this problem through collective opinions. This approach fails to recover the ground truth when the majority of the crowd is misinformed. The surprisingly popular (SP) algorithm [36] is an alternative approach that is able to recover the ground truth even when experts are in minority. The SP algorithm requires the voters to predict other voters' report in the form of a full probability distribution over all rankings of alternatives.


A Walsh Hadamard Derived Linear Vector Symbolic Architecture

Neural Information Processing Systems

VSAs support the commutativity and associativity of this binding operation, along with an inverse operation, allowing one to construct symbolicstyle manipulations over real-valued vectors. Most VSAs were developed before deep learning and automatic differentiation became popular and instead focused on efficacy in hand-designed systems. In this work, we introduce the Hadamardderived linear Binding (HLB), which is designed to have favorable computational efficiency, and efficacy in classic VSA tasks, and perform well in differentiable systems.


Credit Attribution and Stable Compression Roi Livni Shay Moran

Neural Information Processing Systems

Credit attribution is crucial across various fields. In academic research, proper citation acknowledges prior work and establishes original contributions. Similarly, in generative models, such as those trained on existing artworks or music, it is important to ensure that any generated content influenced by these works appropriately credits the original creators. We study credit attribution by machine learning algorithms. We propose new definitions-relaxations of Differential Privacy-that weaken the stability guarantees for a designated subset of datapoints. These datapoints can be used non-stably with permission from their owners, potentially in exchange for compensation. Meanwhile, each of the remaining datapoints is guaranteed to have no significant influence on the algorithm's output. Our framework extends well-studied notions of stability, including Differential Privacy (= 0), differentially private learning with public data (where the public datapoints are fixed in advance), and stable sample compression (where the datapoints are selected adaptively by the algorithm). We examine the expressive power of these stability notions within the PAC learning framework, provide a comprehensive characterization of learnability for algorithms adhering to these principles, and propose directions and questions for future research.