
Unlocking Fairness: a Trade-off Revisited

Neural Information Processing Systems

The prevailing wisdom is that a model's fairness and its accuracy are in tension with one another. However, a pernicious modeling-evaluating dualism bedevils fair machine learning, in which phenomena such as label bias are appropriately acknowledged as a source of unfairness when designing fair models, only to be tacitly abandoned when evaluating them. We investigate fairness and accuracy, but this time under a variety of controlled conditions in which we vary the amount and type of bias. We find, under reasonable assumptions, that the tension between fairness and accuracy is illusory, and vanishes as soon as we account for these phenomena during evaluation. Moreover, our results are consistent with an opposing conclusion: fairness and accuracy are sometimes in accord. This raises the question: might there be a way to harness fairness to improve accuracy after all? Since many notions of fairness are defined with respect to the model's predictions rather than the ground-truth labels, this provides an opportunity to improve accuracy by enforcing appropriate notions of fairness over large quantities of unlabeled data with techniques like posterior regularization and generalized expectation. We find that semi-supervision improves both accuracy and fairness while imparting beneficial properties of the unlabeled data on the classifier.
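
To make the semi-supervised idea concrete, here is a minimal sketch of a fairness term over unlabeled data added to a supervised loss, in the spirit of the generalized-expectation / posterior-regularization techniques mentioned above. The model, tensors, weight `lam`, and the choice of demographic parity as the fairness notion are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def fairness_regularizer(model, x_unlab, group):
    """Penalize the squared gap in expected positive-prediction rate
    between two groups on unlabeled data (demographic parity style)."""
    p = torch.sigmoid(model(x_unlab)).squeeze(-1)
    gap = p[group == 0].mean() - p[group == 1].mean()
    return gap.pow(2)

def semi_supervised_loss(model, x_lab, y_lab, x_unlab, group, lam=1.0):
    """Supervised loss on labeled data plus a fairness expectation
    constraint enforced on plentiful unlabeled data."""
    bce = torch.nn.functional.binary_cross_entropy_with_logits(
        model(x_lab).squeeze(-1), y_lab.float())
    return bce + lam * fairness_regularizer(model, x_unlab, group)
```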


GACL: Exemplar-Free Generalized Analytic Continual Learning Yizhu Chen

Neural Information Processing Systems

Class incremental learning (CIL) trains a network on sequential tasks with separated categories in each task, but suffers from catastrophic forgetting: models quickly lose previously learned knowledge when acquiring new tasks. Generalized CIL (GCIL) aims to address the CIL problem in a more realistic scenario, where incoming data have mixed categories and unknown sample-size distributions. Existing attempts at GCIL either perform poorly or invade data privacy by saving exemplars. In this paper, we propose a new exemplar-free GCIL technique named generalized analytic continual learning (GACL). GACL adopts analytic learning (a gradient-free training technique) and delivers an analytical (i.e., closed-form) solution to the GCIL scenario. This solution is derived by decomposing the incoming data into exposed and unexposed classes, thereby attaining a weight-invariant property, a rare yet valuable property supporting an equivalence between incremental learning and joint training. Such an equivalence is crucial in GCIL settings, as data distributions among different tasks no longer pose challenges to adopting our GACL. Theoretically, this equivalence property is validated through matrix analysis tools. Empirically, we conduct extensive experiments where, compared with existing GCIL methods, our GACL exhibits consistently leading performance across various datasets and GCIL settings.
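
As a rough illustration of why a closed-form, gradient-free solution can be weight-invariant, consider incremental ridge regression: accumulating the Gram matrix and cross-correlation per task yields exactly the weights of joint training on all data seen so far. This is a minimal sketch under that classical setup; the class name and regularizer are assumptions, and GACL's actual derivation (with its exposed/unexposed class decomposition) is more involved.

```python
import numpy as np

class AnalyticIncrementalClassifier:
    """Closed-form (ridge regression) classifier updated task by task."""

    def __init__(self, feat_dim, num_classes, reg=1.0):
        self.A = reg * np.eye(feat_dim)              # running X^T X + reg * I
        self.b = np.zeros((feat_dim, num_classes))   # running X^T Y
        self.W = np.zeros((feat_dim, num_classes))

    def update(self, X, Y):
        """X: (n, feat_dim) features; Y: (n, num_classes) one-hot labels.
        No gradients: accumulate sufficient statistics, then solve."""
        self.A += X.T @ X
        self.b += X.T @ Y
        self.W = np.linalg.solve(self.A, self.b)     # closed-form weights
        return self.W
```

Because `A` and `b` are sums over samples, calling `update` once per task or once on the concatenation of all tasks yields the same `W`, which is the incremental/joint equivalence the abstract refers to.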



Codec Avatar Studio: Paired Human Captures for Complete, Driveable, and Generalizable Avatars Julieta Martinez, Emily Kim, Javier Romero

Neural Information Processing Systems

To create photorealistic avatars that users can embody, human modeling must be complete (encompass the full body), driveable (able to reproduce motion of the user from lightweight sensors), and generalizable (i.e., easily adaptable to novel identities). Towards these goals, paired captures, that is, captures of the same subject obtained from systems of diverse quality and availability, are crucial. However, paired captures are rarely available to researchers outside of dedicated industrial labs: Codec Avatar Studio is our proposal to close this gap. Towards generalization and driveability, we introduce a dataset of 256 subjects captured in two modalities: high resolution multi-view scans of their heads, and video from the internal cameras of a headset. Towards completeness, we introduce a dataset of 4 subjects captured in eight modalities: high quality relightable multi-view captures of heads and hands, full body multi-view captures with minimal and regular clothes, and corresponding head, hands and body phone captures. Together with our data, we also provide code and pre-trained models for different state-of-the-art human generation models.


We thank all reviewers for their positive and constructive comments, such as finding the application important, the results impressive, and the example images

Neural Information Processing Systems

Below we first address the common questions, and then the questions from individual reviewers. Second, the weights in SPADE are determined on the fly (Fig. A(a)); this operation can be considered a 1x1 convolution with a group size equal to the channel size. SPADE layers in turn generate spatially adaptive de-modulation parameters. We will include several failure examples in the revised version.
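
To unpack the grouped-convolution remark: a 1x1 convolution whose group count equals the channel count applies one scalar weight per channel, i.e., a per-channel scaling. The sketch below checks that equivalence in PyTorch with static weights for simplicity; in SPADE the (de-)modulation parameters are generated per spatial location, so the shapes here are only illustrative.

```python
import torch

C, H, W = 8, 16, 16
x = torch.randn(1, C, H, W)

# A 1x1 conv with groups == channels has one scalar weight per channel...
conv = torch.nn.Conv2d(C, C, kernel_size=1, groups=C, bias=False)
y_conv = conv(x)

# ...so it equals an elementwise per-channel scaling.
scale = conv.weight.view(1, C, 1, 1)
y_mul = x * scale
print(torch.allclose(y_conv, y_mul, atol=1e-6))  # True
```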


Error Bounds of Imitating Policies and Environments

Neural Information Processing Systems

Imitation learning trains a policy by mimicking expert demonstrations. Various imitation methods have been proposed and empirically evaluated, but their theoretical understanding requires further study. In this paper, we first analyze the value gap between the expert policy and imitated policies under two imitation methods: behavioral cloning and generative adversarial imitation.
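
For reference, behavioral cloning, the first method analyzed, reduces imitation to supervised learning on expert state-action pairs. Below is a minimal sketch; the policy network, data tensors, and hyperparameters are placeholders rather than anything prescribed by the paper, whose contribution is the error analysis.

```python
import torch

def behavioral_cloning(policy, states, actions, epochs=10, lr=1e-3):
    """Fit a discrete-action policy to expert (state, action) pairs by
    supervised learning. states: (N, obs_dim); actions: (N,) int labels."""
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        logits = policy(states)                              # (N, num_actions)
        loss = torch.nn.functional.cross_entropy(logits, actions)
        loss.backward()
        opt.step()
    return policy
```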


Gradient-free Decoder Inversion in Latent Diffusion Models Kyeonghyun Lee, Ernest K. Ryu

Neural Information Processing Systems

In latent diffusion models (LDMs), the denoising diffusion process takes place efficiently in a latent space whose dimension is lower than that of the pixel space. A decoder is typically used to transform representations from the latent space to the pixel space. While the decoder is assumed to have the encoder as an accurate inverse, an exact encoder-decoder pair rarely exists in practice, even though applications often require precise inversion of the decoder. In other words, the encoder is not the left-inverse but the right-inverse of the decoder; decoder inversion seeks the left-inverse. Prior works on decoder inversion in LDMs employed gradient descent, inspired by inversions of generative adversarial networks. However, gradient-based methods require more GPU memory and longer computation time for larger latent spaces.
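
The contrast can be sketched as follows: the gradient-based baseline described above optimizes the latent by backpropagating through the decoder, while a gradient-free scheme uses only forward passes. The fixed-point iteration shown is one simple illustrative alternative under the assumption that the encoder is an approximate right-inverse; it is not necessarily the paper's algorithm, and `decoder`/`encoder` are hypothetical modules.

```python
import torch

def invert_gradient_based(decoder, x, z0, steps=100, lr=0.1):
    """Prior approach: minimize ||decoder(z) - x||^2 over z by gradient
    descent. Backprop through the decoder costs memory that grows with
    the latent/activation sizes."""
    z = z0.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        torch.nn.functional.mse_loss(decoder(z), x).backward()
        opt.step()
    return z.detach()

@torch.no_grad()
def invert_fixed_point(decoder, encoder, x, steps=50, alpha=0.5):
    """One gradient-free alternative: a fixed-point iteration using the
    approximate encoder. Forward passes only, so no backprop memory."""
    z = encoder(x)
    for _ in range(steps):
        z = z + alpha * (encoder(x) - encoder(decoder(z)))
    return z
```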


SAM-Guided Masked Token Prediction for 3D Scene Understanding Liang Yang

Neural Information Processing Systems

Foundation models have significantly enhanced 2D task performance, and recent works like Bridge3D have successfully applied these models to improve 3D scene understanding through knowledge distillation, marking considerable advancements. Nonetheless, challenges such as the misalignment between 2D and 3D representations and the persistent long-tail distribution in 3D datasets still restrict the effectiveness of knowledge distillation from 2D to 3D using foundation models. To tackle these issues, we introduce a novel SAM-guided tokenization method that seamlessly aligns 3D transformer structures with region-level knowledge distillation, replacing traditional KNN-based tokenization techniques. Additionally, we implement a group-balanced re-weighting strategy to effectively address the long-tail problem in knowledge distillation. Furthermore, inspired by the recent success of masked feature prediction, our framework incorporates a two-stage masked token prediction process in which the student model predicts both the global embeddings and the token-wise local embeddings derived from the teacher models trained in the first stage. Our methodology has been validated across multiple datasets, including SUN RGB-D, ScanNet, and S3DIS, for tasks like 3D object detection and semantic segmentation. The results demonstrate significant improvements over current state-of-the-art self-supervised methods, establishing new benchmarks in this field.
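
One common way to realize a group-balanced re-weighting strategy is to weight each sample inversely to its group's frequency, so head and tail groups contribute equally to the distillation loss. The sketch below shows that reading; the grouping criterion and the normalization are assumptions, as the abstract does not specify the exact scheme.

```python
import numpy as np

def group_balanced_weights(group_ids):
    """group_ids: (N,) integer group label per token/sample. Returns
    per-sample weights, inverse to group frequency, with mean weight 1,
    so every group contributes equally to a weighted loss."""
    groups, counts = np.unique(group_ids, return_counts=True)
    inv_freq = {g: 1.0 / c for g, c in zip(groups, counts)}
    w = np.array([inv_freq[g] for g in group_ids])
    return w * len(w) / w.sum()
```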


Supplementary Material for "A Contour Stochastic Gradient Langevin Dynamics Algorithm for Simulations of Multi-modal Distributions"

Neural Information Processing Systems

A.1 Stochastic approximation

Stochastic approximation [Benveniste et al., 1990] provides a standard framework for the development of adaptive algorithms. Given a random field function H(θ, x), the goal of the stochastic approximation algorithm is to find the solution to the mean-field equation h(θ) = 0, i.e., solving h(θ) = ∫_X H(θ, x) ϖ_θ(dx) = 0.
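
As a concrete toy instance of this framework, the classical Robbins-Monro recursion θ_{k+1} = θ_k + γ_k H(θ_k, x_{k+1}) drives h(θ) to zero using only noisy evaluations. The sketch below estimates a mean, where H(θ, x) = x - θ and hence h(θ) = E[x] - θ; the distribution and step-size schedule are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = 0.0
for k in range(1, 10001):
    x = rng.normal(3.0, 1.0)        # sample of the random field input
    gamma = 1.0 / k                 # step sizes: sum diverges, sum of squares converges
    theta += gamma * (x - theta)    # theta_{k+1} = theta_k + gamma_k * H(theta_k, x_{k+1})
print(theta)                        # approaches E[x] = 3.0
```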


b5b8c484824d8a06f4f3d570bc420313-AuthorFeedback.pdf

Neural Information Processing Systems

We thank all the reviewers for the valuable comments.
Advantages of CSGLD over M-SGD: (i) CSGLD belongs to the class of adaptive biasing force algorithms. Empirically, we suggest partitioning the sample space into a moderate number of subregions.
Drawbacks of simulated annealing (SA) and replica exchange SGLD (reSGLD)/parallel tempering: SA can only be
Q2. Missing baselines: We further compared CSGLD with CyclicalSGLD and reSGLD on an asymmetric mixture. We will include the baselines and references in the next version.
The gradient-vanishing problem in SGLD is not clear: Please refer to our reply to Q1 of Reviewer 1.
Q1. Comments on bizarre peaks: A bizarre peak always indicates that there is a local minimum of the same energy in
Q3.