Hassanpour, Negar
Fantastic Multi-Task Gradient Updates and How to Find Them In a Cone
Hassanpour, Negar, Janjua, Muhammad Kamran, Zhang, Kunlin, Lavasani, Sepehr, Zhang, Xiaowen, Zhou, Chunhua, Gao, Chao
Balancing competing objectives remains a fundamental challenge in multi-task learning (MTL), primarily due to conflicting gradients across individual tasks. A common solution relies on computing a dynamic gradient update vector that balances competing tasks as optimization progresses. Building on this idea, we propose ConicGrad, a principled, scalable, and robust MTL approach formulated as a constrained optimization problem. Our method introduces an angular constraint to dynamically regulate gradient update directions, confining them within a cone centered on the reference gradient of the overall objective. By balancing task-specific gradients without over-constraining their direction or magnitude, ConicGrad effectively resolves inter-task gradient conflicts. Moreover, our framework ensures computational efficiency and scalability to high-dimensional parameter spaces. We conduct extensive experiments on standard supervised learning and reinforcement learning MTL benchmarks, and demonstrate that ConicGrad achieves state-of-the-art performance across diverse tasks.
Qua$^2$SeDiMo: Quantifiable Quantization Sensitivity of Diffusion Models
Mills, Keith G., Salameh, Mohammad, Chen, Ruichen, Hassanpour, Negar, Lu, Wei, Niu, Di
Diffusion Models (DM) have democratized AI image generation through an iterative denoising process. Quantization is a major technique to alleviate the inference cost and reduce the size of DM denoiser networks. However, as denoisers evolve from variants of convolutional U-Nets toward newer Transformer architectures, it is of growing importance to understand the quantization sensitivity of different weight layers, operations and architecture types to performance. In this work, we address this challenge with Qua$^2$SeDiMo, a mixed-precision Post-Training Quantization framework that generates explainable insights on the cost-effectiveness of various model weight quantization methods for different denoiser operation types and block structures. We leverage these insights to make high-quality mixed-precision quantization decisions for a myriad of diffusion models ranging from foundational U-Nets to state-of-the-art Transformers. As a result, Qua$^2$SeDiMo can construct 3.4-bit, 3.9-bit, 3.65-bit and 3.7-bit weight quantization on PixArt-${\alpha}$, PixArt-${\Sigma}$, Hunyuan-DiT and SDXL, respectively. We further pair our weight-quantization configurations with 6-bit activation quantization and outperform existing approaches in terms of quantitative metrics and generative image quality.
PixelMan: Consistent Object Editing with Diffusion Models via Pixel Manipulation and Generation
Jiang, Liyao, Hassanpour, Negar, Salameh, Mohammad, Samadi, Mohammadreza, He, Jiao, Sun, Fengyu, Niu, Di
Recent research explores the potential of Diffusion Models (DMs) for consistent object editing, which aims to modify object position, size, and composition, etc., while preserving the consistency of objects and background without changing their texture and attributes. Current inference-time methods often rely on DDIM inversion, which inherently compromises efficiency and the achievable consistency of edited images. Recent methods also utilize energy guidance which iteratively updates the predicted noise and can drive the latents away from the original image, resulting in distortions. In this paper, we propose PixelMan, an inversion-free and training-free method for achieving consistent object editing via Pixel Manipulation and generation, where we directly create a duplicate copy of the source object at target location in the pixel space, and introduce an efficient sampling approach to iteratively harmonize the manipulated object into the target location and inpaint its original location, while ensuring image consistency by anchoring the edited image to be generated to the pixel-manipulated image as well as by introducing various consistency-preserving optimization techniques during inference. Experimental evaluations based on benchmark datasets as well as extensive visual comparisons show that in as few as 16 inference steps, PixelMan outperforms a range of state-of-the-art training-based and training-free methods (usually requiring 50 steps) on multiple consistent object editing tasks.
Variational Auto-Encoder Architectures that Excel at Causal Inference
Hassanpour, Negar, Greiner, Russell
Estimating causal effects from observational data (at either an individual -- or a population -- level) is critical for making many types of decisions. One approach to address this task is to learn decomposed representations of the underlying factors of data; this becomes significantly more challenging when there are confounding factors (which influence both the cause and the effect). In this paper, we take a generative approach that builds on the recent advances in Variational Auto-Encoders to simultaneously learn those underlying factors as well as the causal effects. We propose a progressive sequence of models, where each improves over the previous one, culminating in the Hybrid model. Our empirical results demonstrate that the performance of all three proposed models are superior to both state-of-the-art discriminative as well as other generative approaches in the literature.
Reducing Selection Bias in Counterfactual Reasoning for Individual Treatment Effects Estimation
Zhang, Zichen, Lan, Qingfeng, Ding, Lei, Wang, Yue, Hassanpour, Negar, Greiner, Russell
Counterfactual reasoning is an important paradigm applicable in many fields, such as healthcare, economics, and education. In this work, we propose a novel method to address the issue of \textit{selection bias}. We learn two groups of latent random variables, where one group corresponds to variables that only cause selection bias, and the other group is relevant for outcome prediction. They are learned by an auto-encoder where an additional regularized loss based on Pearson Correlation Coefficient (PCC) encourages the de-correlation between the two groups of random variables. This allows for explicitly alleviating selection bias by only keeping the latent variables that are relevant for estimating individual treatment effects. Experimental results on a synthetic toy dataset and a benchmark dataset show that our algorithm is able to achieve state-of-the-art performance and improve the result of its counterpart that does not explicitly model the selection bias.