Provable Gradient Editing of Deep Neural Networks
In explainable AI, DNN gradients are used to interpret the prediction; in safety-critical control systems, gradients could encode safety constraints; in scientific-computing applications, gradients could encode physical invariants. While recent work on provable editing of DNNs has focused on input-output constraints, the problem of enforcing hard constraints on DNN gradients remains unaddressed. We present ProGrad, the first efficient approach for editing the parameters of a DNN to provably enforce hard constraints on the DNN gradients.
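To make the notion of a hard gradient constraint concrete, the following minimal sketch checks one such constraint (a non-negative input gradient, i.e. monotonicity) on a tiny one-hidden-layer ReLU network. This is not ProGrad: it only tests the constraint at sampled points and proves nothing. The network shape and the names `relu_net`, `relu_net_grad`, and `satisfies_monotonicity` are assumptions made for illustration.

```python
def relu_net(x, w1, b1, w2, b2):
    """Scalar network f(x) = sum_i w2[i] * relu(w1[i] * x + b1[i]) + b2."""
    hidden = [max(0.0, w * x + b) for w, b in zip(w1, b1)]
    return sum(v * h for v, h in zip(w2, hidden)) + b2


def relu_net_grad(x, w1, b1, w2):
    """Analytic df/dx: only hidden units with positive pre-activation contribute."""
    return sum(v * w for w, b, v in zip(w1, b1, w2) if w * x + b > 0.0)


def satisfies_monotonicity(points, w1, b1, w2):
    """Check the constraint df/dx >= 0 at each sampled point (not a proof)."""
    return all(relu_net_grad(x, w1, b1, w2) >= 0.0 for x in points)


# Hypothetical parameters: the first network computes f(x) = x, the second f(x) = |x|.
points = [-2.0, -0.5, 0.5, 2.0]
identity_ok = satisfies_monotonicity(points, [1.0, -1.0], [0.0, 0.0], [1.0, -1.0])
abs_ok = satisfies_monotonicity(points, [1.0, -1.0], [0.0, 0.0], [1.0, 1.0])
print(identity_ok, abs_ok)
```

The second parameter set violates the constraint for negative inputs, which is the kind of case a parameter edit would have to repair.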
TextDiffuser: Diffusion Models as Text Painters
Diffusion models have gained increasing attention for their impressive generation abilities but currently struggle with rendering accurate and coherent text. To address this issue, we introduce TextDiffuser, focusing on generating images with visually appealing text that is coherent with backgrounds. TextDiffuser consists of two stages: first, a Transformer model generates the layout of keywords extracted from text prompts, and then diffusion models generate images conditioned on the text prompt and the generated layout. Additionally, we contribute the first large-scale text images dataset with OCR annotations, MARIO-10M, containing 10 million image-text pairs with text recognition, detection, and character-level segmentation annotations. We further collect the MARIO-Eval benchmark to serve as a comprehensive tool for evaluating text rendering quality.
Position-based Scaled Gradient for Model Quantization and Pruning - Appendix
In this experiment, we only quantize the weights, not the activations, to compare the performance degradation as weight bit-width decreases. The mean squared errors (MSE) of the weights across different bit-widths are also reported. The name of the layer and the number of parameters in parentheses are shown in the column. All numbers are results of the last epoch. Table A3: ResNet-32 trained with Adam on the CIFAR-100 dataset.
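The weight-only measurement described above can be sketched as follows. The excerpt does not specify the quantizer, so this assumes a generic symmetric uniform quantizer with a per-tensor scale; the function names and the synthetic Gaussian weights are hypothetical stand-ins for a real layer.

```python
import random


def quantize_uniform(weights, bits):
    """Symmetric uniform quantization (assumed scheme); returns (quantized, scale)."""
    if bits < 2:
        raise ValueError("bits must be at least 2 for a symmetric signed grid")
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights), 0.0
    levels = 2 ** (bits - 1) - 1          # e.g. 127 positive levels for 8 bits
    scale = max_abs / levels              # step size between adjacent grid points
    quantized = [max(-levels, min(levels, round(w / scale))) * scale for w in weights]
    return quantized, scale


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


random.seed(0)
weights = [random.gauss(0.0, 0.05) for _ in range(1000)]  # synthetic layer weights

mse_by_bits = {}
for bits in (8, 6, 4, 2):
    quantized, _ = quantize_uniform(weights, bits)
    mse_by_bits[bits] = mse(weights, quantized)
    print(f"{bits}-bit weight MSE: {mse_by_bits[bits]:.3e}")
```

Activations are left untouched here, matching the weight-only setting; in a real run the loop would iterate over each layer's weight tensor instead of synthetic samples.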
4aa13186c795a52ba88f5b822f4b77eb-Paper-Conference.pdf
Therefore, estimating how well a given model might perform on the new data is an important step toward reliable ML applications. This is very challenging, however, as the data distribution can change in flexible ways, and we may not have any labels on the new data, which is often the case in monitoring settings. In this paper, we propose a new distribution shift model, Sparse Joint Shift (SJS), which considers the joint shift of both labels and a few features.