TextDiffuser: Diffusion Models as Text Painters

Neural Information Processing Systems

Diffusion models have gained increasing attention for their impressive generation abilities but currently struggle with rendering accurate and coherent text. To address this issue, we introduce TextDiffuser, focusing on generating images with visually appealing text that is coherent with backgrounds. TextDiffuser consists of two stages: first, a Transformer model generates the layout of keywords extracted from text prompts, and then diffusion models generate images conditioned on the text prompt and the generated layout. Additionally, we contribute the first large-scale text images dataset with OCR annotations, MARIO-10M, containing 10 million image-text pairs with text recognition, detection, and character-level segmentation annotations. We further collect the MARIO-Eval benchmark to serve as a comprehensive tool for evaluating text rendering quality.






3d4c0a618d0acd7921493e4f30395c22-Paper-Conference.pdf

Neural Information Processing Systems

Given textual explanations, our proposed framework uses a generative model conditioned on textual input to create data points representing the explanations. By comparing the neuron's response to these generated data points and control data points, we can estimate the quality of the explanation.
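
The comparison described above can be sketched as follows. This is only an illustration: the snippet does not say which statistic the authors use, so the AUC-style score here is an assumption, and the activation values are hypothetical numbers chosen for the example, not results from the paper.

```python
def auc(generated, control):
    """Fraction of (generated, control) pairs in which the neuron responds
    more strongly to the generated point; ties count as half a win.
    Assumed metric: the source does not name the statistic it uses."""
    wins = 0.0
    for g in generated:
        for c in control:
            if g > c:
                wins += 1.0
            elif g == c:
                wins += 0.5
    return wins / (len(generated) * len(control))


# Hypothetical neuron activations, for illustration only.
generated_acts = [2.1, 1.8, 2.5, 0.9, 2.2]  # inputs generated from the explanation
control_acts = [0.2, 0.4, 1.0, 0.1, 0.3]    # unrelated control inputs

score = auc(generated_acts, control_acts)
print(f"explanation score (AUC): {score:.2f}")
```

Under this assumed metric, a score near 1.0 would suggest the neuron responds selectively to the explained concept, while a score near 0.5 would suggest the explanation does not separate generated from control inputs.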




Position-based Scaled Gradient for Model Quantization and Pruning - Appendix

Neural Information Processing Systems

In this experiment, we only quantize the weights, not the activations, to compare the performance degradation as weight bit-width decreases. The mean squared errors (MSE) of the weights across different bit-widths are also reported. The name of the layer and the number of parameters in parentheses are shown in the column. All numbers are results of the last epoch. Table A3: ResNet-32 trained with Adam on the CIFAR-100 dataset.
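
A minimal sketch of measuring weight MSE across bit-widths, under stated assumptions: the snippet does not specify the quantizer, so symmetric uniform quantization is assumed here, and the weights are random stand-ins rather than ResNet-32 parameters. The names `quantize_uniform` and `mse` are hypothetical helpers.

```python
import random


def quantize_uniform(weights, bits):
    """Symmetric uniform quantization to signed `bits`-bit levels.
    Assumed scheme; the source does not state which quantizer is used."""
    if bits < 2:
        raise ValueError("this sketch needs at least 2 bits")
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)
    levels = 2 ** (bits - 1) - 1  # e.g. 7 positive levels for 4 bits
    scale = max_abs / levels      # step size between adjacent levels
    return [round(w / scale) * scale for w in weights]


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Random stand-in for one layer's weights (not real model parameters).
random.seed(0)
weights = [random.gauss(0.0, 0.05) for _ in range(1000)]

mse_by_bits = {
    bits: mse(weights, quantize_uniform(weights, bits)) for bits in (8, 6, 4, 2)
}
for bits, err in mse_by_bits.items():
    print(f"{bits}-bit weights: MSE = {err:.3e}")
```

Only the weights are quantized here, matching the setup described above; the error grows as the bit-width shrinks because the step size between representable values widens.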


4aa13186c795a52ba88f5b822f4b77eb-Paper-Conference.pdf

Neural Information Processing Systems

Therefore, estimating how well a given model might perform on the new data is an important step toward reliable ML applications. This is very challenging, however, as the data distribution can change in flexible ways, and we may not have any labels on the new data, which is often the case in monitoring settings. In this paper, we propose a new distribution shift model, Sparse Joint Shift (SJS), which considers the joint shift of both labels and a few features.


45c166d697d65080d54501403b433256-AuthorFeedback.pdf

Neural Information Processing Systems

The reviewers acknowledge that the ideas presented in the paper are compelling, sound and appear to be effective (R3), offering a great add to the GP literature (R1) which is also supported by a solid and interesting theoretical foundation (R2, R4). Existing multi-output GP models are not applicable to our setting (see lines 79-83) and are thus not comparable to the DAG-GP model. We have further clarified this point in Section 1.2.