Appendix
Neural Information Processing Systems
A.1 Selection of substitution rate p

We observed that when the value of p is within (0, 0.7), there exists a correlation between the S [...] Performing a grid search on each task using diffusion models is an expensive process. However, an increase in the value of p leads to a deviation between the two. This could be attributed to a higher conversion error that occurs when p is excessively large.

Figure 4: Impact of the proportion of injected noise for learning Paragraph Embeddings.

A.2 Selection of number of latent codes k

The parameter k determines the number of latent codes used to represent a paragraph and therefore controls the compression level. To determine the best set of latent codes, we conducted experiments using three different methods: 1) selecting the first k hidden vectors, 2) selecting the last k hidden vectors, and 3) selecting interleaved hidden vectors, one for every L/k hidden vectors. The results of the ablation study are presented in Table 5. We observed no significant difference among the three choices, so we opted for option 1). Furthermore, increasing the value of k does not lead to a dramatic improvement in performance. To balance efficiency and performance, we use k = 16 in most of our study.

Table 5:
Setup             BLEU_clean   BLEU_robust
First k (k=16)    79.59        43.17

A.3 Reconstruction, denoising and interpolation examples

In Table 6, we present examples that demonstrate the ability of the trained Variational Paragraph Embedder to provide clean and denoised reconstructions. Additionally, we showcase interpolation results (Tables 7 and 8) derived from two random sentences in the hotel review dataset. The interpolated paragraph is usually coherent and incorporates inputs from both sentences, which reflects the distributional smoothness of the latent space.

Reconstructed  complaints: after two nights stay, i asked the maid to clean our room (empty the wastebasket & make the bed).
Denoising reconstruction (hotel review), noise level 0.3
Original       * * * check out the bathroom picture * * * i was in nyc by myself to watch some friends participate in the us olympic marathon trials.
Corrupted      * * [unused697] check exams the bathroom picture * * slams i was in nyc mead myself yankee 2016 some scotch ruin in the outfielder olympicnca trials.
Reconstructed  ***check out the bathroom picture*** i was in nyc with my husband and some friends staying in the hudson hotel in nyc.

Table 6: Reconstruction examples for clean reconstruction, where the input is not corrupted, and denoising reconstruction, where the input is corrupted with 30% substitution noise. The mismatched text in the clean reconstruction is in red.

We provide generation examples for both summarization and sentiment-guided generation in Table 9 and Table 10.
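The two mechanisms discussed in A.1 and A.2 can be illustrated with a minimal sketch. It is not the authors' implementation. The function names `substitute_tokens` and `select_latent_codes` are hypothetical, and so are the uniform sampling of replacement tokens, the integer stride L // k for the interleaved option, and the toy inputs.

```python
import random


def substitute_tokens(tokens, vocab, p, rng):
    """Replace each token, independently with probability p, by a vocabulary item.

    Assumption: replacements are drawn uniformly from `vocab`, so a token
    may occasionally be replaced by itself.
    """
    return [rng.choice(vocab) if rng.random() < p else tok for tok in tokens]


def select_latent_codes(hidden, k, strategy="first"):
    """Pick k of the L hidden vectors to serve as latent codes."""
    L = len(hidden)
    if not 0 < k <= L:
        raise ValueError("k must satisfy 0 < k <= L")
    if strategy == "first":       # option 1: first k hidden vectors
        return hidden[:k]
    if strategy == "last":        # option 2: last k hidden vectors
        return hidden[L - k:]
    if strategy == "interleave":  # option 3: one vector per stride of L // k
        step = L // k             # assumption: integer stride
        return hidden[::step][:k]
    raise ValueError(f"unknown strategy: {strategy}")


if __name__ == "__main__":
    rng = random.Random(0)
    tokens = "i was in nyc by myself".split()
    vocab = ["hotel", "room", "bed", "maid"]  # toy vocabulary (assumption)
    print(substitute_tokens(tokens, vocab, 0.3, rng))

    hidden = list(range(8))  # stand-ins for L = 8 hidden vectors
    for strategy in ("first", "last", "interleave"):
        print(strategy, select_latent_codes(hidden, 4, strategy))
```

With p = 0 the input is returned unchanged, which corresponds to the clean reconstruction setting. Larger p corrupts more positions, as in the 30% example of Table 6.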
Feb-11-2025, 17:35:07 GMT