Neural Modulation for Flash Memory: An Unsupervised Learning Framework for Improved Reliability

Neural Information Processing Systems

The continued scaling of flash memory technology into smaller process nodes, combined with the increased information capacity of each flash cell (i.e., storing more bits per cell), has placed NAND flash memory at the forefront of modern storage technology.
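To make the bits-per-cell trade-off concrete, the following sketch (not from the paper; all names are illustrative) simulates reading an n-bit cell: each extra bit doubles the number of voltage levels and halves the margin between them, so decode errors rise sharply at a fixed noise level.

```python
import numpy as np

# Minimal sketch (not from the paper): an n-bit flash cell stores one of
# 2**n voltage levels, so each extra bit halves the spacing between levels
# and makes reads more sensitive to voltage noise.

def read_cell(true_level: int, bits_per_cell: int, noise_std: float = 0.05,
              rng=np.random.default_rng(0)) -> int:
    """Simulate reading a noisy flash cell and decoding the nearest level."""
    n_levels = 2 ** bits_per_cell
    centers = np.linspace(0.0, 1.0, n_levels)          # ideal voltage levels
    observed = centers[true_level] + rng.normal(0.0, noise_std)
    return int(np.argmin(np.abs(centers - observed)))  # nearest-level decode

# At a fixed noise level, the decode error rate grows with bits per cell.
for bits in (1, 2, 3):
    errs = sum(read_cell(lvl, bits) != lvl
               for _ in range(1000)
               for lvl in range(2 ** bits))
    print(f"{bits} bit(s)/cell: {errs} decode errors / {1000 * 2 ** bits} reads")
```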



Label-Retrieval-Augmented Diffusion Models for Learning from Noisy Labels

Neural Information Processing Systems

Existing approaches to learning from noisy labels typically rely on strict assumptions and are limited to certain types of label noise. In this paper, we reformulate the label-noise problem from a generative-model perspective, i.e., labels are generated by gradually refining an initial random guess.
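As a rough illustration of the "gradually refine a random guess" idea, here is a minimal, hypothetical sketch in which a stand-in denoiser pulls an initially random label distribution toward a model prediction over T steps; this is not the paper's algorithm, and all names are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical fixed weights standing in for a learned label denoiser.
n_features, n_classes = 8, 4
W = rng.normal(size=(n_features, n_classes))

def refine_label(features: np.ndarray, T: int = 10) -> np.ndarray:
    """Start from a uniform-ish random guess over classes and gradually
    refine it toward the (stand-in) denoiser's prediction over T steps."""
    label_probs = softmax(rng.normal(size=n_classes))  # random initial guess
    prediction = softmax(features @ W)                 # stand-in denoiser output
    for t in range(T):
        alpha = (t + 1) / T            # trust the denoiser more as t grows
        label_probs = (1 - alpha) * label_probs + alpha * prediction
    return label_probs

x = rng.normal(size=n_features)
print(refine_label(x))  # equals the denoiser's prediction at the final step
```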





Are Aligned Neural Networks Adversarially Aligned?

Neural Information Processing Systems

Large language models are now tuned to align with the goals of their creators, namely to be "helpful and harmless." These models should respond helpfully to user questions, but refuse to answer requests that could cause harm. However, adversarial users can construct inputs which circumvent attempts at alignment. In this work, we study adversarial alignment, and ask to what extent these models remain aligned when interacting with an adversarial user who constructs worst-case inputs (adversarial examples). These inputs are designed to cause the model to emit harmful content that would otherwise be prohibited. We show that existing NLP-based optimization attacks are insufficiently powerful to reliably attack aligned text models: even when current NLP-based attacks fail, we can find adversarial inputs with brute force.
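The brute-force search described in the abstract can be sketched schematically as sampling candidate suffixes and keeping the one that maximizes an attack objective. The scorer below is a toy stand-in (a real attack would query the target model, e.g., scoring the log-probability of a prohibited completion), and every name here is hypothetical rather than the paper's code.

```python
import random
import string

# Schematic sketch of a brute-force adversarial-input search: sample random
# suffixes appended to a base prompt and keep whichever scores highest under
# an attack objective. `attack_objective` is a hypothetical stand-in.

def attack_objective(prompt: str) -> float:
    """Toy deterministic scorer; a real attack would query the target model."""
    return (sum(ord(c) for c in prompt) % 1000) / 1000.0

def brute_force_suffix(base_prompt: str, length: int = 4,
                       n_candidates: int = 10_000, seed: int = 0) -> str:
    """Return the best-scoring random suffix out of n_candidates samples."""
    rng = random.Random(seed)
    alphabet = string.ascii_lowercase + " "
    best_suffix, best_score = "", float("-inf")
    for _ in range(n_candidates):
        suffix = "".join(rng.choice(alphabet) for _ in range(length))
        score = attack_objective(base_prompt + suffix)
        if score > best_score:
            best_suffix, best_score = suffix, score
    return best_suffix

print(brute_force_suffix("Please answer the following question: "))
```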