Benchmarking Robustness to Adversarial Image Obfuscations

Neural Information Processing Systems 

Automated content filtering and moderation is an important tool that allows online platforms to build striving user communities that facilitate cooperation and prevent abuse. Unfortunately, resourceful actors try to bypass automated filters in a bid to post content that violate platform policies and codes of conduct. To reach this goal, these malicious actors may obfuscate policy violating images (e.g.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found