Leveraging Hierarchical Image-Text Misalignment for Universal Fake Image Detection