Sharing Key Semantics in Transformer Makes Efficient Image Restoration
–Neural Information Processing Systems
Image Restoration (IR), a classic low-level vision task, has witnessed significant advancements through deep models that effectively capture global information, and the emergence of Vision Transformers (ViTs) has further propelled this progress. However, the self-attention mechanism at the core of ViTs attends to all global cues, including those from semantically unrelated objects or regions. Processing this irrelevant information introduces computational inefficiency that becomes particularly noticeable at high input resolutions. Moreover, for IR it is commonly observed that small, semantically aligned segments of a degraded image provide especially relevant contextual cues for accurate reconstruction. To address these challenges, we propose boosting IR's performance by sharing the key semantics via Transformer for IR (i.e., SemanIR) in this paper.
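The inefficiency the abstract describes can be illustrated with a minimal sketch: dense self-attention lets every query attend to all N keys (O(N²) cost), whereas restricting each query to its top-k most similar keys approximates attending only to "semantically related" regions. This is a hypothetical simplification for illustration, not the authors' actual SemanIR key-sharing mechanism; the function names and the `topk` parameter are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def full_attention(q, k, v):
    # Dense self-attention: every query attends to all N keys -> O(N^2) scores.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def topk_attention(q, k, v, topk=8):
    # Sparse variant: each query attends only to its topk highest-scoring keys,
    # a stand-in for "semantically related" regions (hypothetical simplification).
    scores = q @ k.T / np.sqrt(q.shape[-1])
    idx = np.argpartition(-scores, topk - 1, axis=-1)[:, :topk]
    out = np.empty_like(q)
    for i in range(q.shape[0]):
        sel = idx[i]                      # indices of the topk keys for query i
        out[i] = softmax(scores[i, sel]) @ v[sel]
    return out

rng = np.random.default_rng(0)
n, d = 64, 16
q = rng.standard_normal((n, d))
k = rng.standard_normal((n, d))
v = rng.standard_normal((n, d))
dense = full_attention(q, k, v)
sparse = topk_attention(q, k, v, topk=8)
print(dense.shape, sparse.shape)
```

With `topk=8`, each query computes a weighted sum over only 8 value vectors instead of all 64, which is the rough intuition behind restricting attention to semantically aligned segments.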
May-20-2025, 21:38:04 GMT