Sparse Attention Acceleration with Synergistic In-Memory Pruning and On-Chip Recomputation