VLM in a flash: I/O-Efficient Sparsification of Vision-Language Model via Neuron Chunking
