
Collaborating Authors

 Samei, Golnoosh


ReLU Strikes Back: Exploiting Activation Sparsity in Large Language Models

arXiv.org Artificial Intelligence

Large Language Models (LLMs) with billions of parameters have drastically transformed AI applications. However, their demanding computation during inference has raised significant challenges for deployment on resource-constrained devices. Despite recent trends favoring alternative activation functions such as GELU or SiLU, which incur increased computation, this study strongly advocates for reinstating ReLU activation in LLMs. We demonstrate that using the ReLU activation function has a negligible impact on convergence and performance while significantly reducing computation and weight transfer. This reduction is particularly valuable during the memory-bound inference step, where efficiency is paramount. Exploring sparsity patterns in ReLU-based LLMs, we unveil the reutilization of activated neurons for generating new tokens, and, leveraging these insights, we propose practical strategies to substantially reduce LLM inference computation by up to three times, using ReLU activations with minimal performance trade-offs.
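To make the computation-skipping idea concrete, here is a minimal sketch (not the authors' implementation) of how ReLU-induced sparsity can be exploited in a feed-forward block: columns of the down-projection whose corresponding activations are exactly zero contribute nothing, so their weights need not be transferred or multiplied, which is precisely where the memory-bound inference step benefits. All function names, shapes, and the toy data below are illustrative assumptions.

```python
import numpy as np

def ffn_relu_sparse(x, W_up, W_down, b_up=None):
    """Feed-forward block that skips work for zeroed (ReLU-sparse) activations.

    x      : (d_model,)        input token representation
    W_up   : (d_ff, d_model)   up-projection weights
    W_down : (d_model, d_ff)   down-projection weights
    """
    h = W_up @ x
    if b_up is not None:
        h = h + b_up
    h = np.maximum(h, 0.0)         # ReLU: many entries become exactly zero

    active = np.nonzero(h)[0]      # indices of neurons that fired
    # Only the columns of W_down matching active neurons need to be
    # loaded and multiplied; inactive neurons contribute nothing.
    y = W_down[:, active] @ h[active]
    return y, len(active)

# Toy usage: with ReLU, a fraction of h is exactly zero, so the effective
# matmul (and the weight transfer that dominates memory-bound decoding)
# shrinks accordingly.
rng = np.random.default_rng(0)
d_model, d_ff = 8, 32
x = rng.standard_normal(d_model)
W_up = rng.standard_normal((d_ff, d_model))
W_down = rng.standard_normal((d_model, d_ff))
y, n_active = ffn_relu_sparse(x, W_up, W_down)
print(f"{n_active}/{d_ff} neurons active")
```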


A deep learning-based method for prostate segmentation in T2-weighted magnetic resonance imaging

arXiv.org Machine Learning

We propose a novel automatic method for accurate segmentation of the prostate in T2-weighted magnetic resonance imaging (MRI). Our method is based on convolutional neural networks (CNNs). Because of the large variability in the shape, size, and appearance of the prostate and the scarcity of annotated training data, we suggest training two separate CNNs. A global CNN determines a prostate bounding box, which is then resampled and sent to a local CNN for accurate delineation of the prostate boundary. This way, the local CNN can effectively learn to segment the fine details that distinguish the prostate from the surrounding tissue using the small amount of available training data. To fully exploit the training data, we synthesize additional data by deforming the training images and segmentations using a learned shape model. We apply the proposed method to the PROMISE12 challenge dataset and achieve state-of-the-art results. Our proposed method generates accurate, smooth, and artifact-free segmentations. On the test images, we achieve an average Dice score of 90.6 with a small standard deviation of 2.2, which is superior to all previous methods. Our two-step segmentation approach and data augmentation strategy may be highly effective for segmenting other organs from small amounts of annotated medical images.
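As a rough illustration of the two-step pipeline described above, the sketch below uses untrained stand-in networks, 2D slices rather than full 3D volumes, and illustrative names (TinyCNN, two_stage_segment) that are not from the paper: a global network proposes a prostate bounding box, the crop is resampled to a fixed size, and a local network delineates the boundary within it before the mask is pasted back at full resolution.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyCNN(nn.Module):
    """Stand-in for both the global (bounding-box) and local (boundary) CNNs."""
    def __init__(self, out_channels=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, out_channels, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)

def two_stage_segment(image, global_net, local_net, crop_size=64):
    """Two-stage segmentation: locate a bounding box, then delineate within it.

    image: (1, 1, H, W) T2-weighted MRI slice (toy placeholder here).
    """
    # Stage 1: coarse mask from the global network -> bounding box.
    coarse = torch.sigmoid(global_net(image))[0, 0]
    ys, xs = torch.nonzero(coarse > 0.5, as_tuple=True)
    if len(ys) == 0:                      # nothing detected: return empty mask
        return torch.zeros_like(image)
    y0, y1 = ys.min().item(), ys.max().item() + 1
    x0, x1 = xs.min().item(), xs.max().item() + 1

    # Stage 2: resample the box to a fixed size and run the local network.
    crop = image[:, :, y0:y1, x0:x1]
    crop = F.interpolate(crop, size=(crop_size, crop_size),
                         mode="bilinear", align_corners=False)
    local_mask = torch.sigmoid(local_net(crop))

    # Paste the fine mask back into the full-resolution frame.
    full = torch.zeros_like(image)
    full[:, :, y0:y1, x0:x1] = F.interpolate(
        local_mask, size=(y1 - y0, x1 - x0),
        mode="bilinear", align_corners=False)
    return (full > 0.5).float()

# Toy usage with untrained networks and a random stand-in "MRI" slice.
img = torch.randn(1, 1, 128, 128)
mask = two_stage_segment(img, TinyCNN(), TinyCNN())
print(mask.shape)
```

The design choice this sketch mirrors is that the local network only ever sees a resampled prostate region, so its limited training data is spent entirely on boundary detail rather than on localizing the organ within the full field of view.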