Memory Is All You Need: An Overview of Compute-in-Memory Architectures for Accelerating Large Language Model Inference