A Survey on Sparse Autoencoders: Interpreting the Internal Mechanisms of Large Language Models