Beyond Classification: Evaluating LLMs for Fine-Grained Automatic Malware Behavior Auditing