On the Reliability Limits of LLM-Based Multi-Agent Planning