Confidential LLM Inference: Performance and Cost Across CPU and GPU TEEs