TPLA: Tensor Parallel Latent Attention for Efficient Disaggregated Prefill and Decode Inference