FireQ: Fast INT4-FP8 Kernel and RoPE-aware Quantization for LLM Inference Acceleration