Precision-Scalable Microscaling Datapaths with Optimized Reduction Tree for Efficient NPU Integration