Opara: Exploiting Operator Parallelism for Expediting DNN Inference on GPUs
