Scaling On-Device GPU Inference for Large Generative Models
