From Models to Operators: Rethinking Autoscaling Granularity for Large Generative Models
