Experiment with Billion-Parameter Models Faster using DeepSpeed and Meta Tensors
Suppose you want to know the output shape of an operation. Typically, you would run the operation and then check the size of the resulting tensor. With Meta Tensors, you don't have to compute the output to find the answer. Meta Tensors are just like normal tensors, except that they hold no data: they record metadata such as shape and dtype but allocate no storage. In PyTorch, "meta" is a device, so a tensor becomes a Meta Tensor simply by being created on that device.
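As a minimal sketch of the idea, the snippet below asks PyTorch for the output shape of a linear layer without allocating any weights or activations. The layer sizes and variable names are illustrative assumptions, not taken from the article, and it assumes a PyTorch version recent enough for module constructors to accept a device argument.

```python
import torch

# A tensor on the "meta" device has a shape and dtype but no storage,
# so creating it costs no memory regardless of its size.
x = torch.empty(8, 1024, device="meta")

# Building the layer on the meta device skips weight allocation too.
# (The 1024 -> 4096 sizes are arbitrary, chosen only for illustration.)
linear = torch.nn.Linear(1024, 4096, device="meta")

# The forward pass only propagates shape and dtype; nothing is computed.
y = linear(x)

print(y.shape)    # torch.Size([8, 4096])
print(y.is_meta)  # True
```

Because no data exists, a meta tensor cannot be printed element by element or converted to NumPy; it is useful for shape inference and for describing a model before deciding where its real weights should live.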
Apr-19-2022, 15:29:29 GMT