Loss Masking Is Not Needed in Decoder-only Transformer for Discrete-token-based ASR
