...which might have something to do with autoregressive language models being more popular than encoder/decoder ones.