Practice quizzes for Australian citizenship, driving, and immigration topics

Transformers process the entire sequence in parallel during training, while RNNs process tokens sequentially

Transformers use less memory than RNNs at all sequence lengths

RNNs require labelled data while Transformers can train without labels

Transformers have fewer parameters, making them easier to train