GNMT did translation with a seq2seq architecture: stacked RNN encoder and decoder layers connected by attention. Researchers then found the recurrence could be dropped entirely in favor of attention alone, which led to transformers, and here we are today.