  1. 10.6. The Encoder–Decoder Architecture — Dive into Deep ... - D2L

    Encoder-decoder architectures can handle inputs and outputs that both consist of variable-length sequences and thus are suitable for sequence-to-sequence problems such as machine translation.

  2. 11.7. The Transformer Architecture — Dive into Deep Learning 1. ... - D2L

    Let’s instantiate an encoder–decoder model by following the Transformer architecture. Here we specify that both the Transformer encoder and the Transformer decoder have two layers using 4-head … A sketch with these two-layer, four-head settings follows the list.

  3. 9.6. Encoder–Decoder Architecture — 动手学深度学习 2.0.0 documentation

    To handle this kind of input and output, we can design an architecture with two main components: the first component is an encoder, which takes a variable-length sequence as input and transforms it into an encoded state of fixed shape.

  4. 9.6. Encoder-Decoder Architecture — Dive into Deep Learning 0. ... - D2L

    In the end, the encoder-decoder architecture contains both an encoder and a decoder, with optional extra arguments. In the forward propagation, the output of the encoder is used to produce the … A minimal sketch of this interface follows the list.

  5. 10.7. Sequence-to-Sequence Learning for Machine Translation - D2L

    In this section, we will demonstrate the application of an encoder–decoder architecture, where both the encoder and decoder are implemented as RNNs, to the task of machine translation (Cho et al., 2014, …

  6. 11.9. Large-Scale Pretraining with Transformers - D2L

    Originally proposed for machine translation, the Transformer architecture in Fig. 11.7.1 consists of an encoder for representing input sequences and a decoder for generating target sequences.

  7. 11.4. The Bahdanau Attention Mechanism — Dive into Deep ... - D2L

    When we encountered machine translation in Section 10.7, we designed an encoder–decoder architecture for sequence-to-sequence learning based on two RNNs (Sutskever et al., 2014).

  8. 9.7. Sequence to Sequence Learning — Dive into Deep Learning 0.

    Following the design principle of the encoder-decoder architecture, the RNN encoder can take a variable-length sequence as input and transform it into a fixed-shape hidden state. In other … An RNN encoder sketch follows the list.

  9. 11. Attention Mechanisms and Transformers — Dive into Deep ... - D2L

    The core idea behind the Transformer model is the attention mechanism, an innovation that was originally envisioned as an enhancement for encoder–decoder RNNs applied to sequence-to …

  10. 11.8. Transformers for Vision — Dive into Deep Learning 1.0.3 ... - D2L

    This architecture consists of a stem that patchifies images, a body based on the multilayer Transformer encoder, and a head that transforms the global representation into the output label. A minimal stem/body/head sketch closes the list below.
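
The encoder–decoder pattern described in results 1, 3, and 4 comes down to one interface: an encoder that maps a variable-length input to an encoded state, and a decoder whose state is initialized from the encoder's output. A minimal PyTorch sketch of that interface, written in the spirit of D2L's base classes (the class and method names here are assumptions, not the book's exact API):

```python
import torch
from torch import nn

class Encoder(nn.Module):
    """Maps a variable-length input sequence to an encoded state."""
    def forward(self, X, *args):
        raise NotImplementedError

class Decoder(nn.Module):
    """Generates the output sequence, conditioned on the encoder's state."""
    def init_state(self, enc_outputs, *args):
        raise NotImplementedError
    def forward(self, X, state):
        raise NotImplementedError

class EncoderDecoder(nn.Module):
    """Glue class: the encoder's output initializes the decoder's state."""
    def __init__(self, encoder, decoder):
        super().__init__()
        self.encoder = encoder
        self.decoder = decoder

    def forward(self, enc_X, dec_X, *args):
        enc_outputs = self.encoder(enc_X, *args)
        dec_state = self.decoder.init_state(enc_outputs, *args)
        return self.decoder(dec_X, dec_state)
```

The point of the split is that any encoder and any decoder implementing this interface can be combined, whether RNN-based (results 5 and 8) or Transformer-based (result 2).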
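Result 2 instantiates a Transformer encoder–decoder with two layers on each side and four attention heads. D2L builds its own TransformerEncoder and TransformerDecoder classes; as a stand-in, a model of the same shape can be sketched with PyTorch's built-in nn.Transformer (d_model and dim_feedforward below are illustrative assumptions):

```python
import torch
from torch import nn

# Two encoder layers, two decoder layers, 4-head attention, per the snippet;
# d_model and dim_feedforward are illustrative assumptions.
model = nn.Transformer(d_model=32, nhead=4,
                       num_encoder_layers=2, num_decoder_layers=2,
                       dim_feedforward=64, batch_first=True)

src = torch.rand(2, 10, 32)   # (batch, source length, d_model)
tgt = torch.rand(2, 7, 32)    # (batch, target length, d_model)
out = model(src, tgt)
print(out.shape)              # torch.Size([2, 7, 32])
```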
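Results 5 and 8 describe the RNN encoder of a sequence-to-sequence model: it consumes a variable-length token sequence and compresses it into a fixed-shape hidden state. A minimal sketch, assuming a GRU encoder with batch-first tensors (D2L's own Seq2SeqEncoder uses time-major layout, and all hyperparameters here are illustrative):

```python
import torch
from torch import nn

class Seq2SeqEncoder(nn.Module):
    """RNN encoder: embeds tokens, then compresses the whole sequence into
    the GRU's final hidden state, whose shape is fixed regardless of length."""
    def __init__(self, vocab_size, embed_size, num_hiddens, num_layers):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_size)
        self.rnn = nn.GRU(embed_size, num_hiddens, num_layers, batch_first=True)

    def forward(self, X):                  # X: (batch, num_steps) token ids
        emb = self.embedding(X)            # (batch, num_steps, embed_size)
        outputs, state = self.rnn(emb)     # state: (num_layers, batch, num_hiddens)
        return outputs, state

encoder = Seq2SeqEncoder(vocab_size=100, embed_size=8, num_hiddens=16, num_layers=2)
X = torch.zeros(4, 7, dtype=torch.long)   # a batch of 4 sequences of length 7
outputs, state = encoder(X)
print(state.shape)                         # torch.Size([2, 4, 16])
```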
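Result 10 summarizes the vision Transformer as a patchifying stem, a Transformer-encoder body, and a classification head. A minimal sketch of that layout, with positional embeddings and the class token omitted and mean pooling used instead (all sizes are illustrative assumptions, not the book's values):

```python
import torch
from torch import nn

class TinyViT(nn.Module):
    """Stem patchifies the image, a Transformer encoder forms the body,
    and a linear head maps the pooled representation to class logits."""
    def __init__(self, patch_size=4, d_model=64, num_heads=4,
                 num_layers=2, num_classes=10):
        super().__init__()
        # Stem: a strided convolution cuts the image into non-overlapping
        # patches and projects each patch to a d_model-dimensional token.
        self.patchify = nn.Conv2d(3, d_model, kernel_size=patch_size,
                                  stride=patch_size)
        layer = nn.TransformerEncoderLayer(d_model, num_heads,
                                           dim_feedforward=4 * d_model,
                                           batch_first=True)
        self.body = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, images):                       # (batch, 3, H, W)
        tokens = self.patchify(images)               # (batch, d_model, H/p, W/p)
        tokens = tokens.flatten(2).transpose(1, 2)   # (batch, num_patches, d_model)
        tokens = self.body(tokens)                   # positional embeddings omitted
        return self.head(tokens.mean(dim=1))         # mean-pool instead of a cls token

logits = TinyViT()(torch.rand(2, 3, 32, 32))
print(logits.shape)                                  # torch.Size([2, 10])
```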