10.6. The Encoder–Decoder Architecture


GitHub: https://github.com/pandalabme/d2l/tree/main/exercises

1. Suppose that we use neural networks to implement the encoder–decoder architecture. Do the encoder and the decoder have to be the same type of neural network?

No, the encoder and the decoder do not have to be the same type of neural network. Different architectures have different strengths and weaknesses for encoding and decoding. Recurrent neural networks (RNNs) capture sequential information and long-term dependencies, but they are prone to vanishing or exploding gradients and are hard to parallelize across time steps. Convolutional neural networks (CNNs) extract local features and parallelize well, but they can lose global context and handle variable-length sequences awkwardly. Transformers use self-attention to model long-range dependencies, but they typically require large amounts of data and memory. Depending on the problem domain and the data characteristics, we can therefore pair different types of networks for the encoder and the decoder to get the best performance. For example, in [4] the authors propose a hybrid encoder-decoder architecture that combines CNNs and RNNs for image captioning.
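As a concrete, purely illustrative sketch of mixing architectures, the snippet below pairs a tiny CNN encoder with a GRU decoder, in the style of image captioning. It is not the model from the cited paper; the layer sizes, `vocab_size`, and toy inputs are assumptions chosen only to make the example runnable.

```python
import torch
from torch import nn

class CNNEncoder(nn.Module):
    """Encoder: a small CNN that maps an image to a fixed-size state."""
    def __init__(self, hidden_size):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1))               # -> (batch, 16, 1, 1)
        self.proj = nn.Linear(16, hidden_size)     # match the decoder's state size

    def forward(self, images):                     # images: (batch, 3, H, W)
        feats = self.conv(images).flatten(1)       # (batch, 16)
        return self.proj(feats)                    # (batch, hidden_size)

class RNNDecoder(nn.Module):
    """Decoder: a GRU language model conditioned on the encoder's state."""
    def __init__(self, vocab_size, embed_size, hidden_size):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_size)
        self.rnn = nn.GRU(embed_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, tokens, state):              # tokens: (batch, seq_len)
        emb = self.embedding(tokens)               # (batch, seq_len, embed_size)
        # Use the encoder's output as the GRU's initial hidden state.
        output, _ = self.rnn(emb, state.unsqueeze(0))
        return self.out(output)                    # (batch, seq_len, vocab_size)

encoder = CNNEncoder(hidden_size=64)
decoder = RNNDecoder(vocab_size=1000, embed_size=32, hidden_size=64)
images = torch.randn(2, 3, 32, 32)                 # toy batch of images
captions = torch.randint(0, 1000, (2, 5))          # toy batch of caption tokens
logits = decoder(captions, encoder(images))
print(logits.shape)                                # torch.Size([2, 5, 1000])
```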

2. Besides machine translation, can you think of another application where the encoder–decoder architecture can be applied?

The encoder–decoder architecture applies to many tasks besides machine translation; a minimal sketch of the interface they all share appears after the list below. Examples include:

  • Text summarization: The encoder can encode a long text into a vector representation, and the decoder can generate a short summary based on the vector. For example, see [2] for a tutorial on how to use encoder-decoder models for text summarization.

  • Image captioning: The encoder can encode an image into a vector representation, and the decoder can generate a natural language description of the image based on the vector. For example, see [4] for a paper on how to use a hybrid encoder-decoder model for image captioning.

  • Speech recognition: The encoder can encode an audio signal into a vector representation, and the decoder can generate a transcript of the speech based on the vector. For example, see [5] for a paper on how to use an end-to-end encoder-decoder model for speech recognition.

  • Music generation: The encoder can encode a musical sequence into a vector representation, and the decoder can generate a new musical sequence based on the vector. For example, see [6] for a paper on how to use an encoder-decoder model with attention for music generation.

  • [1] Encoder-Decoder Models for Natural Language Processing. https://www.baeldung.com/cs/nlp-encoder-decoder-models.

  • [2] Encoder and Decoder: Types, Working & Their Applications. https://www.watelectronics.com/different-types-encoder-decoder-applications/.

  • [3] Encoder and Decoder | Basics & Examples - Electrical Academia. https://electricalacademia.com/digital-circuits/encoder-and-decoder/.
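
All of these applications share the same contract: an encoder maps the raw input (text, image, audio signal, or note sequence) to an intermediate state, and a decoder generates the target sequence conditioned on that state. Below is a simplified sketch of that shared interface, loosely modeled on the chapter's `Encoder`/`Decoder`/`EncoderDecoder` base classes but reduced to plain `nn.Module`s; only the concrete encoder and decoder subclasses change from task to task.

```python
from torch import nn

class Encoder(nn.Module):
    """Maps a raw input (sentence, image, audio, ...) to an intermediate state."""
    def forward(self, X, *args):
        raise NotImplementedError

class Decoder(nn.Module):
    """Generates the target sequence, conditioned on the encoder's output."""
    def init_state(self, enc_outputs, *args):
        raise NotImplementedError
    def forward(self, X, state):
        raise NotImplementedError

class EncoderDecoder(nn.Module):
    """Glue module: any of the tasks listed above fits this pattern once a
    concrete Encoder and Decoder are plugged in."""
    def __init__(self, encoder, decoder):
        super().__init__()
        self.encoder, self.decoder = encoder, decoder

    def forward(self, enc_X, dec_X, *args):
        enc_outputs = self.encoder(enc_X, *args)
        dec_state = self.decoder.init_state(enc_outputs, *args)
        return self.decoder(dec_X, dec_state)
```

For instance, a summarization model would subclass `Encoder` with an RNN or Transformer over the source article and `Decoder` with an autoregressive generator over summary tokens, without touching `EncoderDecoder` itself.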

Reference

  1. https://d2l.ai/chapter_recurrent-modern/encoder-decoder.html
