15.7. Word Similarity and Analogy

github: https://github.com/pandalabme/d2l/tree/main/exercises 1. Test the fastText results using TokenEmbedding('wiki.en'). import os import torch fro

narcissuskid, published on 2023-11-30

11.8. Transformers for Vision

github: https://github.com/pandalabme/d2l/tree/main/exercises 1. How does the value of img_size affect training time? The value of img_size affects th

narcissuskid, published on 2023-09-11

11.7. The Transformer Architecture

github: https://github.com/pandalabme/d2l/tree/main/exercises 1. Train a deeper Transformer in the experiments. How does it affect the training speed

narcissuskid, published on 2023-09-11

11.6. Self-Attention and Positional Encoding

github: https://github.com/pandalabme/d2l/tree/main/exercises 1. Suppose that we design a deep architecture to represent a sequence by stacking self-a

narcissuskid, published on 2023-09-11

11.5. Multi-Head Attention

github: https://github.com/pandalabme/d2l/tree/main/exercises 1. Visualize attention weights of multiple heads in this experiment. import sys import t

narcissuskid, published on 2023-09-10

11.4. The Bahdanau Attention Mechanism

github: https://github.com/pandalabme/d2l/tree/main/exercises 1. Replace GRU with LSTM in the experiment. import sys import torch.nn as nn import torc

narcissuskid, published on 2023-09-10

11.3. Attention Scoring Functions

github: https://github.com/pandalabme/d2l/tree/main/exercises 1. Implement distance-based attention by modifying the DotProductAttention code. Note th

narcissuskid, published on 2023-09-10

11.2. Attention Pooling by Similarity

github: https://github.com/pandalabme/d2l/tree/main/exercises 1. Parzen windows density estimates are given by \hat{p}(x)=\frac{1}{n}\sum_i k(x,x_i)

narcissuskid, published on 2023-09-09

11.1. Queries, Keys, and Values

github: https://github.com/pandalabme/d2l/tree/main/exercises import sys import torch.nn as nn import torch import warnings from sklearn.model_selecti

narcissuskid, published on 2023-09-09

10.7. Sequence-to-Sequence Learning for Machine Translation

github: https://github.com/pandalabme/d2l/tree/main/exercises import sys import torch.nn as nn import torch import warnings import numpy as np from sk

narcissuskid, published on 2023-09-07

10.8. Beam Search

github: https://github.com/pandalabme/d2l/tree/main/exercises 1. Can we treat exhaustive search as a special type of beam search? Why or why not? We c

narcissuskid, published on 2023-09-07