github: https://github.com/pandalabme/d2l/tree/main/exercises

1. Is it possible to fine-tune T5 using a minibatch consisting of different tasks? Why or why not?
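Yes: T5 casts every task into the same text-to-text format, so the training loop never needs to know which task an example belongs to; the task prefix in the input string is the only distinction. A minimal sketch, assuming the Hugging Face transformers library is available (the checkpoint name and the two-example `examples` list are illustrative):

```python
import torch
from transformers import T5TokenizerFast, T5ForConditionalGeneration

tokenizer = T5TokenizerFast.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# One minibatch mixing two tasks; the prefixes disambiguate them.
examples = [
    ("translate English to German: Hello, world!", "Hallo, Welt!"),
    ("summarize: The quick brown fox jumps over the lazy dog.", "A fox jumps over a dog."),
]
inputs = tokenizer([src for src, _ in examples], padding=True, return_tensors="pt")
labels = tokenizer([tgt for _, tgt in examples], padding=True, return_tensors="pt").input_ids
labels[labels == tokenizer.pad_token_id] = -100  # mask padding out of the loss

loss = model(**inputs, labels=labels).loss  # a single loss over both tasks
loss.backward()                             # one gradient step on the mixed batch
```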
1. How does the value of img_size affect training time?

The value of img_size affects training time substantially: with a fixed patch size, the number of patches (and hence the self-attention sequence length) grows quadratically with img_size, and the cost of self-attention grows quadratically with the sequence length, so larger images slow training down steeply.
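For intuition about that scaling, the sketch below just counts tokens and relative attention cost at a few image sizes (patch_size=16 and the specific sizes are assumptions for illustration):

```python
# Sequence length and relative self-attention cost as img_size grows,
# holding the patch size fixed (illustrative values).
patch_size = 16
for img_size in (96, 192, 384):
    num_patches = (img_size // patch_size) ** 2  # tokens per image
    seq_len = num_patches + 1                    # +1 for the <cls> token
    rel_cost = seq_len ** 2                      # self-attention is O(seq_len^2)
    print(f"img_size={img_size}: seq_len={seq_len}, relative attention cost={rel_cost}")
```

Doubling img_size quadruples the sequence length and multiplies the attention cost by roughly sixteen, which is why training time rises so steeply.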
1. Train a deeper Transformer in the experiments. How does it affect the training speed and the translation performance?
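A minimal timing sketch, using PyTorch's built-in encoder layers instead of the book's Transformer class so it stays self-contained (all hyperparameters are illustrative): each added block contributes roughly the same amount of compute, so per-step time grows about linearly with depth, while translation quality may improve only up to the point where the small dataset starts to overfit.

```python
import time
import torch
import torch.nn as nn

x = torch.randn(64, 10, 32)  # (batch_size, num_steps, d_model)
for num_layers in (2, 4, 8):
    layer = nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True)
    encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
    start = time.time()
    with torch.no_grad():
        encoder(x)
    print(f"{num_layers} layers: {time.time() - start:.4f}s per forward pass")
```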
1. Suppose that we design a deep architecture to represent a sequence by stacking self-attention layers with positional encoding. What could the possible issues be?
1. Visualize attention weights of multiple heads in this experiment.

```python
import sys
import t…
```
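A hedged sketch of the visualization itself: take the attention-weight tensor of shape (num_heads, num_queries, num_keys) after a forward pass of the multi-head attention module and draw one heatmap per head. Here random softmax weights stand in for the experiment's real `attention_weights`:

```python
import matplotlib.pyplot as plt
import torch

num_heads, num_queries, num_keys = 4, 6, 10
# Stand-in for the weights recorded during the experiment's forward pass.
attention_weights = torch.softmax(torch.randn(num_heads, num_queries, num_keys), dim=-1)

fig, axes = plt.subplots(1, num_heads, figsize=(3 * num_heads, 3))
for head, ax in enumerate(axes):
    im = ax.imshow(attention_weights[head].numpy(), cmap="Reds", vmin=0, vmax=1)
    ax.set_title(f"Head {head}")
    ax.set_xlabel("Keys")
    ax.set_ylabel("Queries")
fig.colorbar(im, ax=list(axes), shrink=0.8)
plt.show()
```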
1. Replace GRU with LSTM in the experiment.

```python
import sys
import torch.nn as nn
import torch
# …
```
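The substantive change is the recurrent state: nn.LSTM returns and expects a (hidden, cell) tuple where nn.GRU uses a single hidden tensor, so any Seq2Seq code that threads the encoder's final state into the decoder must carry both tensors. A minimal sketch with illustrative shapes:

```python
import torch
import torch.nn as nn

embed_size, num_hiddens, num_layers = 8, 16, 2
x = torch.randn(7, 4, embed_size)  # (num_steps, batch_size, embed_size)

gru = nn.GRU(embed_size, num_hiddens, num_layers)
out_gru, state = gru(x)        # state: a single tensor (num_layers, batch, hiddens)

lstm = nn.LSTM(embed_size, num_hiddens, num_layers)
out_lstm, (h, c) = lstm(x)     # state: a (hidden, cell) tuple of that shape
print(out_gru.shape, out_lstm.shape)  # identical output shapes
```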
1. Implement distance-based attention by modifying the DotProductAttention code. Note that you only need the squared norms of the keys, $\|\mathbf{k}_i\|^2$, for this.
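Why only the key norms are needed: $-\tfrac{1}{2}\|\mathbf{q}-\mathbf{k}\|^2 = \mathbf{q}^\top\mathbf{k} - \tfrac{1}{2}\|\mathbf{k}\|^2 - \tfrac{1}{2}\|\mathbf{q}\|^2$, and the $\|\mathbf{q}\|^2$ term is identical for every key, so the softmax cancels it. A sketch under that observation (the class and argument names are mine, not the book's):

```python
import torch
import torch.nn as nn

class DistanceBasedAttention(nn.Module):
    """DotProductAttention variant scoring with -||q - k||^2 / 2."""
    def __init__(self, dropout=0.0):
        super().__init__()
        self.dropout = nn.Dropout(dropout)

    def forward(self, queries, keys, values):
        # queries: (batch, num_q, d); keys: (batch, num_kv, d); values: (batch, num_kv, v)
        scores = torch.bmm(queries, keys.transpose(1, 2))  # q·k term
        scores -= (keys ** 2).sum(-1).unsqueeze(1) / 2     # -||k||^2 / 2 per key
        # -||q||^2 / 2 is constant across keys, so softmax makes it irrelevant.
        attention_weights = torch.softmax(scores, dim=-1)
        return torch.bmm(self.dropout(attention_weights), values)

attn = DistanceBasedAttention()
q, k, v = torch.randn(2, 3, 4), torch.randn(2, 5, 4), torch.randn(2, 5, 6)
print(attn(q, k, v).shape)  # torch.Size([2, 3, 6])
```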
1. Parzen windows density estimates are given by $\hat{p}(x)=\frac{1}{n}\sum_i k(x,x_i)$. Prove that for binary classification the function $\hat{p}(x, y=1) - \hat{p}(x, y=-1)$, as obtained by Parzen windows, is equivalent to Nadaraya–Watson classification.
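A sketch of the argument, assuming labels $y_i \in \{-1, +1\}$ and that both class estimates are built from the same kernel over the pooled sample:

$$
\hat{p}(x, y{=}1) - \hat{p}(x, y{=}{-}1)
  = \frac{1}{n}\sum_{i:\,y_i=1} k(x, x_i) - \frac{1}{n}\sum_{i:\,y_i=-1} k(x, x_i)
  = \frac{1}{n}\sum_{i=1}^{n} y_i\, k(x, x_i),
\qquad
f_{\mathrm{NW}}(x) = \frac{\sum_{i} y_i\, k(x, x_i)}{\sum_{j} k(x, x_j)}.
$$

Since the denominator of $f_{\mathrm{NW}}$ is positive for a nonnegative kernel, the two expressions have the same sign everywhere, so thresholding either at zero yields the same classifier.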
```python
import sys
import torch.nn as nn
import torch
import warnings
from sklearn.model_selection import …
```

```python
import sys
import torch.nn as nn
import torch
import warnings
import numpy as np
from sk…
```