Different Perspectives of Linear Regression

1. Linear regression 1.1 Ordinary Least Squares (OLS) Perspective 1.1.1 Model Representation: The linear regression model is represented as: \hat{Y} = X

narcissuskid, published on 2024-03-18

11.6. Self-Attention and Positional Encoding

github: https://github.com/pandalabme/d2l/tree/main/exercises 1. Suppose that we design a deep architecture to represent a sequence by stacking self-a

narcissuskid, published on 2023-09-11

11.5. Multi-Head Attention

github: https://github.com/pandalabme/d2l/tree/main/exercises 1. Visualize attention weights of multiple heads in this experiment. import sys import t

narcissuskid, published on 2023-09-10

11.4. The Bahdanau Attention Mechanism

github: https://github.com/pandalabme/d2l/tree/main/exercises 1. Replace GRU with LSTM in the experiment. import sys import torch.nn as nn import torc

narcissuskid, published on 2023-09-10

11.3. Attention Scoring Functions

github: https://github.com/pandalabme/d2l/tree/main/exercises 1. Implement distance-based attention by modifying the DotProductAttention code. Note th

narcissuskid, published on 2023-09-10

11.2. Attention Pooling by Similarity

github: https://github.com/pandalabme/d2l/tree/main/exercises 1. Parzen windows density estimates are given by \hat{p}(x) = \frac{1}{n}\sum_i k(x, x_i)

narcissuskid, published on 2023-09-09

11.1. Queries, Keys, and Values

github: https://github.com/pandalabme/d2l/tree/main/exercises import sys import torch.nn as nn import torch import warnings from sklearn.model_selecti

narcissuskid, published on 2023-09-09

10.7. Sequence-to-Sequence Learning for Machine Translation

github: https://github.com/pandalabme/d2l/tree/main/exercises import sys import torch.nn as nn import torch import warnings import numpy as np from sk

narcissuskid, published on 2023-09-07

10.8. Beam Search

github: https://github.com/pandalabme/d2l/tree/main/exercises 1. Can we treat exhaustive search as a special type of beam search? Why or why not? We c

narcissuskid, published on 2023-09-07

10.6. The Encoder–Decoder Architecture

github: https://github.com/pandalabme/d2l/tree/main/exercises 1. Suppose that we use neural networks to implement the encoder–decoder architecture. Do

narcissuskid, published on 2023-09-06

10.5. Machine Translation and the Dataset

github: https://github.com/pandalabme/d2l/tree/main/exercises 1. Try different values of the max_examples argument in the _tokenize method. How does t

narcissuskid, published on 2023-09-06