PyTorch for Beginners Series: Torch.nn API, Transformer Layers (9)
![](https://img-blog.csdnimg.cn/9cd5b88591024cc38c044fc02a88ec9c.png)
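| Method | Description |
| --- | --- |
| nn.Transformer | A Transformer model. |
| nn.TransformerEncoder | TransformerEncoder is a stack of N encoder layers. |
| nn.TransformerDecoder | TransformerDecoder is a stack of N decoder layers. |
| nn.TransformerEncoderLayer | TransformerEncoderLayer is made up of a self-attention network and a feedforward network. |
| nn.TransformerDecoderLayer | TransformerDecoderLayer is made up of a self-attention network, a multi-head attention network, and a feedforward network. |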
Reference for a more detailed walkthrough: https://blog.csdn.net/qq_43645301/article/details/109279616
nn.Transformer
![](https://img-blog.csdnimg.cn/405fdc6cd05e4127a258c97424b4fa16.png)
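>>> import torch
>>> import torch.nn as nn
>>> transformer_model = nn.Transformer(nhead=16, num_encoder_layers=12)
>>> src = torch.rand((10, 32, 512))  # (source length, batch, d_model)
>>> tgt = torch.rand((20, 32, 512))  # (target length, batch, d_model)
>>> out = transformer_model(src, tgt)
>>> # Optional masks: an all-zero float mask hides nothing; the causal mask hides future target positions
>>> src_mask = torch.zeros(10, 10)
>>> tgt_mask = transformer_model.generate_square_subsequent_mask(20)
>>> output = transformer_model(src, tgt, src_mask=src_mask, tgt_mask=tgt_mask)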
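The masked call above may be easier to follow with the mask printed out. The following is a minimal sketch, assuming small illustrative sizes (`d_model=64`, two layers per stack) that are not from the original post; it shows what the causal target mask looks like and that the output takes the shape of `tgt`.

```python
import torch
import torch.nn as nn

# Small, illustrative sizes (assumptions) so the example runs quickly
d_model, nhead = 64, 4
src_len, tgt_len, batch = 10, 4, 2

model = nn.Transformer(
    d_model=d_model,
    nhead=nhead,
    num_encoder_layers=2,
    num_decoder_layers=2,
    dim_feedforward=128,
)

# Default layout is (sequence, batch, feature) because batch_first=False
src = torch.rand(src_len, batch, d_model)
tgt = torch.rand(tgt_len, batch, d_model)

# Causal mask: -inf above the diagonal (blocked), 0 on and below it (allowed)
tgt_mask = model.generate_square_subsequent_mask(tgt_len)
print(tgt_mask)

out = model(src, tgt, tgt_mask=tgt_mask)
print(out.shape)  # torch.Size([4, 2, 64])
```

The output length follows the target sequence, not the source, because the decoder produces one vector per target position.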
nn.TransformerEncoder
TransformerEncoder is a stack of N encoder layers. With the corresponding parameters, users can build a BERT (https://arxiv.org/abs/1810.04805) model.
![](https://img-blog.csdnimg.cn/1eeaef701adf4175a07e269c6f1293de.png)
>>> encoder_layer = nn.TransformerEncoderLayer(d_model=512, nhead=8)
>>> transformer_encoder = nn.TransformerEncoder(encoder_layer, num_layers=6)
>>> src = torch.rand(10, 32, 512)
>>> out = transformer_encoder(src)
![](https://img-blog.csdnimg.cn/3aa709ca0ed245c98e6a0e536d48ce27.png)
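As a rough illustration of the BERT remark in this section, the sketch below wraps the encoder stack in a helper whose defaults match the BERT-base sizes (12 layers, hidden size 768, 12 heads, feedforward size 3072, GELU activation). The helper name `build_bert_like_encoder` is hypothetical, and only the encoder stack is covered: embeddings and pretraining heads are left out. A scaled-down instance is built so that it runs quickly.

```python
import torch
import torch.nn as nn


def build_bert_like_encoder(d_model=768, nhead=12, num_layers=12, dim_feedforward=3072):
    """Hypothetical helper: an encoder stack with BERT-base-sized defaults."""
    layer = nn.TransformerEncoderLayer(
        d_model=d_model,
        nhead=nhead,
        dim_feedforward=dim_feedforward,
        activation="gelu",  # BERT uses GELU rather than the default ReLU
        batch_first=True,   # inputs are (batch, sequence, feature)
    )
    # TransformerEncoder deep-copies the layer num_layers times
    return nn.TransformerEncoder(layer, num_layers=num_layers)


# Scaled-down instance (assumed sizes) so the example is cheap to run
encoder = build_bert_like_encoder(d_model=64, nhead=4, num_layers=2, dim_feedforward=256)

x = torch.rand(3, 16, 64)   # (batch, sequence, feature)
out = encoder(x)
print(out.shape)            # torch.Size([3, 16, 64])
print(len(encoder.layers))  # 2
```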
nn.TransformerDecoder
![](https://img-blog.csdnimg.cn/da8c27a5f35c437cbff9b7420e5050c6.png)
>>> decoder_layer = nn.TransformerDecoderLayer(d_model=512, nhead=8)
>>> transformer_decoder = nn.TransformerDecoder(decoder_layer, num_layers=6)
>>> memory = torch.rand(10, 32, 512)
>>> tgt = torch.rand(20, 32, 512)
>>> out = transformer_decoder(tgt, memory)
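In practice the decoder is usually called with a causal `tgt_mask`. The sketch below, with small assumed sizes, builds such a mask by hand and checks its effect: after changing only the last target position, the outputs at all earlier positions stay the same.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

decoder_layer = nn.TransformerDecoderLayer(d_model=32, nhead=4, dim_feedforward=64)
decoder = nn.TransformerDecoder(decoder_layer, num_layers=2)
decoder.eval()  # turn off dropout so two forward passes are comparable

memory = torch.rand(10, 2, 32)  # encoder output: (source length, batch, feature)
tgt = torch.rand(6, 2, 32)      # decoder input:  (target length, batch, feature)

# Float mask with -inf above the diagonal: position i may attend to positions <= i
causal_mask = torch.triu(torch.full((6, 6), float("-inf")), diagonal=1)

with torch.no_grad():
    out = decoder(tgt, memory, tgt_mask=causal_mask)

    # Replace only the last target position and run again
    tgt_changed = tgt.clone()
    tgt_changed[-1] = torch.rand(2, 32)
    out_changed = decoder(tgt_changed, memory, tgt_mask=causal_mask)

print(out.shape)  # torch.Size([6, 2, 32])

# Earlier positions cannot see the changed position, so their outputs match
earlier_unchanged = torch.allclose(out[:-1], out_changed[:-1], atol=1e-5)
print(earlier_unchanged)
```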
nn.TransformerEncoderLayer
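TransformerEncoderLayer is made up of a self-attention network and a feedforward network. This standard encoder layer is based on the paper "Attention Is All You Need"; users may modify it or implement it differently in their own applications.
![](https://img-blog.csdnimg.cn/9e65ee52099d4bbcbfa3ad0d6b0c5bcd.png)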
>>> # Default layout: (sequence, batch, feature)
>>> encoder_layer = nn.TransformerEncoderLayer(d_model=512, nhead=8)
>>> src = torch.rand(10, 32, 512)
>>> out = encoder_layer(src)
>>> # Alternatively, with batch_first=True: (batch, sequence, feature)
>>> encoder_layer = nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)
>>> src = torch.rand(32, 10, 512)
>>> out = encoder_layer(src)
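To make the "self-attention plus feedforward" description concrete, the following sketch inspects the submodules of a layer built with the same arguments as the example above. The batch size of 4 is an arbitrary choice for illustration.

```python
import torch
import torch.nn as nn

layer = nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)

# Self-attention block
attn_type = type(layer.self_attn).__name__
print(attn_type)  # MultiheadAttention

# Feedforward block: d_model -> dim_feedforward (default 2048) -> d_model
ffn_shape = (
    layer.linear1.in_features,
    layer.linear1.out_features,
    layer.linear2.out_features,
)
print(ffn_shape)  # (512, 2048, 512)

src = torch.rand(4, 10, 512)  # (batch, sequence, feature)
out = layer(src)
print(out.shape)  # torch.Size([4, 10, 512])
```

The layer keeps the input shape, which is what allows several of them to be stacked in `nn.TransformerEncoder`.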
nn.TransformerDecoderLayer
![](https://img-blog.csdnimg.cn/cc7538b75d964752a60402e29cee85a8.png)
>>> decoder_layer = nn.TransformerDecoderLayer(d_model=512, nhead=8)
>>> memory = torch.rand(10, 32, 512)
>>> tgt = torch.rand(20, 32, 512)
>>> out = decoder_layer(tgt, memory)
>>> # Alternatively, with batch_first=True: (batch, sequence, feature)
>>> decoder_layer = nn.TransformerDecoderLayer(d_model=512, nhead=8, batch_first=True)
>>> memory = torch.rand(32, 10, 512)
>>> tgt = torch.rand(32, 20, 512)
>>> out = decoder_layer(tgt, memory)
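The two examples above differ only in tensor layout. As a minimal sketch with small assumed sizes, the code below copies the weights from a sequence-first layer into a batch-first one and checks that both give the same result once the inputs are transposed. It also lists the three sub-blocks of the layer.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

seq_first = nn.TransformerDecoderLayer(d_model=32, nhead=4, dim_feedforward=64)
batch_first = nn.TransformerDecoderLayer(
    d_model=32, nhead=4, dim_feedforward=64, batch_first=True
)
batch_first.load_state_dict(seq_first.state_dict())  # use identical weights
seq_first.eval()    # disable dropout so outputs are comparable
batch_first.eval()

# Sub-blocks: self-attention, attention over the encoder memory, feedforward
block_types = [
    type(seq_first.self_attn).__name__,
    type(seq_first.multihead_attn).__name__,
    type(seq_first.linear1).__name__,
]
print(block_types)

memory = torch.rand(10, 3, 32)  # (source length, batch, feature)
tgt = torch.rand(20, 3, 32)     # (target length, batch, feature)

with torch.no_grad():
    out_a = seq_first(tgt, memory)
    out_b = batch_first(tgt.transpose(0, 1), memory.transpose(0, 1))

print(out_a.shape, out_b.shape)
same = torch.allclose(out_a, out_b.transpose(0, 1), atol=1e-5)
print(same)
```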