
Notes on updating only part of a model's parameters (freezing parameters) in PyTorch


The experiments below were run on PyTorch 1.2.0.

During training you may want to keep part of a model's parameters fixed and update only the rest. There are two ways to do this: set requires_grad = False on the layers that should not be updated, or pass only the parameters you want to update when constructing the optimizer. The optimal approach combines both: freeze the unwanted parameters and pass only the requires_grad=True parameters to the optimizer, which uses less memory and is more efficient.

1. Setting requires_grad to False

import torch
import torch.nn as nn
import torch.optim as optim

# Define a simple network
class net(nn.Module):
    def __init__(self, num_class=10):
        super(net, self).__init__()
        self.fc1 = nn.Linear(8, 4)
        self.fc2 = nn.Linear(4, num_class)

    def forward(self, x):
        return self.fc2(self.fc1(x))

model = net()

# Freeze the parameters of the fc1 layer
for name, param in model.named_parameters():
    if "fc1" in name:
        param.requires_grad = False

loss_fn = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=1e-2)  # all of the model's parameters are passed in

print("model.fc1.weight", model.fc1.weight)
print("model.fc2.weight", model.fc2.weight)

for epoch in range(10):
    x = torch.randn((3, 8))
    label = torch.randint(0, 10, [3]).long()
    output = model(x)

    loss = loss_fn(output, label)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print("model.fc1.weight", model.fc1.weight)
print("model.fc2.weight", model.fc2.weight)

The output shows that once requires_grad=False is set, only the requires_grad=True parameters are updated, even though all of the model's parameters were passed to the optimizer: fc1's weights are identical before and after the ten steps, while fc2's have changed.
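A quick way to see why this is safe (a minimal check continuing from the code above, not part of the original): a frozen parameter never receives a gradient, and SGD's step() simply skips parameters whose .grad is None.

# Continuing from the loop above: fc1 was frozen before any backward pass,
# so autograd never produced a gradient for it.
print(model.fc1.weight.grad)  # None -- optimizer.step() skips grad-less parameters
print(model.fc2.weight.grad)  # a gradient tensor from the last backward()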

2. Passing only the parameters to be updated

# Define a simple network
class net(nn.Module):
    def __init__(self, num_class=3):
        super(net, self).__init__()
        self.fc1 = nn.Linear(8, 4)
        self.fc2 = nn.Linear(4, num_class)

    def forward(self, x):
        return self.fc2(self.fc1(x))

model = net()

# Freezing fc1 is intentionally left commented out in this experiment
# for name, param in model.named_parameters():
#     if "fc1" in name:
#         param.requires_grad = False

loss_fn = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.fc2.parameters(), lr=1e-2)  # only fc2's parameters are passed in

print("model.fc1.weight", model.fc1.weight)
print("model.fc2.weight", model.fc2.weight)

for epoch in range(10):
    x = torch.randn((3, 8))
    label = torch.randint(0, 3, [3]).long()
    output = model(x)

    loss = loss_fn(output, label)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print("model.fc1.weight", model.fc1.weight)
print("model.fc2.weight", model.fc2.weight)
print()

This shows that only the parameters passed to the optimizer are updated. The parameters that were not passed in still have requires_grad=True, so gradients are computed for them, but they are never updated.
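Two side effects are worth noting here (a minimal check continuing from the code above, not part of the original): gradients for fc1 are computed and stored even though they are never applied, and because optimizer.zero_grad() only clears the parameters registered with the optimizer, fc1's gradients keep accumulating across epochs.

# Continuing from the loop above: fc1 still has requires_grad=True,
# so backward() produced a gradient for it on every epoch...
print(model.fc1.weight.grad)  # a gradient tensor -- computed but never applied
# ...and it was never cleared, since optimizer.zero_grad() only touches
# the parameters inside the optimizer (here, fc2's).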

3. The optimal approach

Combine the two: set requires_grad=False on the parameters that should not be updated, and also leave them out of the optimizer.

# Define a simple network
class net(nn.Module):
    def __init__(self, num_class=3):
        super(net, self).__init__()
        self.fc1 = nn.Linear(8, 4)
        self.fc2 = nn.Linear(4, num_class)

    def forward(self, x):
        return self.fc2(self.fc1(x))

model = net()

# Freeze the parameters of the fc1 layer
for name, param in model.named_parameters():
    if "fc1" in name:
        param.requires_grad = False

loss_fn = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.fc2.parameters(), lr=1e-2)  # only fc2's parameters are passed in

print("model.fc1.weight", model.fc1.weight)
print("model.fc2.weight", model.fc2.weight)

for epoch in range(10):
    x = torch.randn((3, 8))
    label = torch.randint(0, 3, [3]).long()
    output = model(x)

    loss = loss_fn(output, label)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print("model.fc1.weight", model.fc1.weight)
print("model.fc2.weight", model.fc2.weight)
print()
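When the frozen layers are not conveniently reachable by attribute name like model.fc2, a common equivalent (a sketch, not from the original article) is to filter the parameter list by requires_grad after freezing:

# Sketch of a generic variant: after setting requires_grad=False on the
# frozen parameters, pass exactly the trainable ones to the optimizer.
optimizer = optim.SGD(
    filter(lambda p: p.requires_grad, model.parameters()),
    lr=1e-2,
)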

 


