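Flower Recognition

Training an image classification model on a classic network architecture

The model is built in three parts:

  1. Data preprocessing:
  • Data augmentation: the transforms module in torchvision provides this out of the box, and it is quite practical
  • Data preprocessing: torchvision's transforms also implements this for us, so we just call it
  • The DataLoader module reads the data batch by batch
  2. Network setup:
  • Load a pretrained model. torchvision ships many classic network architectures that are easy to call, and we can continue training from weights that others have already trained, which is what transfer learning means
  • Note that the task those weights were trained on is not identical to ours, so the final head layer, usually the last fully connected layer, has to be replaced to fit our own task
  • We can train all layers, or train only the final layer for our task, since the earlier layers perform feature extraction and their goal is essentially the same across tasks
  3. Saving and testing the model:
  • Saving can be selective, for example saving only when the current result on the validation set is the best so far
  • Load the saved model for actual testing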

First, let's look at how the data is stored:

image-20220720100728511

In the train folder:

image-20220720100750899

image-20220720100830155

In the valid folder:

image-20220720100808296

image-20220720100848997

Data preprocessing

Importing modules

# Path handling
import os
# Image display
import matplotlib.pyplot as plt
# Jupyter magic; remove this line when running as a plain script
%matplotlib inline
# Data processing
import numpy as np
import torch
from torch import nn
import torch.optim as optim
import torchvision
from torchvision import transforms, models, datasets

import imageio
import time
import warnings
import random
import sys
import copy
import json
from PIL import Image

Setting the paths

data_dir = "./flower_data/"
train_dir = os.path.join(data_dir, "train")
valid_dir = os.path.join(data_dir, "valid")

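Building the data source

  • data_transforms specifies all the image preprocessing operations
  • ImageFolder assumes the files are organized by folder: each folder holds images of a single class, and the folder name is the class name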
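
The folder-per-class layout described above can be checked before training. The following is a minimal sketch using only the standard library; the helper name count_images_per_class and the throwaway file names are assumptions made for illustration and are not part of the original post.

```python
from pathlib import Path
import tempfile

def count_images_per_class(root):
    """Return {class_folder_name: number_of_files} for a folder-per-class layout."""
    root = Path(root)
    return {
        d.name: sum(1 for f in d.iterdir() if f.is_file())
        for d in sorted(root.iterdir())
        if d.is_dir()
    }

# Build a tiny throwaway layout to show the expected structure (hypothetical file names)
with tempfile.TemporaryDirectory() as tmp:
    for cls, n in (("1", 2), ("10", 3)):
        class_dir = Path(tmp) / "train" / cls
        class_dir.mkdir(parents=True)
        for i in range(n):
            (class_dir / "image_{}.jpg".format(i)).touch()
    counts = count_images_per_class(Path(tmp) / "train")
    print(counts)  # {'1': 2, '10': 3}
```

Pointing the same helper at the real train and valid folders would give a quick per-class image count.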

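Next we process the data. The training set goes through data augmentation first: each image passes through a series of random transforms, so a single image yields many different variants.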

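# Data augmentation: pass each image through a series of transforms to produce varied images
data_transforms = {
    # Training set
    "train": transforms.Compose([
        # Random rotation by an angle chosen between -45 and 45 degrees
        transforms.RandomRotation(45),
        # Crop from the center; networks such as VGG and ResNet expect 224 x 224 input
        transforms.CenterCrop(224),
        # Random horizontal flip: this step is applied with 50% probability
        transforms.RandomHorizontalFlip(p=0.5),
        # Random vertical flip, also applied with the given probability
        transforms.RandomVerticalFlip(p=0.5),
        # Arguments: brightness, contrast, saturation, hue
        transforms.ColorJitter(brightness=0.2, contrast=0.1, saturation=0.1, hue=0.1),
        # Convert to grayscale with probability p; the output keeps 3 channels (R = G = B)
        transforms.RandomGrayscale(p=0.025),
        # Convert the image to a tensor
        transforms.ToTensor(),
        # Per-channel mean and standard deviation
        # Computed as: data = (data - mean) / std
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    # Validation set
    # No augmentation is applied to the validation set
    "valid": transforms.Compose([
        transforms.Resize(256),
        # Crop from the center
        transforms.CenterCrop(224),
        # Convert the image to a tensor
        transforms.ToTensor(),
        # Normalize with the same mean and standard deviation as the training set
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}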
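
To make the Normalize step concrete, here is a small worked example of the per-channel arithmetic on a single pixel, written in plain Python rather than with torchvision. The helper name normalize_pixel is hypothetical; only the mean and standard deviation values come from the code above.

```python
# Per-channel statistics used in the transforms above
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb, mean=MEAN, std=STD):
    """Apply (value - mean) / std to each channel of one RGB pixel in [0, 1]."""
    return [(v - m) / s for v, m, s in zip(rgb, mean, std)]

# A pixel equal to the channel means maps to zero in every channel
print(normalize_pixel([0.485, 0.456, 0.406]))  # [0.0, 0.0, 0.0]
# A pure white pixel maps to values above 2 in every channel
print(normalize_pixel([1.0, 1.0, 1.0]))
```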

Loading the data

# Set batch_size; DataLoader then reads the data one batch at a time
batch_size = 8

# Load the data with ImageFolder
"""
ImageFolder assumes all files are organized by folder, with each folder holding images of one class and the folder name serving as the class name. Its constructor is:
ImageFolder(root, transform=None, target_transform=None, loader=default_loader)

It has four main parameters:

root: the path under which to look for images
transform: the transform applied to the PIL Image; its input is the object returned by loader
target_transform: the transform applied to the label
loader: how to read an image given its path; by default it is read as an RGB PIL Image

"""
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x), data_transforms[x]) for x in ["train", "valid"]}

# Use DataLoader to split the data into batches
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=batch_size, shuffle=True) for x in ["train", "valid"]}

# Number of samples in each dataset
dataset_sizes = {x: len(image_datasets[x]) for x in ["train", "valid"]}

# Class names, taken from the folder names of the training set
class_names = image_datasets["train"].classes
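
As a rough illustration of what batching means for this dataset, the sketch below computes how many batches one pass over the data yields. It uses the dataset sizes reported in this post (6552 training and 818 validation images) and assumes the DataLoader keeps the final, smaller batch, which is its default behavior. The helper name num_batches is hypothetical.

```python
import math

def num_batches(num_samples, batch_size, drop_last=False):
    """Number of batches one pass over the data yields."""
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)

print(num_batches(6552, 8))  # training set: 819 batches
print(num_batches(818, 8))   # validation set: 103 batches, the last one holds 2 images
```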

Let's look at the results after processing

  • image_datasets:
{'train': Dataset ImageFolder
Number of datapoints: 6552
Root location: ./flower_data/train
StandardTransform
Transform: Compose(
RandomRotation(degrees=[-45.0, 45.0], interpolation=nearest, expand=False, fill=0)
CenterCrop(size=(224, 224))
RandomHorizontalFlip(p=0.5)
RandomVerticalFlip(p=0.5)
ColorJitter(brightness=[0.8, 1.2], contrast=[0.9, 1.1], saturation=[0.9, 1.1], hue=[-0.1, 0.1])
RandomGrayscale(p=0.025)
ToTensor()
Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
),
'valid': Dataset ImageFolder
Number of datapoints: 818
Root location: ./flower_data/valid
StandardTransform
Transform: Compose(
Resize(size=256, interpolation=bilinear, max_size=None, antialias=None)
CenterCrop(size=(224, 224))
ToTensor()
Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
)}
  • dataloaders:
{'train': <torch.utils.data.dataloader.DataLoader at 0x1706fadfdf0>,
'valid': <torch.utils.data.dataloader.DataLoader at 0x1706fadffa0>}
  • dataset_sizes:
{'train': 6552, 'valid': 818}
  • class_names: the name of each folder, treated as the name of the class
['1',
'10',
'100',
'101',
'102',
'11',
'12',
'13',
'14',
'15',
'16',
'17',
'18',
'19',
'2',
'20',
'21',
'22',
'23',
'24',
'25',
'26',
'27',
'28',
'29',
'3',
'30',
'31',
'32',
'33',
'34',
'35',
'36',
'37',
'38',
'39',
'4',
'40',
'41',
'42',
'43',
'44',
'45',
'46',
'47',
'48',
'49',
'5',
'50',
'51',
'52',
'53',
'54',
'55',
'56',
'57',
'58',
'59',
'6',
'60',
'61',
'62',
'63',
'64',
'65',
'66',
'67',
'68',
'69',
'7',
'70',
'71',
'72',
'73',
'74',
'75',
'76',
'77',
'78',
'79',
'8',
'80',
'81',
'82',
'83',
'84',
'85',
'86',
'87',
'88',
'89',
'9',
'90',
'91',
'92',
'93',
'94',
'95',
'96',
'97',
'98',
'99']
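
The order of the list above may look odd: '10' and '100' come before '2'. ImageFolder sorts the folder names as strings, so they are compared character by character. The short sketch below reproduces that ordering in plain Python; it assumes the folders are named '1' through '102', as in the output above.

```python
# Folder names for the 102 classes, as strings
folder_names = [str(i) for i in range(1, 103)]

# String sorting compares character by character, so "10" comes before "2"
class_names = sorted(folder_names)

print(class_names[:6])         # ['1', '10', '100', '101', '102', '11']
print(class_names.index("2"))  # 14: folder "2" is the 15th class, not the 2nd
```

The consequence is that the numeric label produced by the DataLoader is an index into this sorted list, not the folder number itself.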

Classifying by these folder names alone is inconvenient, so we map each one to its flower name:

with open("flower_data/cat_to_name.json", "r") as f:
    cat_to_name = json.load(f)
# Print the mapping
print(cat_to_name)
{'21': 'fire lily',
'3': 'canterbury bells',
'45': 'bolero deep blue',
'1': 'pink primrose',
'34': 'mexican aster',
'27': 'prince of wales feathers',
'7': 'moon orchid',
'16': 'globe-flower',
'25': 'grape hyacinth',
'26': 'corn poppy',
'79': 'toad lily',
'39': 'siam tulip',
'24': 'red ginger',
'67': 'spring crocus',
'35': 'alpine sea holly',
'32': 'garden phlox',
'10': 'globe thistle',
'6': 'tiger lily',
'93': 'ball moss',
'33': 'love in the mist',
'9': 'monkshood',
'102': 'blackberry lily',
'14': 'spear thistle',
'19': 'balloon flower',
'100': 'blanket flower',
'13': 'king protea',
'49': 'oxeye daisy',
'15': 'yellow iris',
'61': 'cautleya spicata',
'31': 'carnation',
'64': 'silverbush',
'68': 'bearded iris',
'63': 'black-eyed susan',
'69': 'windflower',
'62': 'japanese anemone',
'20': 'giant white arum lily',
'38': 'great masterwort',
'4': 'sweet pea',
'86': 'tree mallow',
'101': 'trumpet creeper',
'42': 'daffodil',
'22': 'pincushion flower',
'2': 'hard-leaved pocket orchid',
'54': 'sunflower',
'66': 'osteospermum',
'70': 'tree poppy',
'85': 'desert-rose',
'99': 'bromelia',
'87': 'magnolia',
'5': 'english marigold',
'92': 'bee balm',
'28': 'stemless gentian',
'97': 'mallow',
'57': 'gaura',
'40': 'lenten rose',
'47': 'marigold',
'59': 'orange dahlia',
'48': 'buttercup',
'55': 'pelargonium',
'36': 'ruby-lipped cattleya',
'91': 'hippeastrum',
'29': 'artichoke',
'71': 'gazania',
'90': 'canna lily',
'18': 'peruvian lily',
'98': 'mexican petunia',
'8': 'bird of paradise',
'30': 'sweet william',
'17': 'purple coneflower',
'52': 'wild pansy',
'84': 'columbine',
'12': "colt's foot",
'11': 'snapdragon',
'96': 'camellia',
'23': 'fritillary',
'50': 'common dandelion',
'44': 'poinsettia',
'53': 'primula',
'72': 'azalea',
'65': 'californian poppy',
'80': 'anthurium',
'76': 'morning glory',
'37': 'cape flower',
'56': 'bishop of llandaff',
'60': 'pink-yellow dahlia',
'82': 'clematis',
'58': 'geranium',
'75': 'thorn apple',
'41': 'barbeton daisy',
'95': 'bougainvillea',
'43': 'sword lily',
'83': 'hibiscus',
'78': 'lotus lotus',
'88': 'cyclamen',
'94': 'foxglove',
'81': 'frangipani',
'74': 'rose',
'89': 'watercress',
'73': 'water lily',
'46': 'wallflower',
'77': 'passion flower',
'51': 'petunia'}

Now every class has a corresponding name.
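
Putting the two structures together, a numeric label is resolved in two steps: label index to folder name, then folder name to flower name. The sketch below shows this with a three-entry excerpt of the data printed above; the helper name label_to_flower is hypothetical.

```python
# A small excerpt of the two structures shown above
class_names = ["1", "10", "100"]  # first entries of the sorted class list
cat_to_name = {"1": "pink primrose", "10": "globe thistle", "100": "blanket flower"}

def label_to_flower(label_index):
    """Map a numeric label from the DataLoader to a flower name."""
    folder = class_names[label_index]  # label index -> folder name
    return cat_to_name[folder]         # folder name -> flower name

print(label_to_flower(1))  # globe thistle
```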

Displaying the data

  • Note that the tensor data must be converted to NumPy format, and the normalization must be undone
def im_convert(tensor):
    """Convert a normalized image tensor into an array that matplotlib can display."""
    # Copy the incoming data
    image = tensor.to("cpu").clone().detach()
    # squeeze() drops dimensions of size 1 (for example a leading batch dimension),
    # which matplotlib might otherwise fail to display
    image = image.numpy().squeeze()

    # torch stores images as C*H*W, but matplotlib expects H*W*C, so reorder the axes
    image = image.transpose(1, 2, 0)

    # The data was normalized earlier with (data - mean) / std,
    # so restoring it requires data * std + mean
    image = image * np.array((0.229, 0.224, 0.225)) + np.array((0.485, 0.456, 0.406))
    # Clamp values below 0 to 0 and values above 1 to 1
    image = image.clip(0, 1)

    # Return the image data
    return image

# Set the figure size
fig = plt.figure(figsize=(20, 12))
# We iterate over one batch (8 images), so lay them out in two rows of four
columns = 4
rows = 2

# iter() returns an iterator over the DataLoader
dataiter = iter(dataloaders["valid"])
# next() fetches one batch; the older dataiter.next() call fails on recent PyTorch releases
inputs, classes = next(dataiter)

# There are 8 images in total
for idx in range(columns * rows):
    # Position of this image in the grid
    ax = fig.add_subplot(rows, columns, idx+1, xticks=[], yticks=[])
    # Title each image with its flower name
    ax.set_title(cat_to_name[str(int(class_names[classes[idx]]))])
    # Show the image
    plt.imshow(im_convert(inputs[idx]))
plt.show()
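
The restore-and-clamp step inside im_convert can be checked by hand on a single pixel. This is a plain-Python sketch of the same arithmetic, not the NumPy code used above; the helper name denormalize_pixel is hypothetical.

```python
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)

def denormalize_pixel(rgb, mean=MEAN, std=STD):
    """Undo (value - mean) / std for one pixel, then clamp each channel to [0, 1]."""
    restored = [v * s + m for v, m, s in zip(rgb, mean, std)]
    return [min(1.0, max(0.0, v)) for v in restored]

# A normalized value of 0 in every channel restores to the channel means
print(denormalize_pixel([0.0, 0.0, 0.0]))   # [0.485, 0.456, 0.406]
# Out-of-range values are clamped to the valid display range
print(denormalize_pixel([5.0, -5.0, 0.0]))  # [1.0, 0.0, 0.406]
```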

image-20220720110049923

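Network setup

  • The pretrained weights are downloaded on first run

Choosing the model

model_name = "resnet"  # Network to use: one of ["resnet", "alexnet", "vgg", "squeezenet", "densenet", "inception"]

# Whether to use the pretrained features as-is (freeze the backbone and train only the new head)
feature_extract = True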

GPU or CPU

train_on_gpu = torch.cuda.is_available()
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

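Transfer learning

# Transfer learning: decide whether to keep the pretrained weights fixed
def set_parameter_requires_grad(model, feature_extracting):
    # When using the pretrained features as-is, the parameters must not receive gradient updates
    if feature_extracting:
        for param in model.parameters():
            # Disable gradient updates
            param.requires_grad = False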

Instantiating and printing the model

model_ft = models.resnet152()

print(model_ft)
ResNet(
(conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
(layer1): Sequential(
(0): Bottleneck(
(conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(downsample): Sequential(
(0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): Bottleneck(
(conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(2): Bottleneck(
(conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
)
*
*
*
(layer4): Sequential(
(0): Bottleneck(
(conv1): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(downsample): Sequential(
(0): Conv2d(1024, 2048, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): Bottleneck(
(conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(2): Bottleneck(
(conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
)
(avgpool): AdaptiveAvgPool2d(output_size=(1, 1))
# Fully connected layer
(fc): Linear(in_features=2048, out_features=1000, bias=True)
)

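The fully connected layer shows that this model performs 1000-class classification, whereas we only need 102 classes, so we have to replace this layer to fit our task.

Initializing the model

To address the issue found above, we change the 1000-class output to a 102-class output
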

def initialize_model(model_name, num_classes, feature_extract, use_pretrained=True):
    # Pick the requested model; each architecture is initialized slightly differently
    model_ft = None
    input_size = 0

    if model_name == "resnet":
        """ ResNet152
        """
        # Load the model, downloading the pretrained weights if requested
        # (newer torchvision releases prefer the weights= argument; pretrained= is kept as in the original)
        model_ft = models.resnet152(pretrained=use_pretrained)
        # Optionally freeze the existing layers so that they are left unchanged
        set_parameter_requires_grad(model_ft, feature_extract)
        # Number of input features of the final layer
        num_ftrs = model_ft.fc.in_features
        # Replace the fully connected layer so that it outputs our number of classes
        # (102 in this post); the new layer is trainable by default
        model_ft.fc = nn.Sequential(nn.Linear(num_ftrs, num_classes),
                                    nn.LogSoftmax(dim=1))
        # VGG and ResNet both expect 224 x 224 input images
        input_size = 224

    # The heads below output raw logits (no LogSoftmax), so they pair with
    # nn.CrossEntropyLoss rather than nn.NLLLoss
    elif model_name == "alexnet":
        """ Alexnet
        """
        model_ft = models.alexnet(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)
        num_ftrs = model_ft.classifier[6].in_features
        model_ft.classifier[6] = nn.Linear(num_ftrs, num_classes)
        input_size = 224

    elif model_name == "vgg":
        """ VGG16
        """
        model_ft = models.vgg16(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)
        num_ftrs = model_ft.classifier[6].in_features
        model_ft.classifier[6] = nn.Linear(num_ftrs, num_classes)
        input_size = 224

    elif model_name == "squeezenet":
        """ Squeezenet
        """
        model_ft = models.squeezenet1_0(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)
        model_ft.classifier[1] = nn.Conv2d(512, num_classes, kernel_size=(1,1), stride=(1,1))
        model_ft.num_classes = num_classes
        input_size = 224

    elif model_name == "densenet":
        """ Densenet
        """
        model_ft = models.densenet121(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)
        num_ftrs = model_ft.classifier.in_features
        model_ft.classifier = nn.Linear(num_ftrs, num_classes)
        input_size = 224

    elif model_name == "inception":
        """ Inception v3
        Be careful, expects (299,299) sized images and has auxiliary output
        """
        model_ft = models.inception_v3(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)
        # Handle the auxiliary net
        num_ftrs = model_ft.AuxLogits.fc.in_features
        model_ft.AuxLogits.fc = nn.Linear(num_ftrs, num_classes)
        # Handle the primary net
        num_ftrs = model_ft.fc.in_features
        model_ft.fc = nn.Linear(num_ftrs, num_classes)
        input_size = 299

    else:
        # Fail loudly instead of calling exit(), which would kill a notebook kernel
        raise ValueError("Invalid model name: {}".format(model_name))

    return model_ft, input_size

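Choosing the layers to train

The earlier convolutional layers are already trained, so we only need to train the final fully connected layer to fit our task.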

model_ft, input_size = initialize_model(model_name, 102, feature_extract, use_pretrained=True)

# Move the model to the GPU if one is available
model_ft = model_ft.to(device)

# File to save the checkpoint to
filename = "checkpoint.pth"

# Train all layers, or only those that still require gradients
params_to_update = model_ft.parameters()
print("Params to Learn:")
if feature_extract:
    params_to_update = []
    for name, param in model_ft.named_parameters():
        if param.requires_grad:
            params_to_update.append(param)
            print("\t", name)
else:
    for name, param in model_ft.named_parameters():
        if param.requires_grad:
            print("\t", name)

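Output: because we use the pretrained network with its layers frozen, the only parameters left to train are the weights and biases of the new fully connected layer.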

Params to Learn:
fc.0.weight
fc.0.bias
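
To get a feel for how small the trainable part is, the sketch below counts the parameters of a fully connected layer from its input and output sizes. The sizes come from the model printout and the 102-class head in this post; the helper name linear_param_count is hypothetical.

```python
def linear_param_count(in_features, out_features, bias=True):
    """Parameters in a fully connected layer: one weight per input-output pair, plus biases."""
    return in_features * out_features + (out_features if bias else 0)

# Original ImageNet head: Linear(2048 -> 1000)
print(linear_param_count(2048, 1000))  # 2049000
# Replacement head for the flower task: Linear(2048 -> 102)
print(linear_param_count(2048, 102))   # 208998
```

With the backbone frozen, these 208,998 values (fc.0.weight and fc.0.bias) are the only ones the optimizer updates.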

Optimizer setup

# Optimizer
optimizer_ft = optim.Adam(params_to_update, lr=1e-2)

# Learning rate schedule
scheduler = optim.lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)  # every 7 epochs the learning rate decays to 1/10 of its value
# The final layer of the model already applies LogSoftmax(), so nn.CrossEntropyLoss() must not be used here:
# nn.CrossEntropyLoss() is equivalent to LogSoftmax() followed by nn.NLLLoss()
criterion = nn.NLLLoss()
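
The relationship between the three losses can be verified numerically. The sketch below reimplements them in plain Python for a single sample, as an illustration of the math rather than of PyTorch's implementation; the function names and the example logits are made up for this check.

```python
import math

def log_softmax(logits):
    """Log of the softmax probabilities, computed stably by subtracting the max logit."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]

def nll_loss(log_probs, target):
    """Negative log-likelihood of the target class."""
    return -log_probs[target]

def cross_entropy(logits, target):
    """Cross-entropy computed directly from raw logits."""
    m = max(logits)
    return m + math.log(sum(math.exp(x - m) for x in logits)) - logits[target]

logits = [2.0, 0.5, -1.0]  # hypothetical raw scores for three classes
target = 0
two_step = nll_loss(log_softmax(logits), target)
one_step = cross_entropy(logits, target)
print(math.isclose(two_step, one_step))  # True
```

Applying cross-entropy on top of a model that already ends in LogSoftmax would apply the log-softmax twice, which is why NLLLoss is the matching choice here.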

Training module

def train_model(model, dataloaders, criterion, optimizer, num_epochs=25, is_inception=False, filename=filename):
    since = time.time()
    # Best validation accuracy so far
    best_acc = 0
    # Move the model to the device
    model.to(device)

    # Accuracy and loss histories
    # Validation accuracy
    val_acc_history = []
    # Training accuracy
    train_acc_history = []
    # Training loss
    train_losses = []
    # Validation loss
    valid_losses = []
    # Learning rate
    LRs = [optimizer.param_groups[0]['lr']]

    # Keep a copy of the best weights
    best_model_wts = copy.deepcopy(model.state_dict())

    # Start training
    for epoch in range(num_epochs):
        # Print the header
        print("Epoch {}/{}".format(epoch, num_epochs-1))
        print('-'*10)

        # Training and validation
        for phase in ["train", "valid"]:
            # Put the model in the mode that matches the phase
            if phase == "train":
                model.train()  # training
            else:
                model.eval()  # validation

            # Accumulated loss
            running_loss = 0.0
            # Number of correct predictions
            running_corrects = 0

            # Go through all the data
            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)

                # Zero the gradients
                optimizer.zero_grad()
                # Compute gradients only during training
                with torch.set_grad_enabled(phase == "train"):
                    if is_inception and phase == "train":
                        outputs, aux_outputs = model(inputs)
                        loss1 = criterion(outputs, labels)
                        loss2 = criterion(aux_outputs, labels)
                        loss = loss1 + 0.4*loss2
                    else:  # ResNet takes this branch
                        outputs = model(inputs)
                        # Compute the loss with the loss function
                        loss = criterion(outputs, labels)
                    # Take the class with the highest score as the prediction
                    _, preds = torch.max(outputs, 1)

                    # Update the weights during training
                    if phase == "train":
                        loss.backward()
                        optimizer.step()

                # Accumulate the loss, weighted by the batch size
                running_loss += loss.item()*inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)

            epoch_loss = running_loss / len(dataloaders[phase].dataset)
            epoch_acc = running_corrects.double() / len(dataloaders[phase].dataset)

            time_elapsed = time.time() - since
            print("Time elapsed {:.0f}m {:.0f}s".format(time_elapsed//60, time_elapsed%60))
            print("{} Loss: {:.4f} Acc: {:.4f}".format(phase, epoch_loss, epoch_acc))

            # Keep the best model:
            # if this is the validation phase and the accuracy beats the best so far
            if phase == "valid" and epoch_acc > best_acc:
                best_acc = epoch_acc
                # Copy the weights of the current best model
                best_model_wts = copy.deepcopy(model.state_dict())
                state = {
                    # Model parameters
                    "state_dict": model.state_dict(),
                    # Model performance
                    "best_acc": best_acc,
                    # Optimizer state
                    "optimizer": optimizer.state_dict(),
                }
                torch.save(state, filename)
            if phase == "valid":
                val_acc_history.append(epoch_acc)
                valid_losses.append(epoch_loss)
                # StepLR counts epochs, so step() takes no loss argument;
                # this uses the module-level scheduler defined earlier
                scheduler.step()
            if phase == "train":
                train_acc_history.append(epoch_acc)
                train_losses.append(epoch_loss)
        print("Optimizer learning rate : {:.7f}".format(optimizer.param_groups[0]["lr"]))
        LRs.append(optimizer.param_groups[0]["lr"])
        print()

    time_elapsed = time.time() - since
    print("Training complete in {:.0f}m {:.0f}s".format(time_elapsed // 60, time_elapsed % 60))
    print("Best val Acc: {:.4f}".format(best_acc))
    # After training, use the best weights as the final model
    model.load_state_dict(best_model_wts)
    return model, val_acc_history, train_acc_history, valid_losses, train_losses, LRs

Running the training

model_ft, val_acc_history, train_acc_history, valid_losses, train_losses, LRs = train_model(model_ft, dataloaders, criterion, optimizer_ft, num_epochs=20, is_inception=(model_name=="inception"))