PyTorch: Computing the Mean and Standard Deviation of an Image Dataset
When processing data with torchvision.transforms, a very common operation is:
transforms.Normalize((0.485,0.456,0.406), (0.229,0.224,0.225))
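The first tuple, (0.485,0.456,0.406), holds the means of the R, G, and B channels, in that order; the second, (0.229,0.224,0.225), holds the corresponding standard deviations.
These values were computed on the ImageNet dataset, which is why so many people reuse them.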
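To make the arithmetic concrete, here is a minimal sketch of what normalization does to a single RGB pixel, assuming the pixel has already been scaled to [0, 1] (as ToTensor does). The helper name normalize_pixel is hypothetical and written in plain Python for illustration only; transforms.Normalize applies the same (x - mean) / std operation per channel to a whole tensor.

```python
# ImageNet statistics quoted above, one value per RGB channel
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(pixel, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Hypothetical helper: apply (x - mean) / std to each channel of one pixel."""
    return tuple((x - m) / s for x, m, s in zip(pixel, mean, std))


# A pixel equal to the channel means maps to zero in every channel
print(normalize_pixel(IMAGENET_MEAN))  # (0.0, 0.0, 0.0)

# A pure white pixel lands between two and three standard deviations above the mean
print(normalize_pixel((1.0, 1.0, 1.0)))
```

After normalization each channel is expressed in units of its own standard deviation, centered on zero, which is why the statistics should describe the data the model will actually see.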
If you instead want to compute the mean and standard deviation of your own dataset and pass them to transforms.Normalize, you can use the following script, get_mean_std.py:
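# coding:utf-8
import os
import pickle

import numpy as np
import torchvision.transforms as transforms
from torchvision.datasets import ImageFolder

from options import options

"""
Run this script before training to obtain the mean and standard deviation of the data.
"""


class Dataloader():
    def __init__(self, opt):
        # Folder names of the training, validation, and test datasets
        self.opt = opt
        self.dirs = ['train', 'test', 'testing']

        self.means = [0, 0, 0]
        self.stdevs = [0, 0, 0]

        self.transform = transforms.Compose([
            transforms.CenterCrop(opt.isize),
            transforms.ToTensor(),  # converts values from [0, 255] to [0, 1], i.e. divides by 255
            # transforms.Normalize((0.485,0.456,0.406), (0.229,0.224,0.225))
        ])

        # ImageFolder classifies the data by folder: one folder per class, labels are assigned automatically
        self.dataset = {x: ImageFolder(os.path.join(opt.dataroot, x), self.transform) for x in self.dirs}

    def get_mean_std(self, type, mean_std_path):
        """
        Compute the mean and standard deviation of a dataset.
        :param type: which dataset to use: 'train', 'test', or 'testing'
        :param mean_std_path: file in which the computed mean and standard deviation are stored
        :return:
        """
        # Reset the accumulators on every call so that one split does not leak into the next
        self.means = [0, 0, 0]
        self.stdevs = [0, 0, 0]

        num_imgs = len(self.dataset[type])
        for data in self.dataset[type]:
            img = data[0]
            for i in range(3):
                # Mean and standard deviation of one channel of one image
                self.means[i] += img[i, :, :].mean().item()
                self.stdevs[i] += img[i, :, :].std().item()

        # Average the per-image statistics over the whole split
        self.means = np.asarray(self.means) / num_imgs
        self.stdevs = np.asarray(self.stdevs) / num_imgs

        print("{} : normMean = {}".format(type, self.means))
        print("{} : normstdevs = {}".format(type, self.stdevs))

        # Write the mean and standard deviation to a file so they can be read back later
        with open(mean_std_path, 'wb') as f:
            pickle.dump(self.means, f)
            pickle.dump(self.stdevs, f)
            print('pickle done')


if __name__ == '__main__':
    opt = options().parse()
    dataloader = Dataloader(opt)
    for x in dataloader.dirs:
        mean_std_path = 'mean_std_value_' + x + '.pkl'
        dataloader.get_mean_std(x, mean_std_path)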
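Note that averaging each image's standard deviation is not the same as taking the standard deviation over all pixels of the dataset, because variation between images is ignored. As an alternative sketch (not the method used in the script above), the statistics can be accumulated exactly from per-channel sums and sums of squares. The function name dataset_mean_std is hypothetical, and the images are assumed to be NumPy arrays of shape (3, H, W) with values in [0, 1]; a tensor produced by ToTensor could be converted with .numpy().

```python
import numpy as np


def dataset_mean_std(images):
    """Hypothetical helper: exact per-channel mean and std over all pixels.

    `images` is any iterable of arrays shaped (3, H, W); sizes may differ.
    """
    channel_sum = np.zeros(3, dtype=np.float64)
    channel_sq_sum = np.zeros(3, dtype=np.float64)
    num_pixels = 0
    for img in images:
        # Flatten each channel to a row of pixels
        flat = np.asarray(img, dtype=np.float64).reshape(3, -1)
        channel_sum += flat.sum(axis=1)
        channel_sq_sum += (flat ** 2).sum(axis=1)
        num_pixels += flat.shape[1]
    mean = channel_sum / num_pixels
    # Population variance E[x^2] - (E[x])^2, clipped at 0 against rounding error
    var = np.maximum(channel_sq_sum / num_pixels - mean ** 2, 0.0)
    return mean, np.sqrt(var)


# Two constant images: each has zero std on its own, yet the dataset as a whole does not
dark = np.zeros((3, 2, 2))
bright = np.ones((3, 2, 2))
mean, std = dataset_mean_std([dark, bright])
print(mean, std)
```

In this toy case the per-image average would report a standard deviation of 0, while the pixel-level value is 0.5 per channel. For datasets whose images look broadly alike the two estimates tend to be closer, so which one to use is a judgment call.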
Then read the mean and standard deviation back from the corresponding file and pass them to transforms.Normalize in dataloader.py:
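# coding:utf-8
import os
import pickle

import numpy as np
import torch
import torchvision.transforms as transforms
from torchvision.datasets import ImageFolder

"""
Loads the training (train), validation (test), and test (testing) data.
"""


class Dataloader():
    def __init__(self, opt):
        # Folder names of the training, validation, and test datasets
        self.opt = opt
        self.dirs = ['train', 'test', 'testing']
        # Paths of the files that store the mean and standard deviation
        self.mean_std_path = {x: 'mean_std_value_' + x + '.pkl' for x in self.dirs}

        # Initialize to 0
        self.means = {x: [0, 0, 0] for x in self.dirs}
        self.stdevs = {x: [0, 0, 0] for x in self.dirs}
        print(type(self.means['train']))
        print(self.means)
        print(self.stdevs)

        for x in self.dirs:
            # If the file exists, the mean and standard deviation were computed earlier
            if os.path.exists(self.mean_std_path[x]):
                with open(self.mean_std_path[x], 'rb') as f:
                    # Load in the same order in which get_mean_std.py dumped them
                    self.means[x] = pickle.load(f)
                    self.stdevs[x] = pickle.load(f)
                    print('pickle load done')

        print(self.means)
        print(self.stdevs)

        # Pass the corresponding mean and standard deviation to transforms.Normalize.
        # If a file was missing, the standard deviation is still 0 and cannot be used
        # for normalization, so run get_mean_std.py first.
        self.transforms = {x: transforms.Compose([
            transforms.CenterCrop(opt.isize),  # same preprocessing as in get_mean_std.py
            transforms.ToTensor(),
            transforms.Normalize(self.means[x], self.stdevs[x]),
        ]) for x in self.dirs}
        ...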
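The two scripts communicate through a pickle file in which the means are written first and the standard deviations second, so the reader has to call pickle.load twice, in the same order. A minimal round-trip sketch, using a temporary directory and made-up values in place of real statistics:

```python
import os
import pickle
import tempfile

# Made-up values for illustration only, not statistics of any real dataset
means = [0.5, 0.4, 0.3]
stdevs = [0.2, 0.2, 0.1]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'mean_std_value_train.pkl')

    # Writer side: two consecutive dumps into the same file
    with open(path, 'wb') as f:
        pickle.dump(means, f)
        pickle.dump(stdevs, f)

    # Reader side: each load returns the next pickled object, in write order
    with open(path, 'rb') as f:
        loaded_means = pickle.load(f)
        loaded_stdevs = pickle.load(f)

print(loaded_means, loaded_stdevs)
```

Swapping the two load calls would silently exchange means and standard deviations, so storing both in a single dictionary with named keys may be a less error-prone layout.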
