Simple multi-class classification of one-dimensional data with PyTorch
PyTorch is one of the most convenient neural-network libraries available today. I recently wrote a simple piece of PyTorch code, and this post walks through it from start to finish.
In PyTorch most of the everyday functionality is already packaged into modules, so we only need to subclass a few classes and override a handful of methods. First, the goal of this post: run a multi-class classifier over local one-dimensional (1×n) ndarray data, where the dataset is an m×n array and the labels are an m×1 array. Below I go through the code and note the pitfalls I hit along the way.
1. Subclass Dataset. As you can see, I override three methods here: __init__ loads the numpy data and converts it into the corresponding tensors, __getitem__ defines the single sample and label returned during training, and __len__ reports the number of samples m.
class MyDataSet(torch.utils.data.Dataset):
    def __init__(self, data, label):
        # Convert the numpy arrays to tensors; the data is cast to float here
        # so the DataLoader hands out float32 samples directly.
        self.data = torch.from_numpy(data).float()
        self.label = torch.from_numpy(label)
        self.length = label.shape[0]

    def __getitem__(self, index):
        # A single (sample, label) pair.
        return self.data[index], self.label[index]

    def __len__(self):
        # The number of samples, m.
        return self.length
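A quick way to sanity-check the class is to feed it a couple of small random arrays and inspect one sample. The shapes and values below are made up purely for illustration; they just mirror the m×n data / m×1 label layout described above:

import numpy as np
import torch

# Hypothetical toy data: m = 8 samples of length n = 16, labels for 3 classes
data = np.random.randn(8, 16)
label = np.random.randint(0, 3, size=(8, 1)).astype(np.int64)

ds = MyDataSet(data, label)
x, y = ds[0]
print(len(ds))           # 8
print(x.shape, x.dtype)  # torch.Size([16]) torch.float32
print(y.shape, y.dtype)  # torch.Size([1]) torch.int64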
2. Define the network itself by subclassing nn.Module.
__init__ is where we define the layers the network needs. Here block1's in_channels is 1 (filter_num[0]); the output channel counts can be chosen freely, but each layer's output channels must equal the next layer's input channels.
Note: the flattened feature count after the last pooling layer (the 1280 below) has to be worked out by hand. If you'd rather not do the arithmetic, plug in any number, run once, and read the correct size off the error message; a dummy-forward-pass alternative is sketched after the class.
The forward method defines how the layers are wired together; note that it must return x.
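The class below leans on a handful of hyperparameters that are defined elsewhere in the script and never shown in the post. One plausible set of values, purely as a labelled assumption:

# Assumed hyperparameter definitions -- not from the original post.
filter_num = [1, 32, 64, 128, 256]  # filter_num[0] = 1 matches the single input channel
conv_kernel_size = 8
conv_stride_size = 1   # PyTorch only accepts padding='same' with stride 1
pool_kernel_size = 8
pool_stride = 4
pool_padding = 2
drop_float = 0.1
dtype = torch.float32

Note that PyTorch requires the convolution stride to be 1 when padding='same', so conv_stride_size cannot be anything else here; and whether the flattened size actually works out to the hard-coded 1280 depends on your input length n.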
class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        # Four identical conv blocks: Conv1d -> BN -> ReLU, twice, then pool and dropout.
        self.block1 = nn.Sequential(
            nn.Conv1d(in_channels=filter_num[0], out_channels=filter_num[1], padding='same',
                      stride=conv_stride_size, kernel_size=conv_kernel_size, dtype=dtype),
            nn.BatchNorm1d(filter_num[1], dtype=dtype),
            nn.ReLU(),
            nn.Conv1d(filter_num[1], filter_num[1], kernel_size=conv_kernel_size,
                      stride=conv_stride_size, padding='same', dtype=dtype),
            nn.BatchNorm1d(filter_num[1], dtype=dtype),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=pool_kernel_size, stride=pool_stride, padding=pool_padding),
            nn.Dropout(drop_float),
        )
        self.block2 = nn.Sequential(
            nn.Conv1d(in_channels=filter_num[1], out_channels=filter_num[2], padding='same',
                      stride=conv_stride_size, kernel_size=conv_kernel_size, dtype=dtype),
            nn.BatchNorm1d(num_features=filter_num[2], dtype=dtype),
            nn.ReLU(),
            nn.Conv1d(filter_num[2], filter_num[2], kernel_size=conv_kernel_size,
                      stride=conv_stride_size, padding='same', dtype=dtype),
            nn.BatchNorm1d(filter_num[2], dtype=dtype),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=pool_kernel_size, stride=pool_stride, padding=pool_padding),
            nn.Dropout(drop_float),
        )
        self.block3 = nn.Sequential(
            nn.Conv1d(in_channels=filter_num[2], out_channels=filter_num[3], padding='same',
                      stride=conv_stride_size, kernel_size=conv_kernel_size, dtype=dtype),
            nn.BatchNorm1d(num_features=filter_num[3], dtype=dtype),
            nn.ReLU(),
            nn.Conv1d(filter_num[3], filter_num[3], kernel_size=conv_kernel_size,
                      stride=conv_stride_size, padding='same', dtype=dtype),
            nn.BatchNorm1d(filter_num[3], dtype=dtype),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=pool_kernel_size, stride=pool_stride, padding=pool_padding),
            nn.Dropout(drop_float),
        )
        self.block4 = nn.Sequential(
            nn.Conv1d(in_channels=filter_num[3], out_channels=filter_num[4], padding='same',
                      stride=conv_stride_size, kernel_size=conv_kernel_size, dtype=dtype),
            nn.BatchNorm1d(num_features=filter_num[4], dtype=dtype),
            nn.ReLU(),
            nn.Conv1d(filter_num[4], filter_num[4], kernel_size=conv_kernel_size,
                      stride=conv_stride_size, padding='same', dtype=dtype),
            nn.BatchNorm1d(filter_num[4], dtype=dtype),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=pool_kernel_size, stride=pool_stride, padding=pool_padding),
            nn.Dropout(drop_float),
        )
        # Classification head. The name self.fc is an assumption: the line that
        # opened this Sequential was garbled in the source, but forward() clearly
        # flattens the conv output and maps the 1280 features to the 180 classes.
        self.fc = nn.Sequential(
            nn.Flatten(),
            nn.BatchNorm1d(num_features=1280, dtype=dtype),
            nn.ReLU(),
            nn.Dropout(0.7),
            nn.BatchNorm1d(num_features=1280, dtype=dtype),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(1280, 180, dtype=dtype),
        )

    def forward(self, x):
        x = self.block1(x)
        x = self.block2(x)
        x = self.block3(x)
        x = self.block4(x)
        x = self.fc(x)
        return x
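About that 1280: it is the flattened feature count after block4 (channels × remaining length), so it changes whenever the input length n or the pooling parameters change. Instead of reading it off an error message, you can measure it with a dummy forward pass through the conv blocks. A minimal sketch; the helper name and the batch of 2 are my own choices:

def flattened_size(model, seq_len):
    # Push a dummy (batch, channel, length) tensor through the conv blocks
    # and report how many features reach the classification head.
    with torch.no_grad():
        dummy = torch.zeros(2, 1, seq_len, dtype=dtype)
        out = model.block4(model.block3(model.block2(model.block1(dummy))))
    return out.shape[1] * out.shape[2]  # channels * remaining length

# e.g. print(flattened_size(NeuralNetwork(), seq_len=n))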
3. The main program. To keep things clear, here is the driver code first. At this point the data has already been loaded: data is an m×n array and label is an m×1 array.
The first argument when instantiating DataLoader is a Dataset instance; the DataLoader's job is to serve batches to the training and test loops below.
print("data loading ...")
train_data_set = MyDataSet(trainingDataLoadProcess.data, trainingDataLoadProcess.label)
test_data_set = MyDataSet(testDataLoadProcess.data, testDataLoadProcess.label)
train_dataloader = DataLoader(train_data_set, batch_size=batch_size, shuffle=shuffle)
test_dataloader = DataLoader(test_data_set, batch_size=batch_size, shuffle=shuffle)
print("")
model_test = NeuralNetwork()
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adamax(model_test.parameters(), lr=learning_rate, betas=(beta_1, beta_2), weight_decay=0.0)
print(model_test)
for t in range(epochs):
print(f'Epoch{t + 1}\n-------------')
train_loop(train_dataloader, model_test, loss_fn, optimizer)
test_loop(test_dataloader, model_test, loss_fn)
print("Done!")
4. Define the training loop. Each iteration pulls a batch from the DataLoader; here X and y are batch_size×n and batch_size×1 arrays respectively.
Two adjustments are needed first. X is reshaped into a batch_size×1×n float tensor (the conversion to float was already taken care of in the Dataset's __init__), and y is squeezed down to a plain batch_size-long vector and cast to long, which is the shape and dtype the loss function requires. Note: skip the long cast and you get an error along the lines of "expected Long but got Float" (why the message is phrased that way round puzzled me too; see the standalone illustration after the code).
def train_loop(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    for batch, (X, y) in enumerate(dataloader):
        # (batch_size, n) -> (batch_size, 1, n): Conv1d expects a channel dimension
        X = X.unsqueeze(1)
        # (batch_size, 1) -> (batch_size,), long class indices for CrossEntropyLoss
        y = y.squeeze(1).long()
        pred = model(X)
        loss = loss_fn(pred, y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
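The "expected Long" complaint mentioned above comes from nn.CrossEntropyLoss itself: it wants float logits of shape (batch, classes) and long class indices of shape (batch,), which is exactly what the unsqueeze/squeeze/long adjustments arrange. A standalone illustration with made-up sizes:

import torch
import torch.nn as nn

loss_fn = nn.CrossEntropyLoss()
logits = torch.randn(4, 180)           # float scores, (batch, num_classes)
targets = torch.randint(0, 180, (4,))  # long class indices, (batch,)
print(loss_fn(logits, targets))        # fine

# loss_fn(logits, targets.float())     # raises the Long/Float error on older
                                       # PyTorch; newer versions reinterpret
                                       # float targets as class probabilities
                                       # and fail on the shape instead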
5. Define the test loop (same adjustments as above).
def test_loop(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    test_loss, correct = 0, 0
    with torch.no_grad():  # no gradients needed for evaluation
        for X, y in dataloader:
            X = X.unsqueeze(1)
            y = y.squeeze(1).long()
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    test_loss /= num_batches
    correct /= size
That is the overall flow. Finally, remember to add output statements, save the model, and so on.
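For the saving part, a minimal sketch of the usual state_dict round trip (the file name is arbitrary):

# Persist the trained weights once the epoch loop finishes
torch.save(model_test.state_dict(), "model_weights.pth")

# Restore later: rebuild the architecture, then load the weights
model_restored = NeuralNetwork()
model_restored.load_state_dict(torch.load("model_weights.pth"))
model_restored.eval()  # put BatchNorm/Dropout into inference mode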