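Preface

I have recently been reading 从零开始的大模型开发. This post records Chapter 2: setting up the PyTorch 2.0 environment and building a first hands-on deep learning model for image denoising.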

Environment setup: installing PyTorch 2.0

Install Anaconda: Anaconda | The Operating System for AI

Install PyTorch: PyTorch

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
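After the install finishes, a quick check like the one below can confirm that PyTorch imports and whether a CUDA device is visible. This is only a minimal sketch: the helper name torch_summary is made up for illustration, and it falls back to a plain message when torch is not installed.

```python
import importlib.util


def torch_summary() -> str:
    """Return a one-line description of the local PyTorch install."""
    # Avoid an ImportError when torch is missing (hypothetical helper).
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch

    cuda = torch.cuda.is_available()  # True only if a usable GPU is visible
    return f"torch {torch.__version__}, CUDA available: {cuda}"


print(torch_summary())
```

If CUDA shows as unavailable with the cu118 wheel, the GPU driver is usually the first thing worth checking.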

Image denoising in practice

Step.1

The first step is to prepare the MNIST dataset, the "Hello world" of deep learning programming.

The dataset consists of:

Training images
Training labels
Test images
Test labels

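MNIST contains 70,000 images in total: 60,000 in the training set and 10,000 in the test set. Each image is a digit from 0 to 9 stored as 784 (28 x 28) bytes, white on a black background.

The internal structure of the MNIST training label file, as found online and in the text: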

TRAINING SET LABEL FILE (train-labels-idx1-ubyte):
 
[offset] [type]          [value]          [description] 
0000     32 bit integer  0x00000801(2049) magic number (MSB first) 
0004     32 bit integer  60000            number of items 
0008     unsigned byte   ??               label 
0009     unsigned byte   ??               label 
........ 
xxxx     unsigned byte   ??               label
The labels values are 0 to 9.

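As shown, the file holds 60,000 labels, each a digit from 0 to 9.

The dataset files are binary, so they are opened in rb mode, and the actual data is what appears in the [value] column. The label file shown above starts with two 32-bit integers (magic number, number of items). The image file starts with four 32-bit integers: magic number, number of images, number of rows, and number of columns. The data bytes follow the header.

Plenty of code for obtaining the dataset is available online. Since this post follows the book, I use the code provided by the book's author and will not go into it here.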
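To make the header layout concrete, here is a small sketch that parses an IDX header with the standard struct module. It runs on synthetic bytes built in the example itself rather than on the real MNIST files, and the function name read_idx_header is an illustrative assumption, not the book's code.

```python
import struct


def read_idx_header(data: bytes):
    """Parse an IDX header: a magic number, then one 32-bit size per dimension."""
    # Magic number layout: two zero bytes, a type code (0x08 = unsigned byte),
    # and the number of dimensions. All integers are big-endian (MSB first).
    _zero, dtype_code, ndim = struct.unpack(">HBB", data[:4])
    dims = struct.unpack(">" + "I" * ndim, data[4:4 + 4 * ndim])
    offset = 4 + 4 * ndim  # index of the first payload byte
    return dtype_code, dims, offset


# Synthetic label file: magic 0x00000801, 3 items, labels 5, 0, 4.
label_file = struct.pack(">II", 0x00000801, 3) + bytes([5, 0, 4])
dtype_code, dims, offset = read_idx_header(label_file)
labels = list(label_file[offset:])
print(dims, labels)

# Synthetic image file: magic 0x00000803, 2 blank images of 28 x 28 pixels.
image_file = struct.pack(">IIII", 0x00000803, 2, 28, 28) + bytes(2 * 28 * 28)
_, image_dims, image_offset = read_idx_header(image_file)
print(image_dims, image_offset)
```

For a real file, the same function could be applied to the bytes returned by open(path, "rb").read().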

Step.2

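In a PyTorch deep learning project, model design is one of the most important parts.

The model determines how the project goes about achieving its goal. In this project, the goal is to take an image as input and denoise it.

The book's reasoning for choosing the model is that "the output image should be the same size as the input", so it picks Unet (a kind of convolutional neural network) as the main model.

I am learning with almost no background knowledge at the moment, so this post does not explain Unet's implementation or structure in detail. The next post will study the model in depth:

Placeholder: a detailed study of the Unet model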
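The "output size equals input size" idea can be checked with the standard convolution size formula. The sketch below is plain arithmetic, not the book's unet module, and the layer settings (3x3 convolution with padding 1, 2x2 pooling, 2x upsampling) are assumptions chosen only to illustrate a down-then-up path.

```python
def conv_out_size(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


size = 28                                                   # MNIST images are 28 x 28
same = conv_out_size(size, kernel=3, stride=1, padding=1)   # 3x3 conv keeps the size
down = conv_out_size(same, kernel=2, stride=2)              # 2x2 pooling halves it
up = down * 2                                               # 2x upsampling restores it
print(same, down, up)
```

An encoder that halves the resolution and a decoder that doubles it again ends at the original 28 x 28, which is why this family of networks suits image-to-image tasks such as denoising.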

Step.3

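Besides the model itself, a deep learning project also needs a loss function and an optimizer.

The loss function chosen here is MSELoss, the mean squared error loss.

MSELoss measures the squared difference between predictions and ground truth: the closer they are, the smaller the loss. With reduction="sum" the squared differences are summed rather than averaged, which gives the squared Euclidean distance between the two tensors. The code is: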

loss = torch.nn.MSELoss(reduction="sum")(pred, y_batch)
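To see what the reduction argument changes, the following sketch computes both variants by hand in plain Python on made-up values. It mirrors what reduction="sum" and the default reduction="mean" compute, without calling PyTorch.

```python
pred = [0.5, 0.0, 1.0, 0.25]     # made-up predictions
target = [0.0, 0.0, 1.0, 0.75]   # made-up ground truth

squared_errors = [(p - t) ** 2 for p, t in zip(pred, target)]

loss_sum = sum(squared_errors)               # what reduction="sum" returns
loss_mean = loss_sum / len(squared_errors)   # what reduction="mean" returns
print(loss_sum, loss_mean)
```

Because the summed loss grows with the number of pixels and the batch size, its scale differs from the averaged loss, which is worth keeping in mind when choosing a learning rate.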

The optimizer is Adam, set up as follows:

optimizer = torch.optim.Adam(model.parameters(), lr=2e-5)
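As a rough illustration of what one optimizer step does, here is a scalar sketch of the Adam update rule in plain Python. The function adam_step is hypothetical and written for this post. It uses lr=2e-5 from the line above and the betas and eps values that torch.optim.Adam uses by default, and it ignores options such as weight decay.

```python
import math


def adam_step(param, grad, m, v, t, lr=2e-5, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter (illustrative only)."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction for step t
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


param, m, v = adam_step(param=1.0, grad=4.0, m=0.0, v=0.0, t=1)
print(param)
```

On the first step the update is close to lr times the sign of the gradient, so the parameter moves by roughly 2e-5 regardless of how large the gradient is.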

Step.4

The next step is to use PyTorch to train a complete deep learning model that can denoise images. The code is as follows:

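import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'  # expose only GPU 0; set before importing torch
import torch
import numpy as np
import unet  # local module containing the Unet model
import matplotlib.pyplot as plt
from tqdm import tqdm

batch_size = 320  # number of samples per batch
epochs = 1024  # number of epochs (unused: the loop below runs 30)

device = "cuda"

model = unet.Unet()  # build the Unet model
model = model.to(device)  # move the model to the GPU
optimizer = torch.optim.Adam(model.parameters(), lr=2e-5)  # set up the optimizer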

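# load the data
x_train = np.load("./MNIST/mnist/mnist/x_train.npy")
y_train_label = np.load("./MNIST/mnist/mnist/y_train_label.npy")

x_train_batch = []
for i in range(len(y_train_label)):
    if y_train_label[i] <= 10:  # labels are 0-9, so this keeps every image
        x_train_batch.append(x_train[i])

# fix the input shape: [N, 28, 28] -> [N, 1, 28, 28] (the original comment gives N = 30596)
x_train = np.reshape(x_train_batch, [-1, 1, 28, 28]).astype(np.float32)
x_train /= 512  # scale pixel values; in-place division needs a float dtype
train_length = len(x_train) * 20  # draw 20x the dataset size in samples per epoch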

for epoch in range(30):
    train_num = train_length // batch_size  # number of batches per epoch

    train_loss = 0  # accumulated loss
    for i in tqdm(range(train_num)):  # start the training loop
        x_imgs_batch = []
        y_batch = []
        # build one batch
        for b in range(batch_size):
            img = x_train[np.random.randint(x_train.shape[0])]  # pick one random image
            x = img  # model input
            y = img  # training target (identical to the input in this listing)

            x_imgs_batch.append(x)
            y_batch.append(y)

        # convert the batch to PyTorch tensors and move them to the GPU
        x_imgs_batch = torch.tensor(np.array(x_imgs_batch)).float().to(device)
        y_batch = torch.tensor(np.array(y_batch)).float().to(device)

        pred = model(x_imgs_batch)  # forward pass
        loss = torch.nn.MSELoss(reduction='sum')(pred, y_batch) * 100  # MSELoss, scaled by 100

        # standard update sequence
        optimizer.zero_grad()  # clear gradients from the previous step
        loss.backward()  # backpropagate the loss
        optimizer.step()  # update the parameters

        train_loss += loss.item()  # accumulate the batch loss
    # compute and print the average loss
    train_loss /= train_num
    print("Train_loss: ", train_loss)

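    # visualize one result
    image = x_train[np.random.randint(x_train.shape[0])]  # pick one random sample
    image = np.reshape(image, [1, 1, 28, 28])  # add the batch dimension

    image = torch.tensor(image).float().to(device)  # move the sample to the GPU
    image = model(image)  # run the model

    image = torch.reshape(image, shape=[28, 28])  # drop the batch and channel dimensions
    image = image.detach().cpu().numpy()  # move the result back to the CPU as a NumPy array

    # show and save the result
    os.makedirs("./img", exist_ok=True)  # make sure the output directory exists
    plt.imshow(image)
    plt.savefig(f"./img/img_{epoch}.png")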
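The listing above uses the same image as both input and target. A denoising setup usually feeds the model a corrupted copy and asks it to recover the clean one. The sketch below is not from the book: the helper make_noisy_batch, the Gaussian noise model, and noise_std=0.1 are all assumptions for illustration, and it runs on random stand-in data. It also shows how a whole batch can be drawn with one indexing operation instead of a Python loop.

```python
import numpy as np

rng = np.random.default_rng(0)


def make_noisy_batch(images, batch_size, noise_std=0.1):
    """Sample a random batch and return (noisy input, clean target)."""
    idx = rng.integers(0, images.shape[0], size=batch_size)  # random sample indices
    clean = images[idx]                                       # shape [B, 1, 28, 28]
    noise = rng.normal(0.0, noise_std, size=clean.shape).astype(np.float32)
    noisy = np.clip(clean + noise, 0.0, 1.0)                  # keep pixels in [0, 1]
    return noisy, clean


# Random stand-in for x_train, shaped like the reshaped MNIST array.
fake_images = rng.random((100, 1, 28, 28), dtype=np.float32)
x_batch, y_batch = make_noisy_batch(fake_images, batch_size=8)
print(x_batch.shape, y_batch.shape)
```

In the training loop, x_batch would take the place of x_imgs_batch, with y_batch remaining the clean target.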

Summary

This post gives a first look at deep learning with PyTorch and uses a first project, image denoising, to get familiar with neural network and deep learning code.
