So far, we have explored various components of PyTorch, such as tensor manipulation, data loading, and parameter optimization. In this chapter, we will delve further into PyTorch by learning about the torch.nn module, which is designed for building and training machine learning models, particularly neural networks. The torch.nn module has a simple and Pythonic API that makes it easy to prototype complex models with just a few lines of code.
7.2 Exercise: Linear Regression
To continue our example of linear regression, we will now see how to use the torch.nn module to replace our custom model class. Before we do that, we will first generate a random linear dataset with four features and split it into training and testing sets. Then, we will create a custom Dataset and use DataLoader objects to load the training and testing data in mini-batches.
## Importing required functions
import torch
from torch import nn
import numpy as np
from sklearn.datasets import make_regression
from torch.utils.data import Dataset, DataLoader
from sklearn.model_selection import train_test_split

## Generate dataset with linear property
X, y, coef = make_regression(
    n_samples=1500,
    n_features=4,  ## Using four features
    n_informative=4,
    noise=0.3,
    coef=True,
    random_state=0,
    bias=2
)

## Creating our custom TabularDataset
class TabularDataset(Dataset):
    def __init__(self, data, targets):
        self.data = data
        self.targets = targets

    def __len__(self):
        return self.data.shape[0]

    def __getitem__(self, idx):
        current_sample = self.data[idx]
        current_target = self.targets[idx]
        return {
            "X": torch.tensor(current_sample, dtype=torch.float),
            "y": torch.tensor(current_target, dtype=torch.float)
        }

## Making a train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)

## Creating Tabular Datasets
train_dataset = TabularDataset(X_train, y_train)
test_dataset = TabularDataset(X_test, y_test)

## Creating DataLoaders
train_dataloader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_dataloader = DataLoader(test_dataset, batch_size=64, shuffle=False)

## Training loop
def train_one_epoch(model, data_loader, optimizer):
    for batch in iter(data_loader):
        ## Taking one mini-batch
        y_pred = model.forward(batch['X']).squeeze()
        y_true = batch['y']
        ## Calculating the sum of squared errors for the mini-batch
        loss = torch.square(y_pred - y_true).sum()
        ## Computing gradients per mini-batch
        loss.backward()
        ## Updating model parameters and zeroing gradients
        optimizer.step()
        optimizer.zero_grad()

## Validation loop
def validate_one_epoch(model, data_loader, optimizer):
    loss = 0
    with torch.no_grad():
        for batch in iter(data_loader):
            y_pred = model.forward(batch['X']).squeeze()
            y_true = batch['y']
            loss += torch.square(y_pred - y_true).sum()
    ## Average per-batch loss over the epoch
    return loss / len(data_loader)
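As a quick sanity check (a short sketch, not part of the original exercise), we can peek at one mini-batch produced by the train_dataloader to confirm the shapes we expect:

## Fetching a single mini-batch from the training DataLoader
batch = next(iter(train_dataloader))
print(batch["X"].shape, batch["y"].shape)  ## torch.Size([64, 4]) torch.Size([64])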
The torch.nn module contains several predefined layers that can be used to create neural networks. These layers can be found in the official PyTorch documentation for the torch.nn module. By using these predefined layers, we can simplify the process of building and training our model, as we don’t have to worry about implementing the details of each layer ourselves. Instead, we can simply specify the layers we want to use and let PyTorch handle the rest.
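For instance, nn.Linear is one such predefined layer: it applies an affine transformation and manages its own weight and bias tensors. A minimal sketch to illustrate, assuming the imports from the code above:

## nn.Linear(4, 1): a fully connected layer with 4 inputs and 1 output
layer = nn.Linear(4, 1)
out = layer(torch.randn(8, 4))  ## a batch of 8 samples with 4 features each
print(out.shape)                             ## torch.Size([8, 1])
print(layer.weight.shape, layer.bias.shape)  ## torch.Size([1, 4]) torch.Size([1])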
Now let’s rewrite the model class using the torch.nn module.
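## Defining the model using the predefined nn.Linear layer
class Linear(nn.Module):
    def __init__(self, n_in, n_out):
        super().__init__()
        self.linear = nn.Linear(n_in, n_out)

    def forward(self, x):
        return self.linear(x)

## Creating an instance with four input features and one output
model = Linear(X.shape[1], 1)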
The code above defines a class called Linear which extends the functionality of the nn.Module class from PyTorch’s torch.nn module. The Linear class has two methods: __init__ and forward.
The __init__ method is the constructor for the class. It takes two arguments: n_in and n_out, which represent the number of input and output features, respectively. The method initializes the parent class using super().__init__() and then creates a linear layer using nn.Linear. This layer will have n_in input features and n_out output features.
The forward method takes an input tensor x and applies the linear layer to it, returning the result.
After the Linear class is defined, an instance of the class is created and assigned to the model variable. The model object has two learnable parameters: the weights and the bias of the linear layer. They can be accessed by converting the output of the parameters method to a list and indexing it with square brackets: the weights are the first element, and the bias is the second.
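For instance, a short sketch of inspecting these parameters on the model defined above:

## parameters() returns an iterator, so we convert it to a list first
params = list(model.parameters())
print(params[0].shape)  ## weights of the linear layer: torch.Size([1, 4])
print(params[1].shape)  ## bias of the linear layer: torch.Size([1])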
Now let’s run through some epochs and train our model. We are using the same optimizer and the same train_one_epoch and validate_one_epoch functions as in the last chapter.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

for epoch in range(10):
    # run one training loop
    train_one_epoch(model, train_dataloader, optimizer)
    # run validation loop on training data to compute training loss
    train_loss = validate_one_epoch(model, train_dataloader, optimizer)
    # run validation loop on testing data to compute test loss
    test_loss = validate_one_epoch(model, test_dataloader, optimizer)
    print(f"Epoch {epoch},Train MSE: {train_loss:.4f} Test MSE: {test_loss:.3f}")

print(f"Actual coefficients are: \n{np.round(coef, 4)}\nTrained model weights are: \n{np.round(list(model.parameters())[0].detach().numpy()[0], 4)}")
print(f"Actual Bias term is {2}\nTrained model bias term is \n{list(model.parameters())[1].detach().numpy()[0]:.4f}")
Epoch 0,Train MSE: 14168.0879 Test MSE: 16699.098
Epoch 1,Train MSE: 285.8107 Test MSE: 347.672
Epoch 2,Train MSE: 11.2080 Test MSE: 13.391
Epoch 3,Train MSE: 5.7762 Test MSE: 5.876
Epoch 4,Train MSE: 5.6652 Test MSE: 5.653
Epoch 5,Train MSE: 5.6483 Test MSE: 5.556
Epoch 6,Train MSE: 5.6559 Test MSE: 5.576
Epoch 7,Train MSE: 5.6767 Test MSE: 5.539
Epoch 8,Train MSE: 5.6488 Test MSE: 5.557
Epoch 9,Train MSE: 5.6495 Test MSE: 5.552
Actual coefficients are:
[63.0061 44.1452 84.3648 9.3378]
Trained model weights are:
[62.9917 44.1615 84.3643 9.3292]
Actual Bias term is 2
Trained model bias term is
1.9978
As shown above, our model has fit the data well, just as in the last chapter.
7.3 Saving and Loading models
If we want to save only the learned parameters of the model, we can use torch.save(model.state_dict(), path) as follows (the file path below is just an example):
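## Saving only the model's learned parameters (its state_dict)
path = "linear_model.pt"  ## example file path; any writable path works
torch.save(model.state_dict(), path)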
To reload the saved parameters, we first need to instantiate the model object and then load the saved parameters into it.
model_new = Linear(X.shape[1], 1)
model_new.load_state_dict(torch.load(path))

print(f"Loaded model weights are: \n{np.round(list(model_new.parameters())[0].detach().numpy()[0], 4)}")
print(f"\nLoaded model bias term is \n{list(model_new.parameters())[1].detach().numpy()[0]:.4f}")
Loaded model weights are:
[62.9917 44.1615 84.3643 9.3292]
Loaded model bias term is
1.9978
7.4 Exercise: What is torch.nn really?
Now that we have a good understanding of the fundamental concepts of PyTorch, I highly recommend reading the tutorial by Jeremy Howard from fast.ai titled “WHAT IS TORCH.NN REALLY?”. This tutorial covers everything we have learned so far and goes into more depth on the torch.nn module by showing how to implement it from scratch. It also introduces a new design pattern for building models using the nn.Sequential object, which lets you define a model as a sequential chain of layers. This is a simpler way of creating neural networks than writing them from scratch with the nn.Module class; a small sketch follows.
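As a preview of that pattern, here is a minimal sketch (not taken from the tutorial itself) of our one-layer regression model expressed with nn.Sequential:

## The same one-layer linear model, written as a sequential chain of layers
model_seq = nn.Sequential(
    nn.Linear(X.shape[1], 1)
)
## Deeper models just add more layers to the chain, e.g.:
## nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))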