7  Defining Models

7.1 Introduction to the torch.nn module

So far, we have explored various components of PyTorch, such as tensor manipulation, data loading, and parameter optimization. In this chapter, we will delve further into PyTorch by learning about the torch.nn module, which is designed for building and training machine learning models, particularly neural networks. The torch.nn module has a simple and Pythonic API that makes it easy to prototype and create complex models with just a few lines of code.

7.2 Exercise: Linear Regression

To continue our example of linear regression, we will now see how to use the torch.nn module to replace our custom model class. Before doing that, we will first generate a random linear dataset with four features and split it into training and testing sets. Then, we will wrap the splits in a custom Dataset and create DataLoader objects that serve the training and testing data in mini-batches.

Code
## Importing required functions
import torch
from torch import nn
import numpy as np
from sklearn.datasets import make_regression
from torch.utils.data import Dataset, DataLoader
from sklearn.model_selection import train_test_split

## Generate dataset with linear property
X, y, coef = make_regression(
    n_samples=1500,
    n_features=4,  ## Using four features
    n_informative=4,
    noise=0.3,
    coef=True,
    random_state=0,
    bias=2)


## Creating our custom TabularDataset
class TabularDataset(Dataset):

    def __init__(self, data, targets):
        self.data = data
        self.targets = targets

    def __len__(self):
        return self.data.shape[0]

    def __getitem__(self, idx):
        current_sample = self.data[idx]
        current_target = self.targets[idx]
        return {
            "X": torch.tensor(current_sample, dtype=torch.float),
            "y": torch.tensor(current_target, dtype=torch.float)
        }


## Making a train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)

## Creating Tabular Dataset
train_dataset = TabularDataset(X_train, y_train)
test_dataset = TabularDataset(X_test, y_test)

## Creating Dataloaders
train_dataloader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_dataloader = DataLoader(test_dataset, batch_size=64, shuffle=False)
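
## Quick sanity check (not in the original listing): fetch one
## mini-batch from the train DataLoader and inspect its shapes
batch = next(iter(train_dataloader))
print(batch["X"].shape)  ## torch.Size([64, 4]) -> 64 samples, 4 features
print(batch["y"].shape)  ## torch.Size([64])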

## Training loop
def train_one_epoch(model, data_loader, optimizer):
    for batch in data_loader:
        ## Forward pass on one mini-batch
        y_pred = model(batch['X']).squeeze()
        y_true = batch['y']

        ## Squared-error loss summed over the mini-batch
        loss = torch.square(y_pred - y_true).sum()

        ## Computing gradients for this mini-batch
        loss.backward()

        ## Update model parameters and zero the gradients
        optimizer.step()
        optimizer.zero_grad()
        
## Validation loop
def validate_one_epoch(model, data_loader, optimizer):
    ## Note: optimizer is unused here; it is kept for a uniform signature
    loss = 0
    with torch.no_grad():
        for batch in data_loader:
            y_pred = model(batch['X']).squeeze()
            y_true = batch['y']
            loss += torch.square(y_pred - y_true).sum()
    ## Average the per-batch summed squared error over the number of batches
    return loss / len(data_loader)
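
The torch.nn module also provides predefined loss functions. As a hedged aside (not part of the original loop above), the manual squared-error computation could be swapped for nn.MSELoss; note that nn.MSELoss averages over the batch by default (reduction='mean'), so the reported numbers would differ from the summed loss used here.

## Optional alternative using torch.nn's built-in MSE loss
loss_fn = nn.MSELoss()

def train_one_epoch_mse(model, data_loader, optimizer):
    for batch in data_loader:
        y_pred = model(batch['X']).squeeze()
        ## nn.MSELoss averages the squared error over the batch
        loss = loss_fn(y_pred, batch['y'])
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()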

The torch.nn module contains several predefined layers that can be used to create neural networks. These layers can be found in the official PyTorch documentation for the torch.nn module. By using these predefined layers, we can simplify the process of building and training our model, as we don’t have to worry about implementing the details of each layer ourselves. Instead, we can simply specify the layers we want to use and let PyTorch handle the rest.
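
For instance, fully connected layers, convolutions, and activation functions all come ready-made. A few illustrative instantiations (these specific layers are not used elsewhere in this chapter):

## Some of the predefined building blocks in torch.nn
fc = nn.Linear(in_features=4, out_features=1)                    ## fully connected layer
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3)  ## 2D convolution
act = nn.ReLU()                                                  ## activation function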

Now let’s rewrite the model class using the torch.nn module.

class Linear(nn.Module):

    def __init__(self, n_in, n_out):
        super().__init__()
        self.linear = nn.Linear(n_in, n_out)

    def forward(self, x):
        return self.linear(x)

## Initializing model
model = Linear(X.shape[1], 1)
print(f"Model: \n{model}")

print(f"Weights")
print(list(model.parameters())[0])

print(f"Bias")
print(list(model.parameters())[1])
Model: 
Linear(
  (linear): Linear(in_features=4, out_features=1, bias=True)
)
Weights
Parameter containing:
tensor([[ 0.4191,  0.2242, -0.1830,  0.0542]], requires_grad=True)
Bias
Parameter containing:
tensor([0.4944], requires_grad=True)

The code above defines a class called Linear which extends the functionality of the nn.Module class from PyTorch’s torch.nn module. The Linear class has two methods: __init__ and forward.

  • The __init__ method is the constructor for the class. It takes two arguments: n_in and n_out, which represent the number of input and output features, respectively. The method initializes the parent class using super().__init__() and then creates a linear layer using nn.Linear. This layer will have n_in input features and n_out output features.
  • The forward method takes an input tensor x and applies the linear layer to it, returning the result.
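
Under the hood, nn.Linear computes x @ W.T + b with a learnable weight matrix W and bias vector b. As a small sanity check (not in the original text), we can reproduce its output manually:

## Verifying that nn.Linear computes x @ W.T + b
x = torch.randn(5, 4)                    ## a batch of 5 samples with 4 features
W, b = model.linear.weight, model.linear.bias
manual = x @ W.T + b
print(torch.allclose(model(x), manual))  ## True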

After the Linear class is defined, an instance of the class is created and assigned to the model variable. The model object has two learnable parameters: the weights and the bias of the linear layer. They can be accessed by converting the output of the parameters method to a list and indexing it: the weights are the first element, and the bias is the second.
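
Alternatively, the named_parameters method yields each parameter together with its attribute path, which often reads more clearly than positional indexing (a small convenience, equivalent to the indexing above):

## Accessing parameters by name instead of by position
for name, param in model.named_parameters():
    print(name, param.shape)
## linear.weight torch.Size([1, 4])
## linear.bias torch.Size([1])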

Now let’s run through some epochs and train our model. We are using the same SGD optimizer as in the last chapter, together with the train_one_epoch and validate_one_epoch functions defined above.

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
for epoch in range(10):    
    # run one training loop
    train_one_epoch(model, train_dataloader, optimizer)
    # run validation loop on training to compute training loss
    train_loss = validate_one_epoch(model, train_dataloader, optimizer)
    # run validation loop on testing to compute test loss
    test_loss = validate_one_epoch(model, test_dataloader, optimizer)
    
    print(f"Epoch {epoch},Train MSE: {train_loss:.4f} Test MSE: {test_loss:.3f}")
    
print(f"Actual coefficients are: \n{np.round(coef,4)} \nTrained model weights are: \n{np.round(list(model.parameters())[0].detach().numpy()[0],4)}")
print(f"Actual Bias term is {2} \nTrained model bias term is \n{list(model.parameters())[1].detach().numpy()[0]:.4f}")
Epoch 0, Train MSE: 14168.0879 Test MSE: 16699.098
Epoch 1, Train MSE: 285.8107 Test MSE: 347.672
Epoch 2, Train MSE: 11.2080 Test MSE: 13.391
Epoch 3, Train MSE: 5.7762 Test MSE: 5.876
Epoch 4, Train MSE: 5.6652 Test MSE: 5.653
Epoch 5, Train MSE: 5.6483 Test MSE: 5.556
Epoch 6, Train MSE: 5.6559 Test MSE: 5.576
Epoch 7, Train MSE: 5.6767 Test MSE: 5.539
Epoch 8, Train MSE: 5.6488 Test MSE: 5.557
Epoch 9, Train MSE: 5.6495 Test MSE: 5.552
Actual coefficients are: 
[63.0061 44.1452 84.3648  9.3378] 
Trained model weights are: 
[62.9917 44.1615 84.3643  9.3292]
Actual Bias term is 2 
Trained model bias term is 
1.9978

As shown above, our model has fit the data well, just as in the last chapter.

7.3 Saving and Loading Models

If we want to save only the learned parameters of the model, we can pass its state_dict to torch.save as follows:

path = "../models/linear_model.pt"
torch.save(model.state_dict(), path)
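
For completeness, torch.save can also serialize the entire model object rather than just its parameters. The state_dict approach above is generally preferred because it is less brittle when the model's code changes, but the full-object form looks like this (a hedged aside, using a hypothetical path):

## Alternative: save the entire model object (pickles the class as well).
## Depending on the PyTorch version, loading a full object may require
## torch.load(path, weights_only=False).
torch.save(model, "../models/linear_model_full.pt")  ## hypothetical path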

To reload the saved parameters, we first need to instantiate a new model object and then load the saved parameters into it:

model_new = Linear(X.shape[1], 1)
model_new.load_state_dict(torch.load(path))
print(f"Loaded model weights are: \n{np.round(list(model_new.parameters())[0].detach().numpy()[0],4)}")
print(f"\nLoaded model bias term is \n{list(model_new.parameters())[1].detach().numpy()[0]:.4f}")
Loaded model weights are: 
[62.9917 44.1615 84.3643  9.3292]

Loaded model bias term is 
1.9978
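
One common follow-up step (not shown in the original) is to switch the reloaded model to evaluation mode before running inference. It makes no difference for this purely linear model, but it matters for models containing layers such as dropout or batch normalization:

## Put the reloaded model in evaluation mode before inference.
## A no-op for this linear model, but it disables dropout and
## freezes batch-norm statistics in models that use those layers.
model_new.eval()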

7.4 Exercise: What is torch.nn really?

Now that we have a good understanding of the fundamental concepts of PyTorch, I highly recommend reading the tutorial by Jeremy Howard from fast.ai titled “WHAT IS TORCH.NN REALLY?”. This tutorial covers everything we have learned so far and goes into more depth on the torch.nn module by showing how to implement it from scratch. It also introduces a new design pattern for building models using the nn.Sequential object, which lets you define a model as a sequential chain of layers. This is a simpler way of creating small networks than writing them from scratch with the nn.Module class, as the short sketch below illustrates.
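
As a brief, hedged sketch of that pattern, our one-layer regression model can be written with nn.Sequential in a single expression, and deeper networks are built by simply chaining layers:

## The same one-layer model expressed with nn.Sequential
model_seq = nn.Sequential(nn.Linear(4, 1))

## A slightly deeper example chaining layers and an activation
mlp = nn.Sequential(
    nn.Linear(4, 16),
    nn.ReLU(),
    nn.Linear(16, 1),
)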

7.5 References