11 Modeling pipeline with fastai’s Mid-Level API (draft)

from fastai.vision.all import *
dPath = Path("../data/mnist_png/")
This chapter will cover the process of training a model for multi-class classification on MNIST data using the fastai mid-level API. The image below illustrates the general steps involved in using the mid-level API:
11.1 Download data
Download instructions can be found in the Downloading Data from Kaggle section of the Modeling pipeline with Neural Networks chapter.
11.2 Creating a dataloader using the DataBlock API
To begin, we import the fastai.vision module, as we are working with an image classification task in this case. Let’s review the fastai DataBlock API before proceeding to create our dataloaders.

The fastai DataBlock API is a powerful tool for preparing data for deep learning models. The high-level process of using this API involves the following steps:
1. Defining blocks: The first step is to define the blocks that make up your data. For example, you may have an image dataset where each image is associated with a label. In this case, you would define an ImageBlock for the images and a CategoryBlock for the labels.
2. Getting inputs and labels: The next step is to get your inputs and labels into the appropriate format for the blocks you defined. This may involve loading images from disk, resizing them, and converting them to tensors. You may also need to preprocess your labels, for example, by converting them to numerical values.
3. Splitting the data: Once your data is in the appropriate format, you can split it into training, validation, and test sets. The DataBlock API provides convenient methods for doing this, such as using a random or stratified split (a random split is sketched just after this list). Refer to the data transformations documentation for a complete list of available splitting options.
4. Applying item transforms: Before your data can be fed into a model, you may need to apply item transforms. These are transformations applied to each item in the dataset, such as random cropping or flipping. The DataBlock API allows you to specify these transforms using the item_tfms argument.
5. Applying batch transforms: In addition to item transforms, you may want to apply transforms to batches of items. For example, you may want to normalize the pixel values across a batch of images. The DataBlock API allows you to specify batch transforms using the batch_tfms argument.
6. Creating dataloaders: Finally, you can create dataloaders for your training, validation, and test sets using the dataloaders method. These dataloaders will take care of loading your data in batches, applying transforms, and shuffling the data during training.
By following these steps, you can use the fastai DataBlock API to easily prepare your data for deep learning models.
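For instance, if your images were not already organized into separate training and testing folders, you could swap in a random split. Below is a minimal sketch reusing the imports above; the folder path, the names random_path and random_block, and the 20% validation fraction are illustrative assumptions, not part of this chapter’s dataset:

## Hypothetical dataset folder with one subfolder per class (illustrative path)
random_path = Path("../data/some_image_dataset/")

random_block = DataBlock(
    blocks = (ImageBlock, CategoryBlock),
    get_items = get_image_files,
    splitter = RandomSplitter(valid_pct=0.2, seed=42),   ## random 80/20 split instead of folder-based
    get_y = parent_label,
    item_tfms = Resize(28))

## random_dls = random_block.dataloaders(random_path, bs=128)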
Next, we will create our fastai DataBlock object using the DataBlock API.
dataset = DataBlock(
    blocks = (ImageBlock(cls = PILImageBW), CategoryBlock),   ## grayscale images as inputs, categories as targets
    get_items = get_image_files,                              ## gather every image file under the path
    splitter = GrandparentSplitter(train_name='training', valid_name='testing'),  ## split by grandparent folder name
    get_y = parent_label,                                     ## label comes from the parent folder name
    item_tfms = Resize(28),                                   ## resize every image to 28x28
    batch_tfms = None
)
dls = dataset.dataloaders(dPath, bs=128)
print(dls.vocab) ## Prints class labels
print(dls.c) ## Prints number of classes
dls.show_batch(max_n=24, figsize=(10,6)) ## Show sample data
['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
10
dls.one_batch()[0].shape, dls.one_batch()[1].shape
(torch.Size([128, 1, 28, 28]), torch.Size([128]))
Each batch therefore contains 128 single-channel 28x28 images and their 128 labels. Next, we define a simple multilayer perceptron (MLP) to use as our model:

class MLP(nn.Module):
    def __init__(self, n_in, n_out):
        super().__init__()
        self.model = nn.Sequential(
            nn.Linear(n_in, 256),
            nn.ReLU(),
            nn.Linear(256, 128),
            nn.ReLU(),
            nn.Linear(128, n_out))

    def forward(self, x):
        ## Flatten each 1x28x28 image into a 784-dimensional vector
        return self.model(x.view(-1, 784))
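Before wrapping the model in a Learner, a quick forward pass on a fake batch can confirm the flattening logic. A minimal sketch; the names test_model and xb are illustrative:

test_model = MLP(784, 10)
xb = torch.randn(128, 1, 28, 28)   ## random batch matching the shape of dls.one_batch()[0]
print(test_model(xb).shape)        ## torch.Size([128, 10]): one logit per class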
## Defining the learner
model = MLP(784, 10)
mlp_learner = Learner(
    dls = dls,
    model = model,
    loss_func = F.cross_entropy,
    model_dir = dPath/"models",
    metrics = accuracy)
## Finding the ideal learning rate
mlp_learner.lr_find()
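lr_find runs a short mock training over a range of learning rates and plots loss against learning rate; the value passed to fit_one_cycle below is read off that plot. If you want the suggestion programmatically rather than visually, you can capture the return value. A minimal sketch, assuming a recent fastai version where lr_find returns a SuggestedLRs namedtuple:

suggested = mlp_learner.lr_find()   ## re-runs the range test and captures the suggestion
print(suggested.valley)             ## suggested learning rate at the valley of the loss curve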
mlp_learner.fit_one_cycle(5, 5e-2) ## Train for 5 epochs with a max learning rate of 5e-2
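After training, it is worth confirming the final validation metrics and keeping the weights; since model_dir was set when the Learner was created, Learner.save writes there. A minimal sketch; the file name mlp_mnist is illustrative:

print(mlp_learner.validate())   ## [validation loss, accuracy] on the validation set
mlp_learner.save('mlp_mnist')   ## saves the weights under dPath/"models"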