
Image Classification with PyTorch

"Traditionally, the only way to get a computer to do something -- from adding two numbers to flying an airplane --was to write down an algorithm explaining how, in painstaking detail. But machine learning, also known as learners, are different: they figure it out on their own, by making inferences from data. And the more data they have, the better the get. Now we don't have to program computers: they program themselves." (from "The Master Algorithm by Pedro Domingo)

Image Classification is a supervised learning problem: define a set of target classes (objects to identify in images), and train a model to recognize them using labeled example photos.

This notebook applies the PyTorch techniques from the series https://jovian.ai/learn/deep-learning-with-pytorch-zero-to-gans to the CIFAR100 dataset.

I learned a lot. I also learned that I've only scratched the surface; there's still plenty of deep learning, and countless hours of training, ahead of me before I can comfortably say, "Yes, I GOT IT!"

In [3]:
!pip install jovian --upgrade --quiet
In [ ]:
#import os
import torch
import torchvision
import numpy as np
#import tarfile
import torch.nn as nn
import torch.nn.functional as F
#from torchvision.datasets.utils import download_url
from torchvision.datasets import CIFAR100
#from torchvision.datasets import ImageFolder
from torchvision.transforms import ToTensor
from torchvision.utils import make_grid
from torch.utils.data import random_split
from torch.utils.data import DataLoader

import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline

matplotlib.rcParams['figure.facecolor'] = '#ffffff'

CIFAR100 Dataset

In [4]:
project_name='image-classification-project-2021'

Download the Dataset

In [ ]:
dataset = CIFAR100(root = 'data/', download = True, transform = ToTensor())
test_ds = CIFAR100(root = 'data/', train = False, transform = ToTensor())
#train_dataset = CIFAR100(root = 'data/', download = True, transform = train_transform)
#test_dataset = CIFAR100(root = 'data/', train = False, transform = test_transform)
Downloading https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz to data/cifar-100-python.tar.gz
Extracting data/cifar-100-python.tar.gz to data/

Explore the Dataset

In [ ]:
print(dataset)
print(test_ds)
Dataset CIFAR100
    Number of datapoints: 50000
    Root location: data/
    Split: Train
    StandardTransform
Transform: ToTensor()
Dataset CIFAR100
    Number of datapoints: 10000
    Root location: data/
    Split: Test
    StandardTransform
Transform: ToTensor()
In [ ]:
dataset_size = len(dataset)
test_ds_size = len(test_ds)
print('total images in dataset:', dataset_size)
print('total images in test dataset:', test_ds_size)
total images in dataset: 50000
total images in test dataset: 10000

Let's find out how many output classes CIFAR100 has

In [ ]:
print('total classes:', len(dataset.classes))
print(dataset.classes)
total classes: 100 ['apple', 'aquarium_fish', 'baby', 'bear', 'beaver', 'bed', 'bee', 'beetle', 'bicycle', 'bottle', 'bowl', 'boy', 'bridge', 'bus', 'butterfly', 'camel', 'can', 'castle', 'caterpillar', 'cattle', 'chair', 'chimpanzee', 'clock', 'cloud', 'cockroach', 'couch', 'crab', 'crocodile', 'cup', 'dinosaur', 'dolphin', 'elephant', 'flatfish', 'forest', 'fox', 'girl', 'hamster', 'house', 'kangaroo', 'keyboard', 'lamp', 'lawn_mower', 'leopard', 'lion', 'lizard', 'lobster', 'man', 'maple_tree', 'motorcycle', 'mountain', 'mouse', 'mushroom', 'oak_tree', 'orange', 'orchid', 'otter', 'palm_tree', 'pear', 'pickup_truck', 'pine_tree', 'plain', 'plate', 'poppy', 'porcupine', 'possum', 'rabbit', 'raccoon', 'ray', 'road', 'rocket', 'rose', 'sea', 'seal', 'shark', 'shrew', 'skunk', 'skyscraper', 'snail', 'snake', 'spider', 'squirrel', 'streetcar', 'sunflower', 'sweet_pepper', 'table', 'tank', 'telephone', 'television', 'tiger', 'tractor', 'train', 'trout', 'tulip', 'turtle', 'wardrobe', 'whale', 'willow_tree', 'wolf', 'woman', 'worm']

Let's find out the shape of an image tensor from our dataset. Note that ToTensor returns a channels-first (3, 32, 32) float tensor with pixel values scaled to [0, 1].

In [ ]:
for img, label in dataset:
  print('img.shape: ', img.shape)
  print('label: ', label)
  print()
  print('img tensor: ') 
  print(img)
  break
img.shape: torch.Size([3, 32, 32])
label: 19

img tensor:
tensor([[[1.0000, 1.0000, 1.0000,  ..., 0.7647, 0.8314, 0.7137],
         [1.0000, 0.9961, 0.9961,  ..., 0.6667, 0.6314, 0.5725],
         [1.0000, 0.9961, 1.0000,  ..., 0.7412, 0.6510, 0.4745],
         ...,
         [0.5804, 0.5569, 0.5490,  ..., 0.1176, 0.2549, 0.2980],
         [0.4784, 0.4706, 0.4941,  ..., 0.0863, 0.3804, 0.5529],
         [0.3412, 0.3451, 0.3961,  ..., 0.1333, 0.4118, 0.5412]],

        [[1.0000, 1.0000, 1.0000,  ..., 0.8039, 0.8784, 0.7608],
         [1.0000, 0.9961, 0.9961,  ..., 0.6902, 0.6588, 0.6039],
         [1.0000, 0.9961, 1.0000,  ..., 0.7804, 0.6980, 0.5216],
         ...,
         [0.7255, 0.7137, 0.7020,  ..., 0.0667, 0.2431, 0.3020],
         [0.6157, 0.6078, 0.6275,  ..., 0.0627, 0.4392, 0.6314],
         [0.4784, 0.4784, 0.5255,  ..., 0.1412, 0.5216, 0.6784]],

        [[1.0000, 1.0000, 1.0000,  ..., 0.7569, 0.8000, 0.6549],
         [1.0000, 0.9961, 0.9961,  ..., 0.5882, 0.5098, 0.4431],
         [1.0000, 0.9961, 1.0000,  ..., 0.6627, 0.5098, 0.3412],
         ...,
         [0.3098, 0.2235, 0.2353,  ..., 0.0039, 0.0588, 0.0784],
         [0.2588, 0.2275, 0.2784,  ..., 0.0118, 0.2196, 0.3412],
         [0.1608, 0.1529, 0.2196,  ..., 0.0392, 0.2314, 0.3098]]])
In [ ]:
dataset_classes_dict = dict()

for item in dataset:
    label = dataset.classes[item[1]]
    if label in dataset_classes_dict:
      dataset_classes_dict[label] += 1
    else:
      dataset_classes_dict[label] = 1

dataset_classes_dict
Out[]:
{'apple': 500,
 'aquarium_fish': 500,
 'baby': 500,
 'bear': 500,
 'beaver': 500,
 'bed': 500,
 'bee': 500,
 'beetle': 500,
 'bicycle': 500,
 'bottle': 500,
 'bowl': 500,
 'boy': 500,
 'bridge': 500,
 'bus': 500,
 'butterfly': 500,
 'camel': 500,
 'can': 500,
 'castle': 500,
 'caterpillar': 500,
 'cattle': 500,
 'chair': 500,
 'chimpanzee': 500,
 'clock': 500,
 'cloud': 500,
 'cockroach': 500,
 'couch': 500,
 'crab': 500,
 'crocodile': 500,
 'cup': 500,
 'dinosaur': 500,
 'dolphin': 500,
 'elephant': 500,
 'flatfish': 500,
 'forest': 500,
 'fox': 500,
 'girl': 500,
 'hamster': 500,
 'house': 500,
 'kangaroo': 500,
 'keyboard': 500,
 'lamp': 500,
 'lawn_mower': 500,
 'leopard': 500,
 'lion': 500,
 'lizard': 500,
 'lobster': 500,
 'man': 500,
 'maple_tree': 500,
 'motorcycle': 500,
 'mountain': 500,
 'mouse': 500,
 'mushroom': 500,
 'oak_tree': 500,
 'orange': 500,
 'orchid': 500,
 'otter': 500,
 'palm_tree': 500,
 'pear': 500,
 'pickup_truck': 500,
 'pine_tree': 500,
 'plain': 500,
 'plate': 500,
 'poppy': 500,
 'porcupine': 500,
 'possum': 500,
 'rabbit': 500,
 'raccoon': 500,
 'ray': 500,
 'road': 500,
 'rocket': 500,
 'rose': 500,
 'sea': 500,
 'seal': 500,
 'shark': 500,
 'shrew': 500,
 'skunk': 500,
 'skyscraper': 500,
 'snail': 500,
 'snake': 500,
 'spider': 500,
 'squirrel': 500,
 'streetcar': 500,
 'sunflower': 500,
 'sweet_pepper': 500,
 'table': 500,
 'tank': 500,
 'telephone': 500,
 'television': 500,
 'tiger': 500,
 'tractor': 500,
 'train': 500,
 'trout': 500,
 'tulip': 500,
 'turtle': 500,
 'wardrobe': 500,
 'whale': 500,
 'willow_tree': 500,
 'wolf': 500,
 'woman': 500,
 'worm': 500}
In [ ]:
test_classes_dict = dict()

for test_item in test_ds:
    label = test_ds.classes[test_item[1]]
    if label in test_classes_dict:
      test_classes_dict[label] += 1
    else:
      test_classes_dict[label] = 1

test_classes_dict
Out[]:
{'apple': 100,
 'aquarium_fish': 100,
 'baby': 100,
 'bear': 100,
 'beaver': 100,
 'bed': 100,
 'bee': 100,
 'beetle': 100,
 'bicycle': 100,
 'bottle': 100,
 'bowl': 100,
 'boy': 100,
 'bridge': 100,
 'bus': 100,
 'butterfly': 100,
 'camel': 100,
 'can': 100,
 'castle': 100,
 'caterpillar': 100,
 'cattle': 100,
 'chair': 100,
 'chimpanzee': 100,
 'clock': 100,
 'cloud': 100,
 'cockroach': 100,
 'couch': 100,
 'crab': 100,
 'crocodile': 100,
 'cup': 100,
 'dinosaur': 100,
 'dolphin': 100,
 'elephant': 100,
 'flatfish': 100,
 'forest': 100,
 'fox': 100,
 'girl': 100,
 'hamster': 100,
 'house': 100,
 'kangaroo': 100,
 'keyboard': 100,
 'lamp': 100,
 'lawn_mower': 100,
 'leopard': 100,
 'lion': 100,
 'lizard': 100,
 'lobster': 100,
 'man': 100,
 'maple_tree': 100,
 'motorcycle': 100,
 'mountain': 100,
 'mouse': 100,
 'mushroom': 100,
 'oak_tree': 100,
 'orange': 100,
 'orchid': 100,
 'otter': 100,
 'palm_tree': 100,
 'pear': 100,
 'pickup_truck': 100,
 'pine_tree': 100,
 'plain': 100,
 'plate': 100,
 'poppy': 100,
 'porcupine': 100,
 'possum': 100,
 'rabbit': 100,
 'raccoon': 100,
 'ray': 100,
 'road': 100,
 'rocket': 100,
 'rose': 100,
 'sea': 100,
 'seal': 100,
 'shark': 100,
 'shrew': 100,
 'skunk': 100,
 'skyscraper': 100,
 'snail': 100,
 'snake': 100,
 'spider': 100,
 'squirrel': 100,
 'streetcar': 100,
 'sunflower': 100,
 'sweet_pepper': 100,
 'table': 100,
 'tank': 100,
 'telephone': 100,
 'television': 100,
 'tiger': 100,
 'tractor': 100,
 'train': 100,
 'trout': 100,
 'tulip': 100,
 'turtle': 100,
 'wardrobe': 100,
 'whale': 100,
 'willow_tree': 100,
 'wolf': 100,
 'woman': 100,
 'worm': 100}
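
As an aside, the same per-class counts can be computed more compactly with collections.Counter; a minimal sketch, equivalent to the two loops above (not run here):

from collections import Counter

# Each dataset item is an (image, label) pair; count the label names directly.
train_counts = Counter(dataset.classes[label] for _, label in dataset)
test_counts = Counter(test_ds.classes[label] for _, label in test_ds)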

View some elements in the dataset

We can view the image using matplotlib, but we need to permute the tensor dimensions from (3, 32, 32) to (32, 32, 3). Let's create a helper function to display an image and its label.

In [ ]:
def show_example(img, label):
    print('Label: ', dataset.classes[label], "("+str(label)+")")
    plt.imshow(img.permute(1, 2, 0))
In [ ]:
img, label = dataset[0]
show_example(img, label)
Label: cattle (19)
Notebook Image
In [ ]:
show_example(*dataset[7500])
Label: rose (70)
Notebook Image

Save and upload our notebook to Jovian

In [ ]:
!pip install jovian --upgrade -q
In [ ]:
import jovian
In [ ]:
jovian.commit(project=project_name)
[jovian] Detected Colab notebook...
[jovian] Please enter your API key ( from https://jovian.ai/ ): API KEY: ··········
[jovian] Uploading colab notebook to Jovian...
[jovian] Capturing environment..
[jovian] Committed successfully! https://jovian.ai/tessdja/image-classification-project-2021

Prepare the Data for Training

In [ ]:
random_seed = 43
torch.manual_seed(random_seed);
Use the random_split method to create the training and validation sets
In [ ]:
val_size = 10000
train_size = len(dataset) - val_size

train_ds, val_ds = random_split(dataset, [train_size, val_size])
print('total images in our training set: ', len(train_ds))
print('total images in our validation set: ', len(val_ds))
total images in our training set:  40000
total images in our validation set:  10000

The jovian library also provides a simple API for recording important parameters related to the dataset, model training, results etc. for easy reference and comparison between multiple experiments. Let's record dataset_url, val_size and random_seed using jovian.log_dataset.

In [ ]:
# from torchvision.datasets import CIFAR100
dataset_url = 'https://www.cs.toronto.edu/~kriz/cifar.html'
jovian.log_dataset(dataset_url=dataset_url, val_size=val_size, random_seed=random_seed)
[jovian] Dataset logged.
Define the data loaders for training and validation, to load the data in batches
In [ ]:
batch_size=128
In [ ]:
train_dl = DataLoader(train_ds, 
                      batch_size, 
                      shuffle=True, 
                      num_workers=4, 
                      pin_memory=True)

val_dl = DataLoader(val_ds, 
                    batch_size,       # note: the feedforward lesson uses batch_size*2 here
                    num_workers=4, 
                    pin_memory=True)

test_dl = DataLoader(test_ds, 
                    batch_size,      # note: the feedforward lesson uses batch_size*2 here
                    num_workers=4, 
                    pin_memory=True)

Visualize a batch of data using the make_grid helper from Torchvision

In [ ]:
from torchvision.utils import make_grid

def show_batch(dl):
    for images, labels in dl:
        fig, ax = plt.subplots(figsize=(12, 6))
        ax.set_xticks([]) 
        ax.set_yticks([])
        ax.imshow(make_grid(images, nrow=16).permute(1, 2, 0))
        break
In [ ]:
show_batch(train_dl)
Notebook Image
In [ ]:
jovian.commit(project=project_name, environment=None)
[jovian] Detected Colab notebook...
[jovian] Uploading colab notebook to Jovian...
[jovian] Attaching records (metrics, hyperparameters, dataset etc.)
[jovian] Committed successfully! https://jovian.ai/tessdja/image-classification-project-2021

Settings: Device and DeviceDataLoaders

To seamlessly use a GPU, if one is available, we define a couple of helper functions (get_default_device & to_device) and a helper class DeviceDataLoader to move our model & data to the GPU as required.

In [ ]:
def get_default_device():
    """Pick GPU if available, else CPU"""
    if torch.cuda.is_available():
        return torch.device('cuda')
    else:
        return torch.device('cpu')
    
def to_device(data, device):
    """Move tensor(s) to chosen device"""
    if isinstance(data, (list,tuple)):
        return [to_device(x, device) for x in data]
    return data.to(device, non_blocking=True)

class DeviceDataLoader():
    """Wrap a dataloader to move data to a device"""
    def __init__(self, dl, device):
        self.dl = dl
        self.device = device
        
    def __iter__(self):
        """Yield a batch of data after moving it to device"""
        for b in self.dl: 
            yield to_device(b, self.device)

    def __len__(self):
        """Number of batches"""
        return len(self.dl)
In [ ]:
torch.cuda.is_available()
Out[]:
True
In [ ]:
device = get_default_device()
device
Out[]:
device(type='cuda')

Let's move our data loaders to the appropriate device.

In [ ]:
train_dl = DeviceDataLoader(train_dl, device)
val_dl = DeviceDataLoader(val_dl, device)
test_dl = DeviceDataLoader(test_dl, device)

Let us also define a couple of helper functions for plotting the losses & accuracies.

In [ ]:
def plot_losses(history):
    losses = [x['val_loss'] for x in history]
    plt.plot(losses, '-x')
    plt.xlabel('epoch')
    plt.ylabel('loss')
    plt.title('Loss vs. No. of epochs');
In [ ]:
def plot_accuracies(history):
    accuracies = [x['val_acc'] for x in history]
    plt.plot(accuracies, '-x')
    plt.xlabel('epoch')
    plt.ylabel('accuracy')
    plt.title('Accuracy vs. No. of epochs');

Let's create a base model class, which contains everything except the model architecture, i.e., it will not contain the __init__ and forward methods.

In [ ]:
def accuracy(outputs, labels):
    _, preds = torch.max(outputs, dim=1)
    return torch.tensor(torch.sum(preds == labels).item() / len(preds))
In [ ]:
class ImageClassificationBase(nn.Module):
    def training_step(self, batch):
        images, labels = batch 
        out = self(images)                  # Generate predictions
        loss = F.cross_entropy(out, labels) # Calculate loss
        return loss
    
    def validation_step(self, batch):
        images, labels = batch 
        out = self(images)                    # Generate predictions
        loss = F.cross_entropy(out, labels)   # Calculate loss
        acc = accuracy(out, labels)           # Calculate accuracy
        return {'val_loss': loss.detach(), 'val_acc': acc}
        
    def validation_epoch_end(self, outputs):
        batch_losses = [x['val_loss'] for x in outputs]
        epoch_loss = torch.stack(batch_losses).mean()   # Combine losses
        batch_accs = [x['val_acc'] for x in outputs]
        epoch_acc = torch.stack(batch_accs).mean()      # Combine accuracies
        return {'val_loss': epoch_loss.item(), 'val_acc': epoch_acc.item()}
    
    def epoch_end(self, epoch, result):
        print("Epoch [{}], val_loss: {:.4f}, val_acc: {:.4f}".format(epoch, result['val_loss'], result['val_acc']))

Train our First Model - Feedforward Neural Network

"A feedforward network is a network that contains inputs, outputs, and hidden layers. The signals can only travel in one direction (forward). Input data passes into a layer where calculations are performed. Each processing element computes based upon the weighted sum of its inputs. The new values become the new input values that feed the next layer (feed-forward). This continues through all the layers and determines the output."

Source: https://towardsdatascience.com/simply-deep-learning-an-effortless-introduction-45591a1c4abb
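
To make the weighted-sum idea concrete, here is a minimal sketch of a single feedforward step on a flattened CIFAR100 image. The layer sizes match the model defined below; the weight and bias tensors are random placeholders, not trained values:

import torch
import torch.nn.functional as F

x = torch.randn(3072)         # a flattened 3x32x32 image
W = torch.randn(1034, 3072)   # first hidden layer's weights (placeholder)
b = torch.randn(1034)         # first hidden layer's bias (placeholder)
h = F.relu(W @ x + b)         # each unit: weighted sum of inputs, then ReLU
print(h.shape)                # torch.Size([1034])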

In [ ]:
input_size = 3*32*32      #3*32*32=3072
output_size = len(dataset.classes)
hidden_size1 = 1034 
hidden_size2 = 1034 
In [ ]:
class CIFAR100Model(ImageClassificationBase):     # feed forward
    def __init__(self):
        super().__init__()
        # hidden layers
        self.linear1a = nn.Linear(input_size, hidden_size1)
        self.linear1b = nn.Linear(hidden_size1, hidden_size2)
        self.linear2 = nn.Linear(hidden_size2, output_size)
        
    def forward(self, xb):
        # Flatten images into vectors
        xb = xb.view(xb.size(0), -1)

        # Apply hidden layers & activation functions
        out = self.linear1a(xb)
        out = F.relu(out)
        out = self.linear1b(out)
        out = F.relu(out)

        #apply output layer
        out = self.linear2(out)
        return out
In [ ]:
def evaluate(model, val_loader):
    outputs = [model.validation_step(batch) for batch in val_loader]
    return model.validation_epoch_end(outputs)

def fit(epochs, lr, model, train_loader, val_loader, opt_func=torch.optim.SGD):
    history = []
    optimizer = opt_func(model.parameters(), lr)
    for epoch in range(epochs):
        # Training Phase 
        for batch in train_loader:
            loss = model.training_step(batch)
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
        # Validation phase
        result = evaluate(model, val_loader)
        model.epoch_end(epoch, result)
        history.append(result)
    return history

Now instantiate the model, and move it to the appropriate device.

In [ ]:
model_ff = to_device(CIFAR100Model(), device)

Before you train the model, it's a good idea to check the validation loss & accuracy with the initial set of weights.
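
As a quick sanity check (a calculation, not part of the original run): with 100 classes, random weights should give accuracy near chance and a cross-entropy loss near ln(100):

import math
print(1 / 100)        # 0.01   -> chance-level accuracy for 100 classes
print(math.log(100))  # 4.6052 -> cross-entropy of a uniform prediction

The initial evaluation below should land close to these values.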

In [ ]:
history_ff = [evaluate(model_ff, val_dl)]
history_ff
Out[]:
[{'val_acc': 0.01068037934601307, 'val_loss': 4.6069231033325195}]

Train the model using the fit function to reduce the validation loss & improve accuracy.

In [ ]:
history_ff += fit(30, 0.1, model_ff, train_dl, val_dl)
Epoch [0], val_loss: 4.0449, val_acc: 0.0824
Epoch [1], val_loss: 3.8735, val_acc: 0.1056
Epoch [2], val_loss: 3.7286, val_acc: 0.1280
Epoch [3], val_loss: 3.6479, val_acc: 0.1408
Epoch [4], val_loss: 3.6600, val_acc: 0.1497
Epoch [5], val_loss: 3.5546, val_acc: 0.1585
Epoch [6], val_loss: 3.5533, val_acc: 0.1702
Epoch [7], val_loss: 3.5672, val_acc: 0.1691
Epoch [8], val_loss: 3.3534, val_acc: 0.2009
Epoch [9], val_loss: 3.3793, val_acc: 0.1975
Epoch [10], val_loss: 3.3574, val_acc: 0.2002
Epoch [11], val_loss: 3.3141, val_acc: 0.2111
Epoch [12], val_loss: 3.4315, val_acc: 0.1900
Epoch [13], val_loss: 3.3081, val_acc: 0.2084
Epoch [14], val_loss: 3.2463, val_acc: 0.2253
Epoch [15], val_loss: 3.3202, val_acc: 0.2179
Epoch [16], val_loss: 3.2573, val_acc: 0.2228
Epoch [17], val_loss: 3.2403, val_acc: 0.2314
Epoch [18], val_loss: 3.2819, val_acc: 0.2258
Epoch [19], val_loss: 3.2591, val_acc: 0.2334
Epoch [20], val_loss: 3.3017, val_acc: 0.2241
Epoch [21], val_loss: 3.4450, val_acc: 0.2096
Epoch [22], val_loss: 3.3608, val_acc: 0.2223
Epoch [23], val_loss: 3.2544, val_acc: 0.2373
Epoch [24], val_loss: 3.3379, val_acc: 0.2307
Epoch [25], val_loss: 3.3165, val_acc: 0.2345
Epoch [26], val_loss: 3.3505, val_acc: 0.2348
Epoch [27], val_loss: 3.3001, val_acc: 0.2511
Epoch [28], val_loss: 3.4070, val_acc: 0.2290
Epoch [29], val_loss: 3.4658, val_acc: 0.2289
In [ ]:
history_ff += fit(20, 0.01, model_ff, train_dl, val_dl)
Epoch [0], val_loss: 3.1604, val_acc: 0.2850
Epoch [1], val_loss: 3.1674, val_acc: 0.2885
Epoch [2], val_loss: 3.1711, val_acc: 0.2856
Epoch [3], val_loss: 3.1871, val_acc: 0.2848
Epoch [4], val_loss: 3.1815, val_acc: 0.2871
Epoch [5], val_loss: 3.1906, val_acc: 0.2856
Epoch [6], val_loss: 3.1939, val_acc: 0.2857
Epoch [7], val_loss: 3.2063, val_acc: 0.2863
Epoch [8], val_loss: 3.2096, val_acc: 0.2871
Epoch [9], val_loss: 3.2035, val_acc: 0.2872
Epoch [10], val_loss: 3.2174, val_acc: 0.2814
Epoch [11], val_loss: 3.2201, val_acc: 0.2852
Epoch [12], val_loss: 3.2196, val_acc: 0.2867
Epoch [13], val_loss: 3.2156, val_acc: 0.2870
Epoch [14], val_loss: 3.2289, val_acc: 0.2860
Epoch [15], val_loss: 3.2305, val_acc: 0.2842
Epoch [16], val_loss: 3.2421, val_acc: 0.2823
Epoch [17], val_loss: 3.2387, val_acc: 0.2853
Epoch [18], val_loss: 3.2496, val_acc: 0.2833
Epoch [19], val_loss: 3.2692, val_acc: 0.2839
In [ ]:
history_ff += fit(10, 0.001, model_ff, train_dl, val_dl)
Epoch [0], val_loss: 3.2446, val_acc: 0.2860
Epoch [1], val_loss: 3.2459, val_acc: 0.2876
Epoch [2], val_loss: 3.2464, val_acc: 0.2864
Epoch [3], val_loss: 3.2468, val_acc: 0.2861
Epoch [4], val_loss: 3.2471, val_acc: 0.2862
Epoch [5], val_loss: 3.2480, val_acc: 0.2872
Epoch [6], val_loss: 3.2477, val_acc: 0.2868
Epoch [7], val_loss: 3.2499, val_acc: 0.2860
Epoch [8], val_loss: 3.2490, val_acc: 0.2869
Epoch [9], val_loss: 3.2512, val_acc: 0.2861

Plot the losses and the accuracies to check if you're starting to hit the limits of how well your model can perform on this dataset.

In [ ]:
plot_losses(history_ff)
Notebook Image
In [ ]:
plot_accuracies(history_ff)
Notebook Image
In [ ]:
test_ff = evaluate(model_ff, test_dl)
test_ff
Out[]:
{'val_acc': 0.296281635761261, 'val_loss': 3.228579521179199}
In [ ]:
jovian.commit(project=project_name, environment=None)
[jovian] Detected Colab notebook...
[jovian] Uploading colab notebook to Jovian...
[jovian] Attaching records (metrics, hyperparameters, dataset etc.)
[jovian] Committed successfully! https://jovian.ai/tessdja/image-classification-project-2021

Record our results for the feedforward model, including:

  • the model's architecture
  • the learning rates used
  • the number of epochs per learning rate
In [ ]:
arch_ff = "4 layers (1034, 1034, 10)"
lrs_ff = [0.1, 0.01, 0.001]
epochs_ff = [30, 20, 10]
In [ ]:
test_acc_ff = 0.296281635761261
test_loss_ff = 3.228579521179199
In [ ]:
torch.save(model_ff.state_dict(), 'cifar100-feedforward.pth')
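
To restore these weights later, recreate the model and load the saved state dict; a minimal sketch, assuming the same CIFAR100Model class definition and device helpers are available:

model_restored = to_device(CIFAR100Model(), device)
model_restored.load_state_dict(torch.load('cifar100-feedforward.pth'))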

The jovian library provides some utility functions to keep your work organized. With every version of your notebook, you can attach some hyperparameters and metrics from your experiment.

In [ ]:
# Clear previously recorded hyperparams & metrics
jovian.reset()
In [ ]:
jovian.log_hyperparams(arch=arch_ff, 
                       lrs=lrs_ff, 
                       epochs=epochs_ff)
[jovian] Hyperparams logged.
In [ ]:
jovian.log_metrics(test_loss=test_loss_ff, test_acc=test_acc_ff)
[jovian] Metrics logged.
In [ ]:
jovian.commit(project=project_name, outputs=['cifar100-feedforward.pth'], environment=None)
[jovian] Detected Colab notebook...
[jovian] Uploading colab notebook to Jovian...
[jovian] Uploading additional outputs...
[jovian] Attaching records (metrics, hyperparameters, dataset etc.)
[jovian] Committed successfully! https://jovian.ai/tessdja/image-classification-project-2021

Train our Second Model - Convolutional Neural Networks

Convolutional neural networks (CNNs) are a class of deep learning networks. They can be found at the core of everything from Facebook’s photo tagging to self-driving cars, working hard behind the scenes in everything from healthcare to security.

A CNN has
  • Convolutional layers
  • ReLU layers
  • Pooling layers
  • a Fully connected layer

Source: https://towardsdatascience.com/wtf-is-image-classification-8e78a8235acb
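
Before defining the full network, it helps to verify how one convolution + pooling stage changes tensor shapes; a minimal sketch using the first stage's sizes (the shape comments in the model below follow the same arithmetic):

conv = nn.Conv2d(3, 32, kernel_size=3, padding=1)   # padding=1 preserves the 32x32 spatial size
pool = nn.MaxPool2d(2, 2)                           # halves height and width

x = torch.randn(1, 3, 32, 32)    # one CIFAR100-sized image (batch of 1)
print(conv(x).shape)             # torch.Size([1, 32, 32, 32])
print(pool(conv(x)).shape)       # torch.Size([1, 32, 16, 16])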

Let's define the model by extending the ImageClassificationBase class, which contains helper methods for training & validation.
We'll use nn.Sequential to chain the layers and activation functions into a single network architecture.

In [ ]:
class Cifar100CnnModel(ImageClassificationBase):
    def __init__(self):
        super().__init__()
        self.network = nn.Sequential(
            # input: 3 x 32 x 32
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            # output: 32 x 32 x 32
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1),
            # output: 64 x 32 x 32
            nn.ReLU(),
            # output: 64 x 32 x 32
            nn.MaxPool2d(2, 2), # output: 64 x 16 x 16

            nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.Conv2d(128, 128, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2, 2), # output: 128 x 8 x 8

            nn.Conv2d(128, 256, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2, 2), # output: 256 x 4 x 4

            nn.Flatten(), 
            nn.Linear(256*4*4, 1024),
            nn.ReLU(),
            nn.Linear(1024, 512),
            nn.ReLU(),
            nn.Linear(512, 100))
        
    def forward(self, xb):
        return self.network(xb)
In [ ]:
# torch.no_grad disables gradient tracking while we evaluate the model
@torch.no_grad()
def evaluate(model, val_loader):
    model.eval()
    outputs = [model.validation_step(batch) for batch in val_loader]
    return model.validation_epoch_end(outputs)

def fit(epochs, lr, model, train_loader, val_loader, opt_func=torch.optim.SGD):
    history = []
    optimizer = opt_func(model.parameters(), lr)
    for epoch in range(epochs):
        # Training Phase 
        model.train()
        train_losses = []
        for batch in train_loader:
            loss = model.training_step(batch)
            train_losses.append(loss)
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
        # Validation phase
        result = evaluate(model, val_loader)
        result['train_loss'] = torch.stack(train_losses).mean().item()
        model.epoch_end(epoch, result)
        history.append(result)
    return history
In [ ]:
model_cnn = to_device(Cifar100CnnModel(), device)
model_cnn
Out[]:
Cifar100CnnModel(
  (network): Sequential(
    (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU()
    (2): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): ReLU()
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (6): ReLU()
    (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (8): ReLU()
    (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU()
    (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (13): ReLU()
    (14): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (15): Flatten(start_dim=1, end_dim=-1)
    (16): Linear(in_features=4096, out_features=1024, bias=True)
    (17): ReLU()
    (18): Linear(in_features=1024, out_features=512, bias=True)
    (19): ReLU()
    (20): Linear(in_features=512, out_features=100, bias=True)
  )
)
In [ ]:
history_cnn = [evaluate(model_cnn, val_dl)]
history_cnn
Out[]:
[{'val_acc': 0.009098101407289505, 'val_loss': 4.605696201324463}]

Train our CNN Model for CIFAR100

In [ ]:
num_epochs_cnn = 20
opt_func = torch.optim.Adam
lr_cnn = 0.001
In [ ]:
jovian.reset()
jovian.log_hyperparams({
    'num_epochs': num_epochs_cnn,
    'opt_func': opt_func.__name__,
    'batch_size': batch_size,
    'lr': lr_cnn,
})
[jovian] Hyperparams logged.
In [ ]:
%%time
history_cnn = fit(num_epochs_cnn, lr_cnn, model_cnn, train_dl, val_dl, opt_func)
Epoch [0], val_loss: 3.9782, val_acc: 0.0712
Epoch [1], val_loss: 3.5055, val_acc: 0.1513
Epoch [2], val_loss: 3.2247, val_acc: 0.2066
Epoch [3], val_loss: 2.9864, val_acc: 0.2552
Epoch [4], val_loss: 2.8120, val_acc: 0.2988
Epoch [5], val_loss: 2.7137, val_acc: 0.3208
Epoch [6], val_loss: 2.6248, val_acc: 0.3468
Epoch [7], val_loss: 2.6485, val_acc: 0.3487
Epoch [8], val_loss: 2.6681, val_acc: 0.3591
Epoch [9], val_loss: 2.8115, val_acc: 0.3616
Epoch [10], val_loss: 3.1148, val_acc: 0.3554
Epoch [11], val_loss: 3.6211, val_acc: 0.3427
Epoch [12], val_loss: 3.8963, val_acc: 0.3436
Epoch [13], val_loss: 4.4802, val_acc: 0.3405
Epoch [14], val_loss: 4.9442, val_acc: 0.3342
Epoch [15], val_loss: 5.0891, val_acc: 0.3386
Epoch [16], val_loss: 5.7842, val_acc: 0.3390
Epoch [17], val_loss: 6.1024, val_acc: 0.3406
Epoch [18], val_loss: 6.0822, val_acc: 0.3364
Epoch [19], val_loss: 6.5672, val_acc: 0.3332
CPU times: user 1min 27s, sys: 29.4 s, total: 1min 57s
Wall time: 2min 40s
In [ ]:
plot_accuracies(history_cnn)
Notebook Image
In [ ]:
def plot_losses(history):
    train_losses = [x.get('train_loss') for x in history]
    val_losses = [x['val_loss'] for x in history]
    plt.plot(train_losses, '-bx')
    plt.plot(val_losses, '-rx')
    plt.xlabel('epoch')
    plt.ylabel('loss')
    plt.legend(['Training', 'Validation'])
    plt.title('Loss vs. No. of epochs');
In [ ]:
plot_losses(history_cnn)
Notebook Image
In [ ]:
jovian.commit(project=project_name)
[jovian] Detected Colab notebook...
[jovian] Uploading colab notebook to Jovian...
[jovian] Capturing environment..
[jovian] Attaching records (metrics, hyperparameters, dataset etc.)
[jovian] Committed successfully!