
Classifying Intel Natural Scenes Images using PyTorch

nature scene

This project is the result of the knowledge acquired during the course Deep Learning with PyTorch: Zero to GANs offered by Jovian.ai.

This project uses the open Intel Image Classification dataset, which contains images of natural scenes separated into 6 categories. The main goal is to define, train, and test a neural network model that classifies these images.

System Setup

Let's begin by installing and importing the required libraries.

In [10]:
# Uncomment and run the appropriate command for your operating system, if required

# Linux / Binder / Windows (No GPU)
# !pip install numpy matplotlib torch==1.7.0+cpu torchvision==0.8.1+cpu torchaudio==0.7.0 -f https://download.pytorch.org/whl/torch_stable.html

# Linux / Windows (GPU)
# !pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
 
# MacOS (NO GPU)
# !pip install numpy matplotlib torch torchvision torchaudio

!pip install opendatasets --upgrade
Requirement already up-to-date: opendatasets in /usr/local/lib/python3.6/dist-packages (0.1.10)
Requirement already satisfied, skipping upgrade: tqdm in /usr/local/lib/python3.6/dist-packages (from opendatasets) (4.41.1)
Requirement already satisfied, skipping upgrade: kaggle in /usr/local/lib/python3.6/dist-packages (from opendatasets) (1.5.10)
Requirement already satisfied, skipping upgrade: click in /usr/local/lib/python3.6/dist-packages (from opendatasets) (7.1.2)
Requirement already satisfied, skipping upgrade: requests in /usr/local/lib/python3.6/dist-packages (from kaggle->opendatasets) (2.23.0)
Requirement already satisfied, skipping upgrade: urllib3 in /usr/local/lib/python3.6/dist-packages (from kaggle->opendatasets) (1.24.3)
Requirement already satisfied, skipping upgrade: certifi in /usr/local/lib/python3.6/dist-packages (from kaggle->opendatasets) (2020.12.5)
Requirement already satisfied, skipping upgrade: python-slugify in /usr/local/lib/python3.6/dist-packages (from kaggle->opendatasets) (4.0.1)
Requirement already satisfied, skipping upgrade: python-dateutil in /usr/local/lib/python3.6/dist-packages (from kaggle->opendatasets) (2.8.1)
Requirement already satisfied, skipping upgrade: six>=1.10 in /usr/local/lib/python3.6/dist-packages (from kaggle->opendatasets) (1.15.0)
Requirement already satisfied, skipping upgrade: chardet<4,>=3.0.2 in /usr/local/lib/python3.6/dist-packages (from requests->kaggle->opendatasets) (3.0.4)
Requirement already satisfied, skipping upgrade: idna<3,>=2.5 in /usr/local/lib/python3.6/dist-packages (from requests->kaggle->opendatasets) (2.10)
Requirement already satisfied, skipping upgrade: text-unidecode>=1.3 in /usr/local/lib/python3.6/dist-packages (from python-slugify->kaggle->opendatasets) (1.3)
In [11]:
import os
import opendatasets as od
import numpy as np

import torch
import torchvision
from torch.utils.data import random_split
from torch.utils.data.dataloader import DataLoader
import torch.nn as nn
import torch.nn.functional as F
from torchvision.datasets.utils import download_url
from torchvision.datasets import ImageFolder
from torchvision.transforms import ToTensor
from torchvision.utils import make_grid

import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline

matplotlib.rcParams['figure.facecolor'] = '#ffffff'
In [12]:
project_name='zerotogans-project-intel-image-classification'

Preparing the Data

The Intel Nature Scenes Dataset (INS6) contains image data of natural scenes around the world. It is composed of images of size 150x150 distributed under 6 categories.

  1. 'buildings'
  2. 'forest'
  3. 'glacier'
  4. 'mountain'
  5. 'sea'
  6. 'street'

The train, test, and prediction data are stored in separate zip files. There are around 14k images in the train set, 3k in the test set, and 7k in the prediction set.

Downloading the dataset from Kaggle:

In [14]:
dataset_url = 'https://www.kaggle.com/puneet6060/intel-image-classification'
od.download(dataset_url)
Please provide your Kaggle credentials to download this dataset. Learn more: http://bit.ly/kaggle-creds
Your Kaggle username: rociocruzlinares
Your Kaggle Key: ··········
0%| | 0.00/346M [00:00<?, ?B/s]
Downloading intel-image-classification.zip to ./intel-image-classification
100%|██████████| 346M/346M [00:02<00:00, 136MB/s]
In [15]:
dataset_folder = './intel-image-classification'
train_folder = '/seg_train/seg_train'
test_folder = '/seg_test/seg_test'
pred_folder = '/seg_pred/seg_pred'
In [16]:
dataset = ImageFolder(dataset_folder+train_folder, transform=ToTensor())
test_ds = ImageFolder(dataset_folder+test_folder, transform=ToTensor())
classes = dataset.classes
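
ImageFolder treats every subdirectory of the root folder as one class and assigns integer labels in alphabetical order of the folder names. The standard-library sketch below mimics that mapping on an empty temporary directory. The helper name `find_classes_sketch` is hypothetical, and the folder layout is assumed to match the six categories listed above.

```python
import os
import tempfile

def find_classes_sketch(root):
    """Mimic how ImageFolder maps class subfolders to integer labels."""
    # Only directories count as classes; names are sorted alphabetically.
    names = sorted(e.name for e in os.scandir(root) if e.is_dir())
    return names, {name: idx for idx, name in enumerate(names)}

# Build an empty folder tree shaped like seg_train/seg_train.
with tempfile.TemporaryDirectory() as root:
    for name in ["street", "sea", "mountain", "glacier", "forest", "buildings"]:
        os.makedirs(os.path.join(root, name))
    demo_classes, demo_class_to_idx = find_classes_sketch(root)

print(demo_classes)
print(demo_class_to_idx["street"])
```

Under this ordering 'street' gets label 5, which is consistent with the label printed for the first training sample later in the notebook.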
In [17]:
def describe_dataset(data):
    classes_count = { c:0 for c in classes}
    for _,label in data:
        classes_count[classes[label]]+=1

    print(f'Classes: {len(classes)} - {classes} ')
    print(f'Examples: {len(data)}')
    print(f'Counts: {classes_count}')
    plt.bar(classes_count.keys(),classes_count.values(), alpha=0.3 )

    return classes_count

_ = describe_dataset(dataset)
Classes: 6 - ['buildings', 'forest', 'glacier', 'mountain', 'sea', 'street']
Examples: 14034
Counts: {'buildings': 2191, 'forest': 2271, 'glacier': 2404, 'mountain': 2512, 'sea': 2274, 'street': 2382}
Notebook Image

Cleaning the dataset

All images must have the same shape so that they can be stacked into batches and fed to the model. The first cleaning step therefore removes every sample whose tensor shape is not 3x150x150.

In [18]:
dataset = list(filter(lambda x: x[0].shape == torch.Size([3, 150, 150]), dataset))
test_ds = list(filter(lambda x: x[0].shape == torch.Size([3, 150, 150]), test_ds))

classes_count = describe_dataset(dataset)
Classes: 6 - ['buildings', 'forest', 'glacier', 'mountain', 'sea', 'street']
Examples: 13986
Counts: {'buildings': 2190, 'forest': 2263, 'glacier': 2387, 'mountain': 2495, 'sea': 2270, 'street': 2381}
Notebook Image

As the bar plot shows, not all classes have the same number of samples. In the second cleaning step, we keep the same number of samples for each class, equal to the size of the smallest class.

In [19]:
min_count = min(classes_count.values())

new_dataset = []
excedent_count = {k: v - min_count for k, v in classes_count.items()}
for item in dataset:
    _, label = item
    if excedent_count[classes[label]]>0:
        excedent_count[classes[label]] -=1
    else:
        new_dataset.append(item)
dataset = new_dataset

classes_count = describe_dataset(new_dataset)
Classes: 6 - ['buildings', 'forest', 'glacier', 'mountain', 'sea', 'street']
Examples: 13140
Counts: {'buildings': 2190, 'forest': 2190, 'glacier': 2190, 'mountain': 2190, 'sea': 2190, 'street': 2190}
Notebook Image
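
The balancing step can be traced on a toy list of (sample, label) pairs. The sketch below follows the same idea as the cell above, dropping the first surplus items of each class until every class matches the smallest one. The function name `undersample` and the toy data are illustrative assumptions, not part of the notebook.

```python
from collections import Counter

def undersample(samples):
    """Keep the same number of samples per label by dropping the surplus."""
    counts = Counter(label for _, label in samples)
    target = min(counts.values())
    # Number of items still to drop for each label.
    surplus = {label: n - target for label, n in counts.items()}
    kept = []
    for item in samples:
        _, label = item
        if surplus[label] > 0:
            surplus[label] -= 1  # drop the earliest surplus items
        else:
            kept.append(item)
    return kept

toy = [("a0", 0), ("a1", 0), ("a2", 0), ("b0", 1), ("b1", 1), ("c0", 2)]
balanced = undersample(toy)
balanced_counts = Counter(label for _, label in balanced)
print(balanced)
print(balanced_counts)
```

Because the loop discards items in iteration order, the result depends on how the dataset is ordered. Dropping a random subset instead would be a reasonable alternative if ordering bias is a concern.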

Finally, let's separate a portion of the dataset for validation.

In [20]:
random_seed = 42
torch.manual_seed(random_seed);

val_size = 3000
train_size = len(dataset) - val_size

train_ds, val_ds = random_split(dataset, [train_size, val_size])

print("\nTRAIN DATASET")
_=describe_dataset(train_ds)

print("\nVALIDATION DATASET")
_=describe_dataset(val_ds)

print("\nTEST DATASET")
_=describe_dataset(test_ds)

dataset = None
new_dataset = None
TRAIN DATASET
Classes: 6 - ['buildings', 'forest', 'glacier', 'mountain', 'sea', 'street']
Examples: 10140
Counts: {'buildings': 1708, 'forest': 1685, 'glacier': 1671, 'mountain': 1688, 'sea': 1713, 'street': 1675}

VALIDATION DATASET
Classes: 6 - ['buildings', 'forest', 'glacier', 'mountain', 'sea', 'street']
Examples: 3000
Counts: {'buildings': 482, 'forest': 505, 'glacier': 519, 'mountain': 502, 'sea': 477, 'street': 515}

TEST DATASET
Classes: 6 - ['buildings', 'forest', 'glacier', 'mountain', 'sea', 'street']
Examples: 2993
Counts: {'buildings': 437, 'forest': 473, 'glacier': 549, 'mountain': 523, 'sea': 510, 'street': 501}
Notebook Image
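
The split above leaves 13140 - 3000 = 10140 samples for training. As a minimal sketch of how `random_split` behaves, the example below splits a toy dataset of ten integers. The toy data and the use of a dedicated seeded `torch.Generator` (instead of the global `torch.manual_seed` used in the notebook) are assumptions made for illustration.

```python
import torch
from torch.utils.data import random_split

# A stand-in dataset of 10 integer "samples"; any object with __len__
# and __getitem__ works.
toy_data = list(range(10))

# Passing a seeded generator makes the split repeatable without
# touching the global random state.
gen = torch.Generator().manual_seed(42)
toy_train, toy_val = random_split(toy_data, [7, 3], generator=gen)

print(len(toy_train), len(toy_val))
# The two subsets never share an index.
print(set(toy_train.indices) & set(toy_val.indices))
```

Since the split is random rather than stratified, per-class counts in each subset only come out approximately equal, as the counts printed above show.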
In [21]:
import jovian  # required for log_dataset; not imported in the cells above

jovian.log_dataset(dataset_url=dataset_url, val_size=val_size, random_seed=random_seed)
[jovian] Dataset logged.

Exploring the dataset

Let's look at a few examples.

In [22]:
print(train_ds[0][0].shape)
train_ds[0]
torch.Size([3, 150, 150])
Out[22]:
(tensor([[[0.9843, 0.9843, 0.9804,  ..., 0.0588, 0.0510, 0.0471],
          [0.9882, 0.9843, 0.9843,  ..., 0.0667, 0.0549, 0.0471],
          [0.9882, 0.9882, 0.9882,  ..., 0.0706, 0.0588, 0.0510],
          ...,
          [0.4078, 0.4471, 0.4157,  ..., 0.1098, 0.1020, 0.0902],
          [0.3843, 0.4549, 0.4235,  ..., 0.0980, 0.0941, 0.0824],
          [0.4157, 0.4353, 0.3961,  ..., 0.0902, 0.0863, 0.0745]],
 
         [[0.9961, 0.9961, 0.9922,  ..., 0.0667, 0.0549, 0.0510],
          [1.0000, 0.9961, 0.9961,  ..., 0.0745, 0.0588, 0.0510],
          [1.0000, 1.0000, 0.9922,  ..., 0.0784, 0.0627, 0.0549],
          ...,
          [0.4039, 0.4471, 0.4353,  ..., 0.1333, 0.1255, 0.1255],
          [0.3804, 0.4549, 0.4431,  ..., 0.1216, 0.1176, 0.1176],
          [0.4118, 0.4353, 0.4157,  ..., 0.1137, 0.1098, 0.1098]],
 
         [[0.9686, 0.9686, 0.9647,  ..., 0.0235, 0.0235, 0.0196],
          [0.9725, 0.9686, 0.9686,  ..., 0.0314, 0.0275, 0.0196],
          [0.9725, 0.9725, 0.9686,  ..., 0.0353, 0.0314, 0.0235],
          ...,
          [0.3882, 0.4392, 0.4196,  ..., 0.1333, 0.1255, 0.1216],
          [0.3647, 0.4471, 0.4275,  ..., 0.1216, 0.1176, 0.1137],
          [0.3961, 0.4275, 0.4000,  ..., 0.1137, 0.1098, 0.1059]]]), 5)
In [23]:
def show_example(img, label):
    print('Label: ', classes[label], "("+str(label)+")")
    plt.imshow(img.permute(1, 2, 0))
In [24]:
show_example(*train_ds[0])
Label: street (5)
Notebook Image
In [25]:
show_example(*train_ds[10000])
Label: sea (4)
Notebook Image
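
The `permute(1, 2, 0)` call in `show_example` is needed because `ToTensor` produces tensors in channels-first order (channels, height, width), while matplotlib's `imshow` expects the channels last. A small sketch on a random tensor, used here as a stand-in for a real image:

```python
import torch

# A fake RGB image in PyTorch's channels-first layout: (C, H, W).
chw = torch.rand(3, 150, 150)

# Reorder the axes to (H, W, C), the layout matplotlib's imshow expects.
hwc = chw.permute(1, 2, 0)

print(chw.shape, hwc.shape)
# permute only reorders the axes; pixel values are untouched.
print(torch.equal(hwc[10, 20], chw[:, 10, 20]))
```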

Next, let's create data loaders for retrieving images in batches. We'll use a batch size of 256 to utilize a larger portion of the GPU RAM.

In [26]:
batch_size=256
In [27]:
train_dl = DataLoader(train_ds, batch_size, shuffle=True, num_workers=3, pin_memory=True)
val_dl = DataLoader(val_ds, batch_size*2, num_workers=3, pin_memory=True)
test_dl = DataLoader(test_ds, batch_size*2, num_workers=3, pin_memory=True)
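
To get a feel for what these settings mean per epoch, the arithmetic below estimates how many batches each loader yields. It assumes the dataset sizes printed earlier and the `DataLoader` default of keeping the last partial batch; the helper name `num_batches` is hypothetical.

```python
import math

def num_batches(n_samples, batch_size):
    """Batches a DataLoader yields when the last partial batch is kept."""
    return math.ceil(n_samples / batch_size)

train_batches = num_batches(10140, 256)      # training set size from above
val_batches = num_batches(3000, 256 * 2)     # validation uses a doubled batch size
last_train_batch = 10140 - (train_batches - 1) * 256

print(train_batches, val_batches, last_train_batch)
```

The training loader therefore yields 40 batches per epoch, the last one holding only 156 images. The doubled batch size for validation and test is presumably affordable because evaluation does not need to store gradients.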

Let's take a look at some sample images from the training dataloader.

In [28]:
def show_batch(dl):
    for images, labels in dl:
        fig, ax = plt.subplots(figsize=(24, 12))
        ax.set_xticks([]); ax.set_yticks([])
        ax.imshow(make_grid(images, nrow=16).permute(1, 2, 0))
        break

show_batch(train_dl)