This project applies the knowledge acquired in the course Deep Learning with PyTorch: Zero to GANs, offered by Jovian.ai.
The project uses the open Intel Image Classification Dataset, which contains images of natural scenes separated into 6 categories. The main goal is to define, train and test a neural network model that classifies these images.
Let's begin by installing and importing the required libraries.
# Uncomment and run the appropriate command for your operating system, if required
# Linux / Binder / Windows (No GPU)
# !pip install numpy matplotlib torch==1.7.0+cpu torchvision==0.8.1+cpu torchaudio==0.7.0 -f https://download.pytorch.org/whl/torch_stable.html
# Linux / Windows (GPU)
# !pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
# MacOS (NO GPU)
# !pip install numpy matplotlib torch torchvision torchaudio
!pip install opendatasets --upgrade
import os
import opendatasets as od
import numpy as np
import torch
import torchvision
from torch.utils.data import random_split
from torch.utils.data.dataloader import DataLoader
import torch.nn as nn
import torch.nn.functional as F
from torchvision.datasets.utils import download_url
from torchvision.datasets import ImageFolder
from torchvision.transforms import ToTensor
from torchvision.utils import make_grid
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline
matplotlib.rcParams['figure.facecolor'] = '#ffffff'
project_name='zerotogans-project-intel-image-classification'
The Intel Nature Scenes Dataset (INS6) contains image data of natural scenes around the world. The images are 150x150 pixels (with a small number of exceptions, which are filtered out below) and are distributed across 6 categories:
'buildings'
'forest'
'glacier'
'mountain'
'sea'
'street'
The Train, Test and Prediction data are provided separately. There are around 14k images in Train, 3k in Test and 7k in Prediction.
dataset_url = 'https://www.kaggle.com/puneet6060/intel-image-classification'
od.download(dataset_url)
Please provide your Kaggle credentials to download this dataset. Learn more: http://bit.ly/kaggle-creds
Your Kaggle username: rociocruzlinares
Your Kaggle Key: ··········
0%| | 0.00/346M [00:00<?, ?B/s]
Downloading intel-image-classification.zip to ./intel-image-classification
100%|██████████| 346M/346M [00:02<00:00, 136MB/s]
dataset_folder = './intel-image-classification'
train_folder = '/seg_train/seg_train'
test_folder = '/seg_test/seg_test'
pred_folder = '/seg_pred/seg_pred'
dataset = ImageFolder(dataset_folder+train_folder, transform=ToTensor())
test_ds = ImageFolder(dataset_folder+test_folder, transform=ToTensor())
classes = dataset.classes
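def describe_dataset(data):
    """Print the class names, sample count and per-class counts, and plot the counts."""
    classes_count = {c: 0 for c in classes}
    # Iterating over the dataset loads every image, so this can take a while.
    for _, label in data:
        classes_count[classes[label]] += 1
    print(f'Classes: {len(classes)} - {classes} ')
    print(f'Examples: {len(data)}')
    print(f'Counts: {classes_count}')
    plt.bar(list(classes_count.keys()), list(classes_count.values()), alpha=0.3)
    return classes_count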
_ = describe_dataset(dataset)
Classes: 6 - ['buildings', 'forest', 'glacier', 'mountain', 'sea', 'street']
Examples: 14034
Counts: {'buildings': 2191, 'forest': 2271, 'glacier': 2404, 'mountain': 2512, 'sea': 2274, 'street': 2382}
All images must have the same shape to train the model correctly. This will be the first cleaning step, removing all those samples that don't have the right shape.
dataset = list(filter(lambda x: x[0].shape == torch.Size([3, 150, 150]), dataset))
test_ds = list(filter(lambda x: x[0].shape == torch.Size([3, 150, 150]), test_ds))
classes_count = describe_dataset(dataset)
Classes: 6 - ['buildings', 'forest', 'glacier', 'mountain', 'sea', 'street']
Examples: 13986
Counts: {'buildings': 2190, 'forest': 2263, 'glacier': 2387, 'mountain': 2495, 'sea': 2270, 'street': 2381}
As the bar plot shows, not all classes have the same number of samples. So, in the second cleaning step, we will keep the same number of samples for each class: the size of the smallest class.
min_count = min(classes_count.values())
new_dataset = []
# Number of surplus samples in each class relative to the smallest class.
excedent_count = {k: v - min_count for k, v in classes_count.items()}
for item in dataset:
    _, label = item
    if excedent_count[classes[label]] > 0:
        # Skip this sample until the surplus for its class is used up.
        excedent_count[classes[label]] -= 1
    else:
        new_dataset.append(item)
dataset = new_dataset
classes_count = describe_dataset(new_dataset)
Classes: 6 - ['buildings', 'forest', 'glacier', 'mountain', 'sea', 'street']
Examples: 13140
Counts: {'buildings': 2190, 'forest': 2190, 'glacier': 2190, 'mountain': 2190, 'sea': 2190, 'street': 2190}
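The balancing loop above skips the first surplus samples of each class in iteration order and keeps the rest. As a possible alternative, the samples to keep could be drawn at random. The sketch below shows the idea in plain Python on toy (sample, label) pairs; `balance_by_undersampling` is a hypothetical helper written for illustration and is not part of this notebook or of PyTorch.

```python
import random
from collections import Counter, defaultdict


def balance_by_undersampling(items, seed=42):
    """Return a class-balanced subset of (sample, label) pairs.

    Hypothetical helper for illustration: every class is randomly
    reduced to the size of the smallest class.
    """
    rng = random.Random(seed)

    # Group the pairs by their label.
    by_label = defaultdict(list)
    for item in items:
        by_label[item[1]].append(item)

    min_count = min(len(group) for group in by_label.values())

    # Draw min_count pairs from each class without replacement.
    balanced = []
    for label in sorted(by_label):
        balanced.extend(rng.sample(by_label[label], min_count))
    rng.shuffle(balanced)
    return balanced


# Toy data standing in for (image, label) pairs: class 0 is over-represented.
toy = [(f"img{i}", i % 3) for i in range(10)] + [("extra", 0)] * 5
balanced = balance_by_undersampling(toy)
counts = Counter(label for _, label in balanced)
print(counts)
```

With the toy data, each of the three classes ends up with three samples, the size of the smallest class.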
Finally, let's separate a portion of the dataset for validation.
random_seed = 42
torch.manual_seed(random_seed);
val_size = 3000
train_size = len(dataset) - val_size
train_ds, val_ds = random_split(dataset, [train_size, val_size])
print("\nTRAIN DATASET")
_=describe_dataset(train_ds)
print("\nVALIDATION DATASET")
_=describe_dataset(val_ds)
print("\nTEST DATASET")
_=describe_dataset(test_ds)
dataset = None
new_dataset = None
TRAIN DATASET
Classes: 6 - ['buildings', 'forest', 'glacier', 'mountain', 'sea', 'street']
Examples: 10140
Counts: {'buildings': 1708, 'forest': 1685, 'glacier': 1671, 'mountain': 1688, 'sea': 1713, 'street': 1675}
VALIDATION DATASET
Classes: 6 - ['buildings', 'forest', 'glacier', 'mountain', 'sea', 'street']
Examples: 3000
Counts: {'buildings': 482, 'forest': 505, 'glacier': 519, 'mountain': 502, 'sea': 477, 'street': 515}
TEST DATASET
Classes: 6 - ['buildings', 'forest', 'glacier', 'mountain', 'sea', 'street']
Examples: 2993
Counts: {'buildings': 437, 'forest': 473, 'glacier': 549, 'mountain': 523, 'sea': 510, 'street': 501}
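The counts above show that a purely random split leaves the classes slightly uneven in both the training and validation sets. If exactly balanced splits were wanted, one option is a stratified split that takes the same number of validation samples from every class. The following is a minimal sketch in plain Python that works on a list of labels; `stratified_split` is a hypothetical helper, not something the notebook defines.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, val_fraction, seed=42):
    """Split sample indices so each class gives the same fraction to validation.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)

    # Collect the indices that belong to each class.
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    train_idx, val_idx = [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        n_val = round(len(indices) * val_fraction)
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return train_idx, val_idx


# Toy labels: 6 classes with 2190 samples each, like the balanced dataset.
labels = [c for c in range(6) for _ in range(2190)]
train_idx, val_idx = stratified_split(labels, val_fraction=500 / 2190)
val_counts = Counter(labels[i] for i in val_idx)
print(len(train_idx), len(val_idx), val_counts)
```

The resulting index lists could then be wrapped with `torch.utils.data.Subset` to obtain the training and validation datasets.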
import jovian  # assumes the jovian package is installed (pip install jovian)
jovian.log_dataset(dataset_url=dataset_url, val_size=val_size, random_seed=random_seed)
[jovian] Dataset logged.
Let's look at some examples.
print(train_ds[0][0].shape)
train_ds[0]
torch.Size([3, 150, 150])
(tensor([[[0.9843, 0.9843, 0.9804, ..., 0.0588, 0.0510, 0.0471],
[0.9882, 0.9843, 0.9843, ..., 0.0667, 0.0549, 0.0471],
[0.9882, 0.9882, 0.9882, ..., 0.0706, 0.0588, 0.0510],
...,
[0.4078, 0.4471, 0.4157, ..., 0.1098, 0.1020, 0.0902],
[0.3843, 0.4549, 0.4235, ..., 0.0980, 0.0941, 0.0824],
[0.4157, 0.4353, 0.3961, ..., 0.0902, 0.0863, 0.0745]],
[[0.9961, 0.9961, 0.9922, ..., 0.0667, 0.0549, 0.0510],
[1.0000, 0.9961, 0.9961, ..., 0.0745, 0.0588, 0.0510],
[1.0000, 1.0000, 0.9922, ..., 0.0784, 0.0627, 0.0549],
...,
[0.4039, 0.4471, 0.4353, ..., 0.1333, 0.1255, 0.1255],
[0.3804, 0.4549, 0.4431, ..., 0.1216, 0.1176, 0.1176],
[0.4118, 0.4353, 0.4157, ..., 0.1137, 0.1098, 0.1098]],
[[0.9686, 0.9686, 0.9647, ..., 0.0235, 0.0235, 0.0196],
[0.9725, 0.9686, 0.9686, ..., 0.0314, 0.0275, 0.0196],
[0.9725, 0.9725, 0.9686, ..., 0.0353, 0.0314, 0.0235],
...,
[0.3882, 0.4392, 0.4196, ..., 0.1333, 0.1255, 0.1216],
[0.3647, 0.4471, 0.4275, ..., 0.1216, 0.1176, 0.1137],
[0.3961, 0.4275, 0.4000, ..., 0.1137, 0.1098, 0.1059]]]), 5)
def show_example(img, label):
    print('Label: ', classes[label], "("+str(label)+")")
    plt.imshow(img.permute(1, 2, 0))
show_example(*train_ds[0])
Label: street (5)
show_example(*train_ds[10000])
Label: sea (4)
Next, let's create data loaders for retrieving images in batches. We'll use a batch size of 256 to utilize a larger portion of the GPU RAM.
batch_size=256
train_dl = DataLoader(train_ds, batch_size, shuffle=True, num_workers=3, pin_memory=True)
val_dl = DataLoader(val_ds, batch_size*2, num_workers=3, pin_memory=True)
test_dl = DataLoader(test_ds, batch_size*2, num_workers=3, pin_memory=True)
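To get a feel for what these settings mean, the number of batches per epoch can be worked out from the dataset sizes printed earlier. The sketch below assumes the loaders keep the last, smaller batch (the `DataLoader` default, `drop_last=False`); `batch_summary` is a hypothetical helper used only for this calculation.

```python
import math


def batch_summary(num_samples, batch_size):
    """Return (number of batches, size of the last batch) when no batch is dropped."""
    num_batches = math.ceil(num_samples / batch_size)
    last_batch = num_samples - (num_batches - 1) * batch_size
    return num_batches, last_batch


batch_size = 256
print(batch_summary(10140, batch_size))      # train      -> (40, 156)
print(batch_summary(3000, batch_size * 2))   # validation -> (6, 440)
print(batch_summary(2993, batch_size * 2))   # test       -> (6, 433)
```

So one training epoch iterates over 40 batches, and the validation and test sets each take 6 batches.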
Let's take a look at some sample images from the training dataloader.
def show_batch(dl):
    for images, labels in dl:
        fig, ax = plt.subplots(figsize=(24, 12))
        ax.set_xticks([]); ax.set_yticks([])
        ax.imshow(make_grid(images, nrow=16).permute(1, 2, 0))
        break
show_batch(train_dl)