
Architecture Style Classification Project

This kernel is about the different architecture styles found across the globe. The dataset is available on Kaggle: https://www.kaggle.com/wwymak/architecture-dataset.

!pip install jovian --upgrade --quiet
import jovian
jovian.commit(project='architecture-style-classification-project')
!pip install git+https://github.com/ufoym/imbalanced-dataset-sampler.git # GITHUB REPO FOR IMBALANCED DATASET SAMPLER 
import os
import torch
import torchvision
import tarfile
import torch.nn as nn
import numpy as np
import torch.nn.functional as F
import torchvision.models as models
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader
import torchvision.transforms as tt
from torch.utils.data import Dataset, random_split, Subset
from torchsampler import ImbalancedDatasetSampler
from torchvision.utils import make_grid
import matplotlib.pyplot as plt
%matplotlib inline

I downloaded the data from the link above and performed some pre-processing offline.
I split the images into two separate folders, train and test: images were chosen at random for each set, the files were renamed, and the folders were zipped into a .zip and uploaded as a Kaggle dataset (.zip files don't need to be unzipped manually; the contents are extracted into the input folder automatically).
You can use my processed dataset at https://www.kaggle.com/aarshibhatt112/archiset, or, if you want to do your own preprocessing on the original dataset, you can use the code below.

NOTE: The input directory on Kaggle is read-only, so you can't make changes there; if you create a directory at the same level as input, you can still view its structure in the tab on the right side. That's why I processed the dataset on my PC.
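
If you do want to reorganise files on Kaggle itself, the writable area is /kaggle/working (only the input directory is read-only). The sketch below is just an illustration of that; the dataset slug is a placeholder, not the path used in this notebook.

# TO COPY THE DATASET INTO A WRITABLE FOLDER ON KAGGLE (OPTIONAL, PATHS ARE PLACEHOLDERS)

# import shutil
# shutil.copytree('/kaggle/input/archiset', '/kaggle/working/archiset')
# # anything under /kaggle/working can be moved, renamed or deleted freely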

# TO SPLIT THE DATA INTO TRAIN AND TEST SET

# import shutil
# import random
# source = "../Downloads/Dataset/train"
# dest = "../Downloads/Dataset/test"
# files = os.listdir(source)
# for file in files:
#     os.makedirs(dest + '/' + file, exist_ok=True)   # make sure the class folder exists in the test set
#     for imgs in os.listdir(source + '/' + file):
#         if random.random() < 0.17:                  # move roughly 17% of each class to the test set
#             shutil.move(source + '/' + file + '/' + imgs, dest + '/' + file)
# print('done')
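
After moving the files it is worth checking how many images each class ended up with in train and test. This is just a quick sanity check using the same local source and dest paths as above:

# TO COUNT IMAGES PER CLASS AFTER THE SPLIT (SANITY CHECK, SAME LOCAL PATHS AS ABOVE)

# for folder in [source, dest]:
#     print(folder)
#     for file in sorted(os.listdir(folder)):
#         print('   ', file, len(os.listdir(folder + '/' + file)))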

# TO RENAME THE FILES 

# for file in files:
#     for count, imgs in enumerate(os.listdir(source + '/' + file)):
#         dst = "img_" + str(count) + ".jpg"
#         src = source + '/' + file + '/' + imgs   # CURRENTLY POINTS TO THE TRAIN SET
#         dst = source + '/' + file + '/' + dst
#         os.rename(src, dst)
# print('done')
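
A quick peek at one class folder confirms the renaming worked:

# print(os.listdir(source + '/' + files[0])[:5])   # should show names like 'img_0.jpg', 'img_1.jpg', ...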

# TO GET THE DIRECTORY STRUCTURE AND DATASET PATHS ON KAGGLE

# for dirname, _, filenames in os.walk('/kaggle/input'):
#     for filename in filenames:
#         print(os.path.join(dirname, filename))
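
For reference, this is roughly how the ImageFolder and ImbalancedDatasetSampler imported above plug into a DataLoader (following the sampler's README). It is left commented out here because the dataset path, image size and batch size are placeholders and depend on how the dataset is attached.

# HOW THE IMBALANCED DATASET SAMPLER FITS INTO A DATALOADER (SKETCH, PATHS AND SIZES ARE PLACEHOLDERS)

# train_tfms = tt.Compose([tt.Resize((224, 224)), tt.ToTensor()])
# train_ds = ImageFolder('../input/archiset/train', train_tfms)
# # the sampler draws classes with roughly equal probability, so rare styles are over-sampled
# train_dl = DataLoader(train_ds,
#                       sampler=ImbalancedDatasetSampler(train_ds),
#                       batch_size=64, num_workers=2, pin_memory=True)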