
In [55]:
!pip install jovian --upgrade --quiet

Weather in Szeged 2006-2016

This is a dataset for a larger project I have been working on. My idea is to analyze and compare real historical weather with weather folklore.

Content: The CSV file includes an hourly/daily summary for the Szeged, Hungary area between 2006 and 2016.

Data available in the hourly response:

  1. time
  2. summary
  3. precipType
  4. temperature
  5. apparentTemperature
  6. humidity
  7. windSpeed
  8. windBearing
  9. visibility
  10. loudCover
  11. pressure
In [56]:
import torch
import jovian
import torchvision
import torch.nn as nn
import pandas as pd
import matplotlib.pyplot as plt
import torch.nn.functional as F
from torchvision.datasets.utils import download_url
from torch.utils.data import DataLoader, TensorDataset, random_split
In [57]:
project_name='course-project-regression-pytorch' # will be used by jovian.commit

Step 1: Load the dataset

In [58]:
dataframe = pd.read_csv("weatherHistory.csv")
In [59]:
dataframe.head()
Out[59]:

Q: Are there any missing values?

In [60]:
dataframe.isnull().sum()
Out[60]:
Formatted Date              0
Summary                     0
Precip Type                 0
Temperature (C)             0
Apparent Temperature (C)    0
Humidity                    0
Wind Speed (km/h)           0
Wind Bearing (degrees)      0
Visibility (km)             0
Loud Cover                  0
Pressure (millibars)        0
Daily Summary               1
dtype: int64
In [61]:
# Fill the missing values with each column's mode
dataframe["Precip Type"].fillna(dataframe["Precip Type"].mode()[0], inplace=True)
dataframe["Daily Summary"].fillna(dataframe["Daily Summary"].mode()[0], inplace=True)
In [62]:
dataframe.isnull().sum()
Out[62]:
Formatted Date              0
Summary                     0
Precip Type                 0
Temperature (C)             0
Apparent Temperature (C)    0
Humidity                    0
Wind Speed (km/h)           0
Wind Bearing (degrees)      0
Visibility (km)             0
Loud Cover                  0
Pressure (millibars)        0
Daily Summary               0
dtype: int64

Q: How many columns does the dataset have?

In [63]:
num_cols = dataframe.shape[1]
print(num_cols)
12

Q: What are the column titles of the input variables?

In [64]:
input_cols = ["Temperature (C)","Humidity","Wind Speed (km/h)","Wind Bearing (degrees)","Visibility (km)","Pressure (millibars)"]
input_cols
Out[64]:
['Temperature (C)',
 'Humidity',
 'Wind Speed (km/h)',
 'Wind Bearing (degrees)',
 'Visibility (km)',
 'Pressure (millibars)']

Q: Which of the input columns are non-numeric or categorical variables?

In [65]:
categorical_cols = ["Summary", "Precip Type", "Daily Summary"]
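A quick programmatic cross-check: pandas can list the non-numeric columns directly. A minimal sketch (not part of the original notebook; note that Formatted Date also shows up because dates are stored as strings here):

# Columns whose dtype is not numeric -- the categorical candidates
non_numeric_cols = dataframe.select_dtypes(exclude='number').columns.tolist()
print(non_numeric_cols)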

Q: What are the column titles of output/target variable(s)?

In [66]:
output_cols = ["Apparent Temperature (C)"]
In [67]:
!pip install jovian --upgrade -q
In [68]:
import jovian
In [69]:
jovian.commit(project=project_name)
[jovian] Detected Colab notebook... [jovian] Uploading colab notebook to Jovian... [jovian] Capturing environment.. [jovian] Committed successfully! https://jovian.ai/dwiknrd/course-project-regression-pytorch

Step 2: Prepare the dataset for training

We need to convert the data from the Pandas dataframe into PyTorch tensors for training. To do this, the first step is to convert it to NumPy arrays. If you've filled out input_cols, categorical_cols and output_cols correctly, the following function will perform the conversion to NumPy arrays.

In [70]:
def dataframe_to_arrays(dataframe):
    # Make a copy of the original dataframe
    dataframe1 = dataframe.copy(deep=True)
    # Convert non-numeric categorical columns to numbers
    for col in categorical_cols:
        dataframe1[col] = dataframe1[col].astype('category').cat.codes
    # Extract inputs & outputs as numpy arrays
    inputs_array = dataframe1[input_cols].to_numpy()
    targets_array = dataframe1[output_cols].to_numpy()
    return inputs_array, targets_array
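For reference, .cat.codes simply maps each distinct category to an integer (in sorted category order). A tiny standalone illustration with made-up values, not taken from the dataset:

# 'rain' -> 0, 'snow' -> 1 because categories are ordered alphabetically
codes = pd.Series(["rain", "snow", "rain", "snow"]).astype("category").cat.codes
print(codes.tolist())  # [0, 1, 0, 1]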
In [71]:
inputs_array, targets_array = dataframe_to_arrays(dataframe)
inputs_array, targets_array
Out[71]:
(array([[9.47222222e+00, 8.90000000e-01, 1.41197000e+01, 2.51000000e+02,
         1.58263000e+01, 1.01513000e+03],
        [9.35555556e+00, 8.60000000e-01, 1.42646000e+01, 2.59000000e+02,
         1.58263000e+01, 1.01563000e+03],
        [9.37777778e+00, 8.90000000e-01, 3.92840000e+00, 2.04000000e+02,
         1.49569000e+01, 1.01594000e+03],
        ...,
        [2.96666667e+00, 9.90000000e-01, 7.51870000e+00, 1.77000000e+02,
         4.74950000e+00, 1.01908000e+03],
        [5.92777778e+00, 8.70000000e-01, 1.13344000e+01, 2.43000000e+02,
         9.62780000e+00, 1.01953000e+03],
        [8.77222222e+00, 7.70000000e-01, 1.43612000e+01, 2.51000000e+02,
         9.98200000e+00, 1.00000000e+01]]), array([[7.38888889],
        [7.22777778],
        [9.37777778],
        ...,
        [0.90555556],
        [3.52222222],
        [6.5       ]]))
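Note that the input columns sit on very different scales (pressure around 1e3, humidity below 1), which is one reason the very small learning rates used later are needed. An optional sketch of standardizing the inputs before converting them to tensors; this is my own suggestion, not something the notebook does:

# Standardize each input column to zero mean and unit variance (optional)
means = inputs_array.mean(axis=0)
stds = inputs_array.std(axis=0)
inputs_array_scaled = (inputs_array - means) / stds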

Q: Convert the numpy arrays inputs_array and targets_array into PyTorch tensors. Make sure that the data type is torch.float32.

In [72]:
inputs = torch.tensor(inputs_array, dtype=torch.float32)
targets = torch.tensor(targets_array, dtype=torch.float32)
In [73]:
inputs.dtype, targets.dtype
Out[73]:
(torch.float32, torch.float32)

Next, we need to create PyTorch datasets & data loaders for training & validation. We'll start by creating a TensorDataset.

In [74]:
dataset = TensorDataset(inputs, targets)
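A TensorDataset indexes both tensors in parallel, so dataset[i] returns an (input, target) pair. A quick sanity check:

x0, y0 = dataset[0]
print(x0.shape, y0.shape)  # torch.Size([6]) torch.Size([1])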

Q: Pick a number between 0.1 and 0.2 to determine the fraction of data that will be used for creating the validation set. Then use random_split to create training & validation datasets.

In [75]:
num_rows = dataframe.shape[0]
print(num_rows)

num_cols = len(input_cols)
print(num_cols)
12300 6
In [76]:
val_percent = 0.15 # between 0.1 and 0.2
val_size = int(num_rows * val_percent)
train_size = num_rows - val_size


train_ds, val_ds = random_split(dataset, [train_size, val_size]) # Use the random_split function to split dataset into 2 parts of the desired length
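If you want the same split every time the notebook runs, random_split also accepts a seeded generator; a minimal optional sketch:

# Reproducible split using a fixed seed (optional)
generator = torch.Generator().manual_seed(42)
train_ds, val_ds = random_split(dataset, [train_size, val_size], generator=generator)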
In [77]:
batch_size = 128
In [78]:
train_loader = DataLoader(train_ds, batch_size, shuffle=True)
val_loader = DataLoader(val_ds, batch_size)
In [79]:
for xb, yb in train_loader:
    print("inputs:", xb)
    print("targets:", yb)
    break
inputs: tensor([[ 1.7861e+01,  6.3000e-01,  1.2606e+01,  1.1200e+02,  1.4957e+01,  1.0149e+03],
        [ 2.0344e+01,  5.6000e-01,  1.0803e+01,  1.0500e+02,  1.5826e+01,  1.0155e+03],
        ...,
        [ 1.2761e+01,  7.8000e-01,  2.1091e+01,  3.4800e+02,  1.1270e+01,  1.0222e+03]])
targets: tensor([[ 17.8611],
        [ 20.3444],
        ...,
        [ 12.7611]])
(output truncated: one batch of 128 rows)
In [80]:
jovian.commit(project=project_name, environment=None)
[jovian] Detected Colab notebook... [jovian] Uploading colab notebook to Jovian... [jovian] Committed successfully! https://jovian.ai/dwiknrd/course-project-regression-pytorch

Step 3: Create a Linear Regression Model

In [91]:
input_size = len(input_cols)
output_size = len(output_cols)
input_size, output_size
Out[91]:
(6, 1)
In [92]:
class InsuranceModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(input_size, output_size)
        
    def forward(self, xb):
        out = self.linear(xb)                        
        return out
    
    def training_step(self, batch):
        inputs, targets = batch 
        # Generate predictions
        out = self(inputs)          
        # Calculate loss
        loss = F.l1_loss(out, targets)
        return loss
    
    def validation_step(self, batch):
        inputs, targets = batch
        # Generate predictions
        out = self(inputs)
        # Calculate loss
        loss = F.l1_loss(out, targets)   
        return {'val_loss': loss.detach()}
        
    def validation_epoch_end(self, outputs):
        batch_losses = [x['val_loss'] for x in outputs]
        epoch_loss = torch.stack(batch_losses).mean()   # Combine losses
        return {'val_loss': epoch_loss.item()}
    
    def epoch_end(self, epoch, result, num_epochs):
        # Print the result every 20th epoch and at the final epoch
        if (epoch+1) % 20 == 0 or epoch == num_epochs-1:
            print("Epoch [{}], val_loss: {:.4f}".format(epoch+1, result['val_loss']))
In [93]:
model = InsuranceModel()
In [94]:
list(model.parameters())
Out[94]:
[Parameter containing:
 tensor([[ 0.3367,  0.1088, -0.3762,  0.0370, -0.0600,  0.1232]],
        requires_grad=True), Parameter containing:
 tensor([0.0329], requires_grad=True)]
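Since the model is a single nn.Linear layer, its predictions are just inputs @ W.T + b using the weight and bias printed above. A quick check against the first few rows (a sketch, assuming the model and inputs defined earlier):

with torch.no_grad():
    w, b = model.linear.weight, model.linear.bias
    manual = inputs[:3] @ w.t() + b                   # the affine transform by hand
    print(torch.allclose(manual, model(inputs[:3])))  # True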
In [95]:
jovian.commit(project=project_name, environment=None)
[jovian] Detected Colab notebook... [jovian] Uploading colab notebook to Jovian... [jovian] Committed successfully! https://jovian.ai/dwiknrd/course-project-regression-pytorch

Step 4: Train the model to fit the data

In [96]:
def evaluate(model, val_loader):
    outputs = [model.validation_step(batch) for batch in val_loader]
    return model.validation_epoch_end(outputs)

def fit(epochs, lr, model, train_loader, val_loader, opt_func=torch.optim.SGD):
    history = []
    optimizer = opt_func(model.parameters(), lr)
    for epoch in range(epochs):
        # Training Phase 
        for batch in train_loader:
            loss = model.training_step(batch)
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
        # Validation phase
        result = evaluate(model, val_loader)
        model.epoch_end(epoch, result,epochs)
        history.append(result)
    return history
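The opt_func argument makes it easy to swap optimizers without touching fit. For example, a hypothetical run with Adam instead of SGD (a usage sketch only; the learning rate and results would differ from the SGD runs below):

history_adam = fit(20, 1e-3, model, train_loader, val_loader, opt_func=torch.optim.Adam)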

Q: Use the evaluate function to calculate the loss on the validation set before training.

In [97]:
result = evaluate(model, val_loader) # Use the evaluate function
print(result)
{'val_loss': 120.1018295288086}

Q: Train the model 4-5 times with different learning rates & for different numbers of epochs.

In [98]:
epochs = 50
lr = 1e-6
history1 = fit(epochs, lr, model, train_loader, val_loader)
Epoch [20], val_loss: 6.7760 Epoch [40], val_loss: 6.6536 Epoch [50], val_loss: 6.5956
In [99]:
epochs = 50
lr = 1e-6
history2 = fit(epochs, lr, model, train_loader, val_loader)
Epoch [20], val_loss: 6.4808 Epoch [40], val_loss: 6.3662 Epoch [50], val_loss: 6.3076
In [100]:
epochs = 50
lr = 1e-6
history3 = fit(epochs, lr, model, train_loader, val_loader)
Epoch [20], val_loss: 6.1958 Epoch [40], val_loss: 6.0905 Epoch [50], val_loss: 6.0233
In [101]:
epochs = 50
lr = 1e-6
history4 = fit(epochs, lr, model, train_loader, val_loader)
Epoch [20], val_loss: 5.9057 Epoch [40], val_loss: 5.7914 Epoch [50], val_loss: 5.7339
In [102]:
epochs = 50
lr = 1e-7
history5 = fit(epochs, lr, model, train_loader, val_loader)
Epoch [20], val_loss: 5.7225 Epoch [40], val_loss: 5.7112 Epoch [50], val_loss: 5.7052
In [103]:
val_loss = evaluate(model, val_loader)
val_loss
Out[103]:
{'val_loss': 5.705215930938721}
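Since fit returns the per-epoch validation losses, the five runs above can be visualized together. A minimal sketch using the histories collected above:

# Plot validation loss across all training runs
history = history1 + history2 + history3 + history4 + history5
plt.plot([r['val_loss'] for r in history])
plt.xlabel('epoch')
plt.ylabel('val_loss')
plt.title('Validation loss vs. epochs')
plt.show()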
In [104]:
jovian.log_metrics(val_loss=val_loss['val_loss'])  # log the scalar value, not the whole dict
[jovian] Metrics logged.
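It can also help to record the hyperparameters that produced this metric. A sketch using Jovian's hyperparameter logging (check the jovian docs for the exact signature):

# Record the training configuration alongside the metric
jovian.log_hyperparams(lr=lr, epochs=epochs, batch_size=batch_size)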
In [105]:
jovian.commit(project=project_name, files=["weatherHistory.csv"], environment=None)
[jovian] Detected Colab notebook... [jovian] Uploading colab notebook to Jovian... [jovian] Uploading additional files... [jovian] Attaching records (metrics, hyperparameters, dataset etc.) [jovian] Committed successfully! https://jovian.ai/dwiknrd/course-project-regression-pytorch

Step 5: Make predictions using the trained model

In [106]:
def predict_single(input, target, model):
    inputs = input.unsqueeze(0)              # add a batch dimension
    predictions = model(inputs)              # forward pass
    prediction = predictions[0].detach()     # drop the batch dimension & detach from the graph
    print("Input:", input)
    print("Target:", target)
    print("Prediction:", prediction)
In [107]:
input, target = val_ds[0]
predict_single(input, target, model)
Input: tensor([2.0306e+01, 4.4000e-01, 5.7960e-01, 1.2500e+02, 1.1399e+01, 1.0278e+03]) Target: tensor([20.3056]) Prediction: tensor([16.6850])
In [113]:
input, target = val_ds[68]
predict_single(input, target, model)
Input: tensor([1.8011e+01, 6.8000e-01, 3.2200e-01, 1.5700e+02, 9.9820e+00, 1.0148e+03]) Target: tensor([18.0111]) Prediction: tensor([15.9883])
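The same helper works for any validation row; for a broader look, predictions and targets can be compared over a whole batch. A sketch, not part of the original notebook:

# Compare predictions and targets for the first few rows of one validation batch
with torch.no_grad():
    xb, yb = next(iter(val_loader))
    preds = model(xb)
    for p, t in zip(preds[:5], yb[:5]):
        print(f"prediction: {p.item():.2f}   target: {t.item():.2f}")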
In [ ]: