In this assignment we're going to use information like a person's age, sex, BMI, number of children and smoking habits to predict the yearly cost of medical bills. This kind of model is useful for insurance companies to determine the yearly insurance premium for a person. The dataset for this problem is taken from: https://www.kaggle.com/mirichoi0218/insurance
We will create a model with the following steps:
This assignment builds upon the concepts from the first 2 lectures. It will help to review these Jupyter notebooks:
As you go through this notebook, you will find a ??? in certain places. Your job is to replace the ??? with appropriate code or values, to ensure that the notebook runs properly end-to-end. In some cases, you'll be required to choose some hyperparameters (learning rate, batch size etc.). Try to experiment with the hyperparameters to get the lowest loss.
# Uncomment and run the commands below if imports fail
# !conda install numpy pytorch torchvision cpuonly -c pytorch -y
# !pip install matplotlib --upgrade --quiet
!pip install jovian --upgrade --quiet
import torch
import jovian
import torchvision
import torch.nn as nn
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import torch.nn.functional as F
from torchvision.datasets.utils import download_url
from torch.utils.data import DataLoader, TensorDataset, random_split
import random
project_name='02-insurance-linear-regression' # will be used by jovian.commit
Let us begin by downloading the data. We'll use the download_url function from torchvision to get the data as a CSV (comma-separated values) file.
DATASET_URL = "https://hub.jovian.ml/wp-content/uploads/2020/05/insurance.csv"
DATA_FILENAME = "insurance.csv"
download_url(DATASET_URL, '.')
Using downloaded and verified file: ./insurance.csv
To load the dataset into memory, we'll use the read_csv function from the pandas library. The data will be loaded as a Pandas dataframe. See this short tutorial to learn more: https://data36.com/pandas-tutorial-1-basics-reading-data-files-dataframes-data-selection/
dataframe_raw = pd.read_csv(DATA_FILENAME)
dataframe_raw.head()
We're going to do a slight customization of the data, so that every participant receives a slightly different version of the dataset. Fill in your name below as a string (enter at least 5 characters).
your_name = 'Hitesh' # at least 5 characters
The customize_dataset function will customize the dataset slightly, using your name as a source of random numbers.
def customize_dataset(dataframe_raw, rand_str):
    dataframe = dataframe_raw.copy(deep=True)
    # drop some rows
    dataframe = dataframe.sample(int(0.95*len(dataframe)), random_state=int(ord(rand_str[0])))
    # scale input
    dataframe.bmi = dataframe.bmi * ord(rand_str[1])/100.
    # scale target
    dataframe.charges = dataframe.charges * ord(rand_str[2])/100.
    # drop column
    if ord(rand_str[3]) % 2 == 1:
        dataframe = dataframe.drop(['region'], axis=1)
    return dataframe
dataframe = customize_dataset(dataframe_raw, your_name)
dataframe.head()
Let us answer some basic questions about the dataset.
Q: How many rows does the dataset have?
num_rows = len(dataframe.index)
print(num_rows)
1271
Q: How many columns does the dataset have?
num_cols = len(dataframe.columns)
print(num_cols)
6
Q: What are the column titles of the input variables?
input_cols = dataframe.columns.values[:-1]
Q: Which of the input columns are non-numeric or categorical variables?
Hint: sex is one of them. List the columns that are not numbers.
categorical_cols = dataframe.select_dtypes(exclude=[np.number]).columns.values
Q: What are the column titles of output/target variable(s)?
output_cols = dataframe.columns.values[-1:]
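A quick optional sanity check: print the three answers and compare them against dataframe.head() above.
print("inputs:", input_cols)
print("categorical:", categorical_cols)
print("outputs:", output_cols)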
Q: (Optional) What is the minimum, maximum and average value of the charges
column? Can you show the distribution of values in a graph?
Use this data visualization cheatsheet for reference: https://jovian.ml/aakashns/dataviz-cheatsheet
# print("min=", dataframe['charges'].min())
# print("mean=", dataframe['charges'].mean())
# print("max=", dataframe['charges'].max())
lst_plot = [dataframe['charges'].describe()[i] for i in [3, 1, -1]]  # positions 3, 1, -1 of describe() are min, mean, max
# Histogram (second method, commented out; see note below)
# pd.Series(lst_plot).hist(grid=False, bins=15, figsize=(20,5))
print(lst_plot)
df_plot = pd.DataFrame(lst_plot, columns=['Values'])
df_plot.plot.bar()  # bar chart of min, mean, max
[1301.373724, 15370.231757713336, 73973.6964916]
<matplotlib.axes._subplots.AxesSubplot at 0x1a32dcfd50>
Note: There are multiple ways to visualize the min, average and max values. A second method is commented out in the cell above; to check its output, uncomment it and re-run the cell. It is commented out because it produced a somewhat ambiguous-looking graph, but you can try to refine it or play around with it.
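For the distribution itself (rather than just the three summary values), here is a minimal histogram sketch using the matplotlib import above; the bin count of 30 is an arbitrary choice:
plt.figure(figsize=(10, 4))
plt.hist(dataframe['charges'], bins=30)
plt.xlabel('charges')
plt.ylabel('number of people')
plt.title('Distribution of charges')
plt.show()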
Remember to commit your notebook to Jovian after every step, so that you don't lose your work.
jovian.commit(project=project_name, environment=None)
[jovian] Attempting to save notebook..
[jovian] Updating notebook "hiteshkumar-1mv17cs042/02-insurance-linear-regression" on https://jovian.ml/
[jovian] Uploading notebook..
[jovian] Committed successfully! https://jovian.ml/hiteshkumar-1mv17cs042/02-insurance-linear-regression
We need to convert the data from the Pandas dataframe into PyTorch tensors for training. To do this, the first step is to convert it to numpy arrays. If you've filled out input_cols, categorical_cols and output_cols correctly, the following function will perform the conversion to numpy arrays.
def dataframe_to_arrays(dataframe):
    # Make a copy of the original dataframe
    dataframe1 = dataframe.copy(deep=True)
    # Convert non-numeric categorical columns to numbers
    for col in categorical_cols:
        dataframe1[col] = dataframe1[col].astype('category').cat.codes
    # Extract inputs & outputs as numpy arrays
    inputs_array = dataframe1[input_cols].to_numpy()
    targets_array = dataframe1[output_cols].to_numpy()
    return inputs_array, targets_array
Read through the Pandas documentation to understand how we're converting categorical variables into numbers.
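As a toy illustration of what .astype('category').cat.codes does (not part of the assignment; the values here are made up):
s = pd.Series(['male', 'female', 'female', 'male'])
print(s.astype('category').cat.codes)  # 1, 0, 0, 1: categories are sorted, so 'female' -> 0 and 'male' -> 1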
inputs_array, targets_array = dataframe_to_arrays(dataframe)
inputs_array, targets_array
(array([[38. , 1. , 40.3095 , 3. , 1. ],
[47. , 0. , 27.93 , 2. , 0. ],
[19. , 1. , 34.755 , 0. , 0. ],
...,
[23. , 1. , 25.03725, 0. , 0. ],
[30. , 0. , 23.04225, 1. , 0. ],
[56. , 0. , 26.9325 , 0. , 0. ]]),
array([[48661.123156 ],
[11270.37556 ],
[26776.2281828],
...,
[ 2778.398998 ],
[ 5473.116118 ],
[13286.66494 ]]))
Q: Convert the numpy arrays inputs_array and targets_array into PyTorch tensors. Make sure that the data type is torch.float32.
Earlier trials, and why they fall short (the same applies to targets):
1st trial: inputs = torch.from_numpy(inputs_array)
This runs, but produces torch.float64 tensors, because numpy arrays of floats default to 64 bits.
2nd trial: inputs = torch.from_numpy(inputs_array, dtype=torch.float32)
This raises a TypeError: torch.from_numpy does not accept a dtype argument.
A successful alternative is inputs = torch.from_numpy(inputs_array.astype(np.float32)); the version used below passes dtype to torch.tensor instead:
inputs = torch.tensor(inputs_array, dtype=torch.float32)
targets = torch.tensor(targets_array, dtype=torch.float32)
inputs.dtype, targets.dtype
(torch.float32, torch.float32)
Next, we need to create PyTorch datasets & data loaders for training & validation. We'll start by creating a TensorDataset.
dataset = TensorDataset(inputs, targets)
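Optionally, index the dataset once to confirm that each sample is an (input, target) pair of tensors:
dataset[0]  # returns (input features, target charge) for the first row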
Q: Pick a number between 0.1 and 0.2 to determine the fraction of data that will be used for creating the validation set. Then use random_split to create training & validation datasets.
Failed attempts and the errors they produced:
val_percent = torch.rand([0.1, 0.2])
TypeError: rand(): argument 'size' must be tuple of ints, but found element of type float at pos 1
(torch.rand expects tensor dimensions, not an interval; random.uniform is the right tool for sampling a float from a range.)
train_ds, val_ds = torch.utils.data.random_split(dataset, train_size)
val_ds = torch.utils.data.random_split(dataset, val_size)
TypeError: 'int' object is not iterable
(random_split expects a list of split lengths, e.g. [train_size, val_size], not a single integer.)
val_percent = random.uniform(0.1, 0.2) # between 0.1 and 0.2
val_size = int(num_rows * val_percent)
train_size = num_rows - val_size
train_ds, val_ds = torch.utils.data.random_split(dataset, [train_size, val_size]) # Use the random_split function to split dataset into 2 parts of the desired length
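A one-line sanity check (optional) that the split covers every row:
len(train_ds), len(val_ds)  # the two lengths should add up to num_rows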
Finally, we can create data loaders for training & validation.
Q: Pick a batch size for the data loader.
batch_size = random.randint(5, 100)  # pick a random batch size between 5 and 100
train_loader = DataLoader(train_ds, batch_size, shuffle=True)
val_loader = DataLoader(val_ds, batch_size)
Let's look at a batch of data to verify everything is working fine so far.
for xb, yb in train_loader:
    print("inputs:", xb)
    print("targets:", yb)
    break
inputs: tensor([[47.0000, 0.0000, 33.6000, 1.0000, 0.0000],
[33.0000, 1.0000, 44.5200, 5.0000, 0.0000],
[63.0000, 1.0000, 35.3430, 3.0000, 0.0000],
[21.0000, 0.0000, 35.3115, 2.0000, 0.0000],
[57.0000, 1.0000, 33.1170, 0.0000, 0.0000],
[19.0000, 1.0000, 31.7625, 0.0000, 1.0000],
[42.0000, 0.0000, 34.5135, 0.0000, 0.0000],
[50.0000, 0.0000, 29.5680, 3.0000, 0.0000],
[63.0000, 1.0000, 34.7550, 0.0000, 0.0000],
[20.0000, 0.0000, 25.6410, 0.0000, 1.0000],
[19.0000, 1.0000, 31.9200, 0.0000, 0.0000],
[19.0000, 0.0000, 32.0197, 0.0000, 0.0000],
[18.0000, 0.0000, 40.0785, 0.0000, 0.0000],
[61.0000, 0.0000, 34.9965, 4.0000, 0.0000],
[62.0000, 1.0000, 32.4188, 3.0000, 1.0000],
[30.0000, 0.0000, 24.0397, 1.0000, 0.0000],
[43.0000, 1.0000, 36.7080, 1.0000, 1.0000],
[18.0000, 0.0000, 42.1942, 0.0000, 0.0000],
[26.0000, 1.0000, 35.6107, 1.0000, 0.0000],
[56.0000, 0.0000, 35.5110, 2.0000, 0.0000],
[31.0000, 0.0000, 27.0270, 0.0000, 0.0000],
[31.0000, 1.0000, 30.0247, 1.0000, 0.0000],
[41.0000, 1.0000, 37.5375, 1.0000, 1.0000],
[27.0000, 0.0000, 22.5435, 0.0000, 0.0000],
[18.0000, 1.0000, 34.9965, 0.0000, 0.0000],
[49.0000, 1.0000, 39.3855, 2.0000, 0.0000],
[44.0000, 1.0000, 31.7100, 2.0000, 1.0000],
[47.0000, 0.0000, 27.4312, 1.0000, 1.0000],
[27.0000, 0.0000, 25.3050, 0.0000, 0.0000],
[61.0000, 0.0000, 23.1420, 0.0000, 0.0000],
[57.0000, 0.0000, 33.4162, 0.0000, 0.0000],
[49.0000, 1.0000, 37.6530, 0.0000, 0.0000],
[49.0000, 0.0000, 33.4950, 5.0000, 0.0000],
[20.0000, 0.0000, 30.2243, 0.0000, 0.0000],
[18.0000, 1.0000, 35.4585, 1.0000, 0.0000],
[63.0000, 0.0000, 22.7430, 0.0000, 0.0000],
[56.0000, 0.0000, 44.0055, 0.0000, 0.0000],
[52.0000, 0.0000, 26.1030, 0.0000, 0.0000],
[24.0000, 1.0000, 35.3115, 4.0000, 0.0000],
[49.0000, 0.0000, 25.0373, 3.0000, 1.0000],
[21.0000, 1.0000, 32.5710, 0.0000, 0.0000],
[27.0000, 0.0000, 21.0473, 3.0000, 1.0000],
[49.0000, 1.0000, 30.1245, 3.0000, 0.0000],
[22.0000, 0.0000, 24.3390, 0.0000, 0.0000],
[24.0000, 0.0000, 35.0122, 0.0000, 0.0000],
[31.0000, 1.0000, 31.3005, 0.0000, 1.0000],
[38.0000, 0.0000, 20.9475, 2.0000, 0.0000],
[22.0000, 1.0000, 39.5010, 1.0000, 1.0000],
[58.0000, 1.0000, 26.4338, 0.0000, 0.0000],
[45.0000, 0.0000, 37.0650, 0.0000, 0.0000],
[18.0000, 1.0000, 16.7580, 0.0000, 0.0000],
[20.0000, 1.0000, 29.4263, 1.0000, 1.0000],
[21.0000, 0.0000, 22.9425, 1.0000, 1.0000],
[36.0000, 0.0000, 23.7300, 2.0000, 1.0000],
[46.0000, 0.0000, 29.1060, 1.0000, 0.0000],
[42.0000, 1.0000, 32.8177, 0.0000, 0.0000],
[19.0000, 0.0000, 30.0300, 5.0000, 0.0000],
[53.0000, 0.0000, 26.0347, 1.0000, 0.0000],
[20.0000, 1.0000, 34.9965, 0.0000, 0.0000],
[29.0000, 0.0000, 40.7715, 3.0000, 0.0000],
[42.0000, 1.0000, 25.8720, 0.0000, 1.0000],
[47.0000, 1.0000, 37.8840, 1.0000, 1.0000],
[29.0000, 1.0000, 39.1545, 2.0000, 0.0000],
[51.0000, 0.0000, 22.6380, 1.0000, 0.0000],
[24.0000, 1.0000, 37.6530, 0.0000, 0.0000],
[27.0000, 0.0000, 37.8840, 0.0000, 1.0000],
[18.0000, 0.0000, 38.6925, 0.0000, 0.0000],
[21.0000, 1.0000, 27.0322, 2.0000, 0.0000],
[28.0000, 0.0000, 27.0900, 0.0000, 0.0000],
[27.0000, 1.0000, 34.2142, 3.0000, 0.0000],
[49.0000, 0.0000, 31.4212, 0.0000, 0.0000],
[60.0000, 1.0000, 30.0247, 0.0000, 0.0000],
[22.0000, 1.0000, 38.9235, 2.0000, 1.0000],
[56.0000, 1.0000, 41.5800, 0.0000, 0.0000],
[60.0000, 0.0000, 28.9275, 0.0000, 0.0000],
[48.0000, 1.0000, 36.0150, 3.0000, 0.0000],
[51.0000, 0.0000, 21.6300, 0.0000, 0.0000],
[18.0000, 0.0000, 40.5983, 2.0000, 0.0000],
[19.0000, 1.0000, 23.7405, 0.0000, 0.0000],
[18.0000, 0.0000, 32.9175, 0.0000, 0.0000],
[48.0000, 1.0000, 31.0800, 0.0000, 0.0000],
[25.0000, 1.0000, 26.2342, 2.0000, 0.0000],
[29.0000, 1.0000, 40.8870, 1.0000, 0.0000],
[49.0000, 1.0000, 27.1320, 1.0000, 0.0000],
[19.0000, 0.0000, 29.7150, 0.0000, 1.0000],
[41.0000, 1.0000, 42.2730, 0.0000, 0.0000],
[57.0000, 1.0000, 30.4237, 0.0000, 1.0000]])
targets: tensor([[ 9919.5625],
[ 7732.8418],
[17587.3809],
[ 4152.6011],
[13169.7441],
[37756.0742],
[ 8178.0249],
[12415.0654],
[15536.7568],
[30305.7832],
[ 1457.3069],
[ 2468.9800],
[ 1892.7352],
[42433.1289],
[54193.0703],
[ 5474.6479],
[47599.6953],
[ 2572.2642],
[ 3819.3347],
[14666.3184],
[ 4357.6812],
[ 4922.5645],
[46717.4297],
[ 3890.0256],
[ 1317.6912],
[10793.4541],
[45238.3125],
[27145.5156],
[ 3449.9861],
[15794.9756],
[13737.4434],
[ 9424.3135],
[13401.3682],
[ 2850.3650],
[ 2001.6406],
[16761.8320],
[12868.6025],
[31456.8730],
[19868.9746],
[27964.0195],
[19240.3379],
[19047.7734],
[11906.7529],
[ 3169.0181],
[ 3312.3076],
[22446.4277],
[ 8275.3271],
[43111.5898],
[13840.1055],
[ 8523.8447],
[ 1965.9639],
[20370.0410],
[17816.5605],
[21585.5840],
[ 9549.8613],
[ 7376.1807],
[ 5437.8447],
[12692.8730],
[ 1614.1733],
[ 5960.3779],
[22638.0273],
[48964.9219],
[ 4707.4146],
[11431.9521],
[ 2304.8428],
[43075.3203],
[ 1890.6068],
[ 3804.6475],
[ 3667.2866],
[ 5622.4272],
[10426.2637],
[35101.5938],
[43481.9609],
[12297.6377],
[15331.8301],
[11093.1133],
[10747.1641],
[ 3936.2935],
[ 1889.0262],
[ 1881.7386],
[24629.3320],
[26960.1113],
[ 4026.8352],
[10767.6777],
[19814.0527],
[ 6622.6309],
[31573.3867]])
Let's save our work by committing to Jovian.
jovian.commit(project=project_name, environment=None)
[jovian] Attempting to save notebook..
[jovian] Updating notebook "hiteshkumar-1mv17cs042/02-insurance-linear-regression" on https://jovian.ml/
[jovian] Uploading notebook..
[jovian] Attaching records (metrics, hyperparameters, dataset etc.)
[jovian] Committed successfully! https://jovian.ml/hiteshkumar-1mv17cs042/02-insurance-linear-regression
Our model itself is a fairly straightforward linear regression (we'll build more complex models in the next assignment).
input_size = len(input_cols)
output_size = len(output_cols)
Q: Complete the class definition below by filling out the constructor (__init__), forward, training_step and validation_step methods.
Hint: Think carefully about picking a good loss function (it's not cross entropy). Maybe try 2-3 of them and see which one works best. See https://pytorch.org/docs/stable/nn.functional.html#loss-functions
class InsuranceModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(input_size, output_size)  # fill this (hint: use input_size & output_size defined above)

    def forward(self, xb):
        out = self.linear(xb)  # fill this
        return out

    def training_step(self, batch):
        inputs, targets = batch
        # Generate predictions
        out = self(inputs)
        # Calculate loss
        loss = F.smooth_l1_loss(out, targets)  # loss on first run = {'val_loss': 15884.009765625}
        return loss

    def validation_step(self, batch):
        inputs, targets = batch
        # Generate predictions
        out = self(inputs)
        # Calculate loss
        loss = F.smooth_l1_loss(out, targets)  # fill this
        return {'val_loss': loss.detach()}

    def validation_epoch_end(self, outputs):
        batch_losses = [x['val_loss'] for x in outputs]
        epoch_loss = torch.stack(batch_losses).mean()  # Combine losses
        return {'val_loss': epoch_loss.item()}

    def epoch_end(self, epoch, result, num_epochs):
        # Print result every 20th epoch
        if (epoch+1) % 20 == 0 or epoch == num_epochs-1:
            print("Epoch [{}], val_loss: {:.4f}".format(epoch+1, result['val_loss']))
Other loss functions tried in training_step / validation_step, with the validation loss (or error) each produced:
loss = F.l1_loss(out, targets)                           # {'val_loss': 15876.384765625}
loss = F.mse_loss(out, targets)                          # {'val_loss': 462317440.0}
loss = F.binary_cross_entropy_with_logits(out, targets)  # {'val_loss': 6555.9228515625}
loss = F.poisson_nll_loss(out, targets)                  # {'val_loss': 29691.9609375}
loss = F.hinge_embedding_loss(out, targets)              # {'val_loss': 1.0}
loss = F.multilabel_soft_margin_loss(out, targets)       # {'val_loss': -160699.9375}
loss = F.soft_margin_loss(out, targets)                  # {'val_loss': inf}
loss = F.kl_div(out, targets)                            # {'val_loss': 472210.84375}, plus a UserWarning that reduction='mean' divides by both the batch size and the support size, while 'batchmean' aligns with the KL divergence definition
loss = F.binary_cross_entropy(out, targets)              # RuntimeError: all elements of input should be between 0 and 1
loss = F.cross_entropy(out, targets)                     # RuntimeError: 1D target tensor expected, multi-target not supported
loss = F.nll_loss(out, targets)                          # RuntimeError: 1D target tensor expected, multi-target not supported
Note that most of these are classification losses, so even when they run, the values they report are not meaningful for a regression target; l1_loss and smooth_l1_loss are the sensible candidates here.
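If you'd like to repeat such a comparison without editing the class each time, here is one possible pattern (a sketch, not part of the assignment; it uses a throwaway untrained model and only the regression-appropriate candidates):
m = InsuranceModel()  # throwaway model, used only to compare losses on one batch
xb, yb = next(iter(val_loader))
with torch.no_grad():
    out = m(xb)
    for name, fn in [('l1', F.l1_loss), ('mse', F.mse_loss), ('smooth_l1', F.smooth_l1_loss)]:
        print(name, fn(out, yb).item())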
Let us create a model using the InsuranceModel class. You may need to come back later and re-run the next cell to reinitialize the model, in case the loss becomes nan or infinity.
model = InsuranceModel()
Let's check out the weights and biases of the model using model.parameters.
list(model.parameters())
[Parameter containing:
tensor([[-0.1668, 0.0799, -0.4268, 0.4045, 0.3948]], requires_grad=True),
Parameter containing:
tensor([0.1343], requires_grad=True)]
One final commit before we train the model.
jovian.commit(project=project_name, environment=None)
[jovian] Attempting to save notebook..
[jovian] Updating notebook "hiteshkumar-1mv17cs042/02-insurance-linear-regression" on https://jovian.ml/
[jovian] Uploading notebook..
[jovian] Attaching records (metrics, hyperparameters, dataset etc.)
[jovian] Committed successfully! https://jovian.ml/hiteshkumar-1mv17cs042/02-insurance-linear-regression
To train our model, we'll use the same fit function explained in the lecture. That's the benefit of defining a generic training loop: you can use it for any problem.
def evaluate(model, val_loader):
    outputs = [model.validation_step(batch) for batch in val_loader]
    return model.validation_epoch_end(outputs)

def fit(epochs, lr, model, train_loader, val_loader, opt_func=torch.optim.SGD):
    history = []
    optimizer = opt_func(model.parameters(), lr)
    for epoch in range(epochs):
        # Training Phase
        for batch in train_loader:
            loss = model.training_step(batch)
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
        # Validation phase
        result = evaluate(model, val_loader)
        model.epoch_end(epoch, result, epochs)
        history.append(result)
    return history
Q: Use the evaluate function to calculate the loss on the validation set before training.
result = evaluate(model, val_loader)  # Use the evaluate function
print(result)
{'val_loss': 15042.2607421875}
We are now ready to train the model. You may need to run the training loop many times, for different numbers of epochs and with different learning rates, to get a good result. Also, if your loss becomes too large (or nan), you may have to re-initialize the model by running the cell model = InsuranceModel(). Experiment with this for a while, and try to get to as low a loss as possible.
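Between runs, a small guard like this can tell you when to re-initialize (a minimal sketch; math.isnan and math.isinf come from Python's standard library, and last_loss is a hypothetical name):
import math
last_loss = evaluate(model, val_loader)['val_loss']
if math.isnan(last_loss) or math.isinf(last_loss):
    model = InsuranceModel()  # start over from fresh random weights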
Q: Train the model 4-5 times with different learning rates & for different numbers of epochs.
Hint: Vary learning rates by orders of 10 (e.g. 1e-2, 1e-3, 1e-4, 1e-5, 1e-6) to figure out what works.
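One way to organize that sweep (a sketch; the epoch count of 200 is an arbitrary choice, and the fresh model_lr per rate is a hypothetical name that keeps the comparison fair):
for lr in [1e-2, 1e-3, 1e-4, 1e-5, 1e-6]:
    model_lr = InsuranceModel()  # fresh model for this learning rate
    history = fit(200, lr, model_lr, train_loader, val_loader)
    print('lr =', lr, '-> final val_loss =', history[-1]['val_loss'])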
epochs = 2000
lr = 1e-1
history1 = fit(epochs, lr, model, train_loader, val_loader)
Epoch [20], val_loss: 7999.7407
Epoch [40], val_loss: 7783.5352
Epoch [60], val_loss: 7735.3999
Epoch [80], val_loss: 7735.4897
Epoch [100], val_loss: 7736.3774
Epoch [120], val_loss: 7728.5171
Epoch [140], val_loss: 7728.7300
Epoch [160], val_loss: 7725.4399
Epoch [180], val_loss: 7720.0679
Epoch [200], val_loss: 7718.6460
Epoch [220], val_loss: 7717.1860
Epoch [240], val_loss: 7711.8574
Epoch [260], val_loss: 7708.7812
Epoch [280], val_loss: 7707.1665
Epoch [300], val_loss: 7700.9595
Epoch [320], val_loss: 7699.8286
Epoch [340], val_loss: 7694.5991
Epoch [360], val_loss: 7688.9160
Epoch [380], val_loss: 7688.8359
Epoch [400], val_loss: 7686.6997
Epoch [420], val_loss: 7680.9194
Epoch [440], val_loss: 7678.4761
Epoch [460], val_loss: 7675.0259
Epoch [480], val_loss: 7675.1616
Epoch [500], val_loss: 7669.3452
Epoch [520], val_loss: 7667.7192
Epoch [540], val_loss: 7664.6836
Epoch [560], val_loss: 7664.3789
Epoch [580], val_loss: 7660.0571
Epoch [600], val_loss: 7660.0293
Epoch [620], val_loss: 7655.3203
Epoch [640], val_loss: 7653.0386
Epoch [660], val_loss: 7650.8110
Epoch [680], val_loss: 7648.5781
Epoch [700], val_loss: 7646.5132
Epoch [720], val_loss: 7645.7612
Epoch [740], val_loss: 7644.6958
Epoch [760], val_loss: 7641.5195
Epoch [780], val_loss: 7639.8008
Epoch [800], val_loss: 7638.7134
Epoch [820], val_loss: 7636.5586
Epoch [840], val_loss: 7636.0430
Epoch [860], val_loss: 7633.6113
Epoch [880], val_loss: 7632.4609
Epoch [900], val_loss: 7630.3262
Epoch [920], val_loss: 7628.9858
Epoch [940], val_loss: 7630.1616
Epoch [960], val_loss: 7625.5181
Epoch [980], val_loss: 7623.9995
Epoch [1000], val_loss: 7622.5288
Epoch [1020], val_loss: 7620.9243
Epoch [1040], val_loss: 7619.2090
Epoch [1060], val_loss: 7618.2632
Epoch [1080], val_loss: 7616.2285
Epoch [1100], val_loss: 7614.8569
Epoch [1120], val_loss: 7613.1655
Epoch [1140], val_loss: 7611.6523
Epoch [1160], val_loss: 7612.5688
Epoch [1180], val_loss: 7608.9492
Epoch [1200], val_loss: 7606.8853
Epoch [1220], val_loss: 7605.3970
Epoch [1240], val_loss: 7604.8008
Epoch [1260], val_loss: 7602.7173
Epoch [1280], val_loss: 7601.0962
Epoch [1300], val_loss: 7599.6172
Epoch [1320], val_loss: 7598.1431
Epoch [1340], val_loss: 7602.3750
Epoch [1360], val_loss: 7598.1543
Epoch [1380], val_loss: 7594.9253
Epoch [1400], val_loss: 7595.6128
Epoch [1420], val_loss: 7593.8438
Epoch [1440], val_loss: 7590.3926
Epoch [1460], val_loss: 7591.4038
Epoch [1480], val_loss: 7588.2017
Epoch [1500], val_loss: 7587.4702
Epoch [1520], val_loss: 7584.6133
Epoch [1540], val_loss: 7584.9668
Epoch [1560], val_loss: 7582.3892
Epoch [1580], val_loss: 7585.9985
Epoch [1600], val_loss: 7581.1030
Epoch [1620], val_loss: 7580.7954
Epoch [1640], val_loss: 7580.5239
Epoch [1660], val_loss: 7576.8921
Epoch [1680], val_loss: 7574.6772
Epoch [1700], val_loss: 7577.2051
Epoch [1720], val_loss: 7572.4258
Epoch [1740], val_loss: 7572.6714
Epoch [1760], val_loss: 7571.6758
Epoch [1780], val_loss: 7572.8984
Epoch [1800], val_loss: 7567.7305
Epoch [1820], val_loss: 7568.3638
Epoch [1840], val_loss: 7565.3066
Epoch [1860], val_loss: 7564.7261
Epoch [1880], val_loss: 7567.5742
Epoch [1900], val_loss: 7561.0688
Epoch [1920], val_loss: 7562.1699
Epoch [1940], val_loss: 7562.6152
Epoch [1960], val_loss: 7560.4595
Epoch [1980], val_loss: 7558.0659
Epoch [2000], val_loss: 7558.2485
epochs = 1500
lr = 1e-2
history2 = fit(epochs, lr, model, train_loader, val_loader)
Epoch [20], val_loss: 7557.1626
Epoch [40], val_loss: 7556.8228
Epoch [60], val_loss: 7556.9478
Epoch [80], val_loss: 7556.3218
Epoch [100], val_loss: 7556.2183
Epoch [120], val_loss: 7556.5522
Epoch [140], val_loss: 7556.6851
Epoch [160], val_loss: 7555.9819
Epoch [180], val_loss: 7556.2319
Epoch [200], val_loss: 7556.0327
Epoch [220], val_loss: 7555.6055
Epoch [240], val_loss: 7555.7905
Epoch [260], val_loss: 7555.0327
Epoch [280], val_loss: 7555.3047
Epoch [300], val_loss: 7555.0679
Epoch [320], val_loss: 7554.8423
Epoch [340], val_loss: 7554.9390
Epoch [360], val_loss: 7554.8257
Epoch [380], val_loss: 7554.4492
Epoch [400], val_loss: 7554.5376
Epoch [420], val_loss: 7554.2759
Epoch [440], val_loss: 7554.4038
Epoch [460], val_loss: 7554.3843
Epoch [480], val_loss: 7554.2358
Epoch [500], val_loss: 7554.2329
Epoch [520], val_loss: 7553.9585
Epoch [540], val_loss: 7553.6680
Epoch [560], val_loss: 7553.5039
Epoch [580], val_loss: 7553.6719
Epoch [600], val_loss: 7553.4458
Epoch [620], val_loss: 7553.4976
Epoch [640], val_loss: 7553.1040
Epoch [660], val_loss: 7553.5024
Epoch [680], val_loss: 7552.6914
Epoch [700], val_loss: 7553.3687
Epoch [720], val_loss: 7552.8774
Epoch [740], val_loss: 7552.0864
Epoch [760], val_loss: 7552.4253
Epoch [780], val_loss: 7552.0718
Epoch [800], val_loss: 7552.2227
Epoch [820], val_loss: 7552.1538
Epoch [840], val_loss: 7551.8853
Epoch [860], val_loss: 7552.0703
Epoch [880], val_loss: 7551.5933
Epoch [900], val_loss: 7552.1694
Epoch [920], val_loss: 7551.8647
Epoch [940], val_loss: 7551.6211
Epoch [960], val_loss: 7551.7817
Epoch [980], val_loss: 7551.0825
Epoch [1000], val_loss: 7551.3125
Epoch [1020], val_loss: 7550.8833
Epoch [1040], val_loss: 7550.5879
Epoch [1060], val_loss: 7551.3169
Epoch [1080], val_loss: 7550.7290
Epoch [1100], val_loss: 7550.4868
Epoch [1120], val_loss: 7550.2124
Epoch [1140], val_loss: 7550.5454
Epoch [1160], val_loss: 7550.2681
Epoch [1180], val_loss: 7550.2676
Epoch [1200], val_loss: 7549.9077
Epoch [1220], val_loss: 7549.9805
Epoch [1240], val_loss: 7549.7656
Epoch [1260], val_loss: 7549.4985
Epoch [1280], val_loss: 7549.5078
Epoch [1300], val_loss: 7549.7808
Epoch [1320], val_loss: 7549.3267
Epoch [1340], val_loss: 7549.5649
Epoch [1360], val_loss: 7549.1016
Epoch [1380], val_loss: 7548.4507
Epoch [1400], val_loss: 7548.8618
Epoch [1420], val_loss: 7548.5688
Epoch [1440], val_loss: 7548.5103
Epoch [1460], val_loss: 7548.2368
Epoch [1480], val_loss: 7548.2925
Epoch [1500], val_loss: 7547.9688
epochs = 1000
lr = 1e-3
history3 = fit(epochs, lr, model, train_loader, val_loader)
Epoch [20], val_loss: 7548.1641
Epoch [40], val_loss: 7548.0391
Epoch [60], val_loss: 7548.0776
Epoch [80], val_loss: 7548.2368
Epoch [100], val_loss: 7548.2539
Epoch [120], val_loss: 7548.2407
Epoch [140], val_loss: 7548.3188
Epoch [160], val_loss: 7548.2515
Epoch [180], val_loss: 7548.2202
Epoch [200], val_loss: 7548.1841
Epoch [220], val_loss: 7548.1797
Epoch [240], val_loss: 7548.1255
Epoch [260], val_loss: 7548.0952
Epoch [280], val_loss: 7548.1309
Epoch [300], val_loss: 7548.1157
Epoch [320], val_loss: 7548.2095
Epoch [340], val_loss: 7548.1035
Epoch [360], val_loss: 7548.0054
Epoch [380], val_loss: 7548.0078
Epoch [400], val_loss: 7548.0591
Epoch [420], val_loss: 7548.0220
Epoch [440], val_loss: 7548.0864
Epoch [460], val_loss: 7548.0757
Epoch [480], val_loss: 7548.0767
Epoch [500], val_loss: 7547.8906
Epoch [520], val_loss: 7547.9517
Epoch [540], val_loss: 7547.9702
Epoch [560], val_loss: 7548.0078
Epoch [580], val_loss: 7548.0288
Epoch [600], val_loss: 7547.9702
Epoch [620], val_loss: 7547.9468
Epoch [640], val_loss: 7547.8501
Epoch [660], val_loss: 7547.8359
Epoch [680], val_loss: 7547.8257
Epoch [700], val_loss: 7547.7954
Epoch [720], val_loss: 7547.8325
Epoch [740], val_loss: 7547.8179
Epoch [760], val_loss: 7547.7109
Epoch [780], val_loss: 7547.6836
Epoch [800], val_loss: 7547.7866
Epoch [820], val_loss: 7547.9634
Epoch [840], val_loss: 7547.8398
Epoch [860], val_loss: 7547.8359
Epoch [880], val_loss: 7547.7876
Epoch [900], val_loss: 7547.7593
Epoch [920], val_loss: 7547.7695
Epoch [940], val_loss: 7547.8203
Epoch [960], val_loss: 7547.7227
Epoch [980], val_loss: 7547.7524
Epoch [1000], val_loss: 7547.6782
epochs = 500
lr = 1e-4
history4 = fit(epochs, lr, model, train_loader, val_loader)
Epoch [20], val_loss: 7547.6743
Epoch [40], val_loss: 7547.6602
Epoch [60], val_loss: 7547.6587
Epoch [80], val_loss: 7547.6655
Epoch [100], val_loss: 7547.6655
Epoch [120], val_loss: 7547.6733
Epoch [140], val_loss: 7547.6655
Epoch [160], val_loss: 7547.6602
Epoch [180], val_loss: 7547.6582
Epoch [200], val_loss: 7547.6587
Epoch [220], val_loss: 7547.6597
Epoch [240], val_loss: 7547.6562
Epoch [260], val_loss: 7547.6470
Epoch [280], val_loss: 7547.6504
Epoch [300], val_loss: 7547.6504
Epoch [320], val_loss: 7547.6470
Epoch [340], val_loss: 7547.6484
Epoch [360], val_loss: 7547.6538
Epoch [380], val_loss: 7547.6440
Epoch [400], val_loss: 7547.6470
Epoch [420], val_loss: 7547.6445
Epoch [440], val_loss: 7547.6421
Epoch [460], val_loss: 7547.6548
Epoch [480], val_loss: 7547.6655
Epoch [500], val_loss: 7547.6641
epochs = 250
lr = 1e-5
history5 = fit(epochs, lr, model, train_loader, val_loader)
Epoch [20], val_loss: 7547.6646
Epoch [40], val_loss: 7547.6641
Epoch [60], val_loss: 7547.6626
Epoch [80], val_loss: 7547.6621
Epoch [100], val_loss: 7547.6655
Epoch [120], val_loss: 7547.6626
Epoch [140], val_loss: 7547.6626
Epoch [160], val_loss: 7547.6626
Epoch [180], val_loss: 7547.6655
Epoch [200], val_loss: 7547.6655
Epoch [220], val_loss: 7547.6660
Epoch [240], val_loss: 7547.6641
Epoch [250], val_loss: 7547.6641
Observation: a larger number of epochs and a well-chosen learning rate bring the loss down quickly, but the gains shrink with each successive run; by the later runs the validation loss has essentially plateaued around 7547.
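To see that plateau directly, we can plot the validation losses recorded across all five runs (a sketch using the history lists above and the matplotlib import from the start of the notebook):
losses = [r['val_loss'] for h in [history1, history2, history3, history4, history5] for r in h]
plt.plot(losses)
plt.xlabel('epoch (cumulative across the five runs)')
plt.ylabel('val_loss')
plt.title('Validation loss vs. epochs')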
Q: What is the final validation loss of your model?
val_loss = history5[-1]['val_loss']
Let's log the final validation loss to Jovian and commit the notebook.
jovian.log_metrics(val_loss=val_loss)
[jovian] Metrics logged.
jovian.commit(project=project_name, environment=None)
[jovian] Attempting to save notebook..
[jovian] Updating notebook "hiteshkumar-1mv17cs042/02-insurance-linear-regression" on https://jovian.ml/
[jovian] Uploading notebook..
[jovian] Attaching records (metrics, hyperparameters, dataset etc.)
[jovian] Committed successfully! https://jovian.ml/hiteshkumar-1mv17cs042/02-insurance-linear-regression
Now scroll back up, re-initialize the model, and try a different set of values for batch size, number of epochs, learning rate etc. Commit each experiment and use the "Compare" and "View Diff" options on Jovian to compare the different results.
Q: Complete the following function definition to make predictions on a single input
def predict_single(input, target, model):
    inputs = input.unsqueeze(0)           # add a batch dimension of size 1
    predictions = model(inputs)           # fill this
    prediction = predictions[0].detach()
    print("Input:", input)
    print("Target:", target)
    print("Prediction:", prediction)
input, target = val_ds[0]
predict_single(input, target, model)
Input: tensor([18.0000, 1.0000, 40.0785, 0.0000, 1.0000])
Target: tensor([42117.0469])
Prediction: tensor([1815.6129])
input, target = val_ds[10]
predict_single(input, target, model)
Input: tensor([32.0000, 0.0000, 33.1170, 1.0000, 0.0000])
Target: tensor([5972.3208])
Prediction: tensor([6553.7280])
input, target = val_ds[23]
predict_single(input, target, model)
Input: tensor([20.0000, 0.0000, 28.1820, 1.0000, 1.0000])
Target: tensor([19818.9102])
Prediction: tensor([4062.4272])
Are you happy with your model's predictions? Try to improve them further.
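Spot checks like the three above are easy to cherry-pick; a quick aggregate check is the mean absolute error over the whole validation set (a sketch; averaging per-batch means is approximate when the last batch is smaller):
with torch.no_grad():
    batch_maes = [torch.abs(model(xb) - yb).mean() for xb, yb in val_loader]
    print('Validation MAE:', torch.stack(batch_maes).mean().item())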
While this last step is optional for the submission of your assignment, we highly recommend that you do it. Try to clean up & replicate this notebook (or this one, or this one) for a different linear regression or logistic regression problem. This will help solidify your understanding, and give you a chance to differentiate the generic patterns in machine learning from problem-specific details.
Here are some sources to find good datasets:
We also recommend that you write a blog about your approach to the problem. Here is a suggested structure for your post (feel free to experiment with it):
As with the previous assignment, you can embed Jupyter notebook cells & outputs from Jovian into your blog.
Don't forget to share your work on the forum: https://jovian.ml/forum/t/share-your-work-here-assignment-2/4931
jovian.commit(project=project_name, environment=None)
jovian.commit(project=project_name, environment=None) # try again, kaggle fails sometimes
[jovian] Attempting to save notebook..
[jovian] Updating notebook "hiteshkumar-1mv17cs042/02-insurance-linear-regression" on https://jovian.ml/
[jovian] Uploading notebook..
[jovian] Attaching records (metrics, hyperparameters, dataset etc.)
[jovian] Committed successfully! https://jovian.ml/hiteshkumar-1mv17cs042/02-insurance-linear-regression
[jovian] Attempting to save notebook..