Insurance cost prediction using linear regression
In this assignment we're going to use information such as a person's age, sex, BMI, number of children, and smoking habit to predict their yearly medical bills. This kind of model is useful for insurance companies when setting a person's yearly insurance premium. The dataset for this problem is taken from: https://www.kaggle.com/mirichoi0218/insurance
We will create a model with the following steps:
- Download and explore the dataset
- Prepare the dataset for training
- Create a linear regression model
- Train the model to fit the data
- Make predictions using the trained model
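The steps above can be sketched end to end as follows. This is only a minimal illustration, not the assignment's solution: it assumes a small synthetic dataset with three numeric features in place of the insurance data, and the split sizes, batch size, learning rate, and epoch count are arbitrary choices made for the sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset, random_split

torch.manual_seed(0)

# 1. "Download and explore": a synthetic stand-in (assumption) with 200 rows
#    and 3 numeric features, generated from a known linear rule plus noise.
inputs = torch.rand(200, 3)
true_w = torch.tensor([[2.0], [-1.0], [0.5]])
targets = inputs @ true_w + 0.3 + 0.01 * torch.randn(200, 1)

# 2. Prepare the dataset: wrap tensors, split, and create batch loaders.
dataset = TensorDataset(inputs, targets)
train_ds, val_ds = random_split(dataset, [160, 40])
train_loader = DataLoader(train_ds, batch_size=32, shuffle=True)
val_loader = DataLoader(val_ds, batch_size=40)

# 3. Create a linear regression model: 3 inputs -> 1 output.
model = nn.Linear(3, 1)

def evaluate(model, loader):
    """Average mean-squared-error loss over all batches in the loader."""
    with torch.no_grad():
        losses = [F.mse_loss(model(xb), yb) for xb, yb in loader]
    return torch.stack(losses).mean().item()

initial_loss = evaluate(model, val_loader)

# 4. Train the model with stochastic gradient descent.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
for epoch in range(200):
    for xb, yb in train_loader:
        loss = F.mse_loss(model(xb), yb)
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

final_loss = evaluate(model, val_loader)

# 5. Make a prediction for a single new row of features.
sample = torch.tensor([[0.5, 0.5, 0.5]])
prediction = model(sample).item()
print(f"validation loss: {initial_loss:.4f} -> {final_loss:.4f}")
print(f"prediction for sample: {prediction:.3f}")
```

In the assignment itself, the synthetic tensors would be replaced by tensors built from the insurance dataframe, and the input size of the model would match the number of input columns.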
This assignment builds upon the concepts from the first 2 lectures. It will help to review these Jupyter notebooks:
- PyTorch basics: https://jovian.ml/aakashns/01-pytorch-basics
- Linear Regression: https://jovian.ml/aakashns/02-linear-regression
- Logistic Regression: https://jovian.ml/aakashns/03-logistic-regression
- Linear regression (minimal): https://jovian.ml/aakashns/housing-linear-minimal
- Logistic regression (minimal): https://jovian.ml/aakashns/mnist-logistic-minimal
As you go through this notebook, you will find a ??? in certain places. Your job is to replace each ??? with appropriate code or values, so that the notebook runs properly end-to-end. In some cases, you'll be required to choose hyperparameters (learning rate, batch size, etc.). Experiment with the hyperparameters to get the lowest loss.
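To make the idea of experimenting with hyperparameters concrete, here is a small sketch in plain Python (no PyTorch, so it runs anywhere). It fits a one-variable linear model to made-up data with several learning rates and compares the final loss. The data, the `train` helper, and the candidate learning rates are all assumptions chosen for illustration, not values from the assignment.

```python
# Toy data (made up for illustration): y = 3x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [3.0 * x + 1.0 for x in xs]

def train(lr, epochs=100):
    """Fit y = w*x + b with full-batch gradient descent; return the final MSE."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return sum(((w * x + b) - y) ** 2 for x, y in zip(xs, ys)) / n

# Try a few candidate learning rates and keep the one with the lowest loss.
results = {lr: train(lr) for lr in [0.001, 0.01, 0.05, 0.2]}
best_lr = min(results, key=results.get)

for lr, loss in results.items():
    print(f"lr={lr}: final loss={loss:.3g}")
print("best learning rate:", best_lr)
```

On this toy problem, a very small learning rate converges slowly, while the largest one overshoots and the loss grows instead of shrinking. The same try-and-compare loop applies to batch size and number of epochs.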
# Uncomment and run the commands below if imports fail
!conda install numpy pytorch torchvision cpuonly -c pytorch -y
!pip install matplotlib --upgrade --quiet
!pip install jovian --upgrade --quiet
Collecting package metadata (current_repodata.json): done
Solving environment: done
==> WARNING: A newer version of conda exists. <==
current version: 4.8.2
latest version: 4.8.3
Please update conda by running
$ conda update -n base conda
## Package Plan ##
environment location: /srv/conda/envs/notebook
added / updated specs:
- cpuonly
- numpy
- pytorch
- torchvision
The following packages will be downloaded:
package | build
---------------------------|-----------------
blas-2.15 | mkl 10 KB conda-forge
ca-certificates-2020.4.5.2 | hecda079_0 147 KB conda-forge
certifi-2020.4.5.2 | py37hc8dfbb8_0 152 KB conda-forge
cpuonly-1.0 | 0 2 KB pytorch
freetype-2.10.2 | he06d7ca_0 905 KB conda-forge
intel-openmp-2020.1 | 217 780 KB defaults
jpeg-9d | h516909a_0 266 KB conda-forge
libblas-3.8.0 | 15_mkl 10 KB conda-forge
libcblas-3.8.0 | 15_mkl 10 KB conda-forge
libgfortran-ng-7.5.0 | hdf63c60_6 1.7 MB conda-forge
liblapack-3.8.0 | 15_mkl 10 KB conda-forge
liblapacke-3.8.0 | 15_mkl 10 KB conda-forge
libpng-1.6.37 | hed695b0_1 308 KB conda-forge
libtiff-4.1.0 | hc7e4089_6 668 KB conda-forge
libwebp-base-1.1.0 | h516909a_3 845 KB conda-forge
lz4-c-1.9.2 | he1b5a44_1 226 KB conda-forge
mkl-2020.1 | 217 129.0 MB defaults
ninja-1.10.0 | hc9558a2_0 1.9 MB conda-forge
numpy-1.18.5 | py37h8960a57_0 5.1 MB conda-forge
olefile-0.46 | py_0 31 KB conda-forge
pillow-7.1.2 | py37h718be6c_0 658 KB conda-forge
pytorch-1.5.0 | py3.7_cpu_0 90.5 MB pytorch
torchvision-0.6.0 | py37_cpu 11.0 MB pytorch
zstd-1.4.4 | h6597ccf_3 991 KB conda-forge
------------------------------------------------------------
Total: 245.0 MB
The following NEW packages will be INSTALLED:
blas conda-forge/linux-64::blas-2.15-mkl
cpuonly pytorch/noarch::cpuonly-1.0-0
freetype conda-forge/linux-64::freetype-2.10.2-he06d7ca_0
intel-openmp pkgs/main/linux-64::intel-openmp-2020.1-217
jpeg conda-forge/linux-64::jpeg-9d-h516909a_0
libblas conda-forge/linux-64::libblas-3.8.0-15_mkl
libcblas conda-forge/linux-64::libcblas-3.8.0-15_mkl
libgfortran-ng conda-forge/linux-64::libgfortran-ng-7.5.0-hdf63c60_6
liblapack conda-forge/linux-64::liblapack-3.8.0-15_mkl
liblapacke conda-forge/linux-64::liblapacke-3.8.0-15_mkl
libpng conda-forge/linux-64::libpng-1.6.37-hed695b0_1
libtiff conda-forge/linux-64::libtiff-4.1.0-hc7e4089_6
libwebp-base conda-forge/linux-64::libwebp-base-1.1.0-h516909a_3
lz4-c conda-forge/linux-64::lz4-c-1.9.2-he1b5a44_1
mkl pkgs/main/linux-64::mkl-2020.1-217
ninja conda-forge/linux-64::ninja-1.10.0-hc9558a2_0
numpy conda-forge/linux-64::numpy-1.18.5-py37h8960a57_0
olefile conda-forge/noarch::olefile-0.46-py_0
pillow conda-forge/linux-64::pillow-7.1.2-py37h718be6c_0
pytorch pytorch/linux-64::pytorch-1.5.0-py3.7_cpu_0
torchvision pytorch/linux-64::torchvision-0.6.0-py37_cpu
zstd conda-forge/linux-64::zstd-1.4.4-h6597ccf_3
The following packages will be UPDATED:
ca-certificates 2020.4.5.1-hecc5488_0 --> 2020.4.5.2-hecda079_0
certifi 2020.4.5.1-py37hc8dfbb8_0 --> 2020.4.5.2-py37hc8dfbb8_0
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
!pip install pandas --quiet
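Before re-running the install commands, it can help to check which packages are actually missing. The sketch below uses only the standard library; the `required` list is an assumption based on the imports used in this notebook, and `missing_packages` is a hypothetical helper name.

```python
import importlib.util

# Top-level packages this notebook imports (assumed from the import cell)
required = ["torch", "torchvision", "pandas", "matplotlib", "jovian"]

def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_packages(required)
if missing:
    print("Install before continuing:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

Only the packages reported as missing need to be installed, which avoids a full reinstall of the environment.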
import torch
import jovian
import torchvision
import torch.nn as nn
import pandas as pd
import matplotlib.pyplot as plt
import torch.nn.functional as F
from torchvision.datasets.utils import download_url
from torch.utils.data import DataLoader, TensorDataset, random_split
project_name = '02-insurance-linear-regression'  # will be used by jovian.commit