
Insurance cost prediction using linear regression

In this assignment we're going to use information such as a person's age, sex, BMI, number of children and smoking habit to predict their yearly medical bills. This kind of model is useful for insurance companies when determining a person's yearly insurance premium. The dataset for this problem is taken from: https://www.kaggle.com/mirichoi0218/insurance
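Because sex and smoking habit are text values, they have to be turned into numbers before a linear model can use them. The snippet below is a minimal sketch of one way to do that with pandas. The rows are made up for illustration, and the column names (`age`, `sex`, `bmi`, `children`, `smoker`, `charges`) are assumed to match the Kaggle CSV.

```python
import pandas as pd

# Made-up rows for illustration only; column names are assumed to match the Kaggle CSV.
df = pd.DataFrame({
    "age": [19, 45, 33],
    "sex": ["female", "male", "male"],
    "bmi": [27.9, 30.1, 22.7],
    "children": [0, 2, 1],
    "smoker": ["yes", "no", "no"],
    "charges": [16884.92, 8240.59, 4449.46],
})

input_cols = ["age", "sex", "bmi", "children", "smoker"]
categorical_cols = ["sex", "smoker"]

# Replace each text category with an integer code (categories are sorted alphabetically).
encoded = df.copy()
for col in categorical_cols:
    encoded[col] = encoded[col].astype("category").cat.codes

# Convert to float32 arrays, the dtype PyTorch layers use by default.
inputs_array = encoded[input_cols].to_numpy(dtype="float32")
targets_array = encoded[["charges"]].to_numpy(dtype="float32")

print(inputs_array.shape, targets_array.shape)
```

Integer codes are the simplest option for two-valued columns like these. A column with more than two unordered categories would usually be one-hot encoded instead.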

We will create a model with the following steps:

  1. Download and explore the dataset
  2. Prepare the dataset for training
  3. Create a linear regression model
  4. Train the model to fit the data
  5. Make predictions using the trained model
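The five steps above can be sketched end to end as follows. This is only an illustration under stated assumptions: it uses randomly generated data in place of the insurance dataset, and the helper names (`evaluate`, `fit`), the split sizes, the batch size and the learning rate are arbitrary choices, not the values expected in the assignment.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset, random_split

torch.manual_seed(0)

# Steps 1-2: synthetic stand-in for the prepared dataset (NOT the Kaggle data).
num_rows, num_features = 200, 5
inputs = torch.randn(num_rows, num_features)
true_w = torch.tensor([[3.0], [-2.0], [1.5], [0.5], [4.0]])
targets = inputs @ true_w + 10.0 + 0.1 * torch.randn(num_rows, 1)

dataset = TensorDataset(inputs, targets)
train_ds, val_ds = random_split(dataset, [160, 40])
train_loader = DataLoader(train_ds, batch_size=32, shuffle=True)
val_loader = DataLoader(val_ds, batch_size=64)

# Step 3: a linear regression model is a single linear layer.
model = nn.Linear(num_features, 1)

def evaluate(model, loader):
    """Average mean-squared error over the batches of a loader."""
    with torch.no_grad():
        losses = [F.mse_loss(model(xb), yb) for xb, yb in loader]
    return torch.stack(losses).mean().item()

# Step 4: fit the model with stochastic gradient descent.
def fit(epochs, lr, model, train_loader):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        for xb, yb in train_loader:
            loss = F.mse_loss(model(xb), yb)
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()

loss_before = evaluate(model, val_loader)
fit(epochs=50, lr=0.05, model=model, train_loader=train_loader)
loss_after = evaluate(model, val_loader)

# Step 5: predict for a single validation row.
sample_input, sample_target = val_ds[0]
with torch.no_grad():
    prediction = model(sample_input.unsqueeze(0))

print(f"validation loss: {loss_before:.3f} -> {loss_after:.3f}")
print("target:", sample_target.item(), "prediction:", prediction.item())
```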

This assignment builds upon the concepts from the first 2 lectures. It will help to review the Jupyter notebooks from those lectures.

As you go through this notebook, you will find a ??? in certain places. Your job is to replace the ??? with appropriate code or values, to ensure that the notebook runs properly end-to-end. In some cases, you'll be required to choose some hyperparameters (learning rate, batch size etc.). Try to experiment with the hyperparameters to get the lowest loss.
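To see why experimenting with hyperparameters matters, here is a small framework-free sketch of gradient descent on toy data, run with several learning rates. The data, the candidate learning rates and the function names are made up for illustration; the notebook itself uses PyTorch, and suitable values for the insurance data will differ.

```python
# Toy data following y = 2x + 1 (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

def mse(w, b):
    """Mean squared error of the line y = w*x + b on the toy data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train(lr, epochs=200):
    """Run full-batch gradient descent from w = b = 0 and return the final loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return mse(w, b)

# Try a range of learning rates and keep the one with the lowest final loss.
results = {lr: train(lr) for lr in (0.001, 0.01, 0.05, 0.2)}
best_lr = min(results, key=results.get)

for lr, loss in results.items():
    print(f"lr={lr}: final loss {loss:.6g}")
print("best learning rate:", best_lr)
```

On this toy problem the smallest learning rates converge slowly and the largest one diverges, so the best result comes from a value in between. The same kind of sweep can be applied to batch size and number of epochs.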

# Uncomment and run the commands below if imports fail
!conda install numpy pytorch torchvision cpuonly -c pytorch -y
!pip install matplotlib --upgrade --quiet
!pip install jovian --upgrade --quiet
Output (abridged): conda installed pytorch 1.5.1 (CPU build), torchvision 0.6.1 and numpy 1.18.5 along with their dependencies (194.5 MB total download). pip warned that it was being invoked by an old script wrapper and suggested invoking Python with '-m pip' instead.
!pip install pandas
!pip install seaborn
Output (abridged): pandas 1.0.5 was already installed; pip then installed seaborn 0.10.1 together with scipy 1.4.1.
import torch
import jovian
import torchvision
import torch.nn as nn
import pandas as pd
import matplotlib.pyplot as plt
import torch.nn.functional as F
from torchvision.datasets.utils import download_url
from torch.utils.data import DataLoader, TensorDataset, random_split
project_name='02-insurance-linear-regression' # will be used by jovian.commit