PyTorch: Examining the Titanic Sinking with Ridge Regression

Examining the Data of More Than 800 Titanic Passengers and Training a Machine Learning Model on It

In this notebook we use a dataset containing data about passengers of the Titanic. On this data we train what this notebook calls a Ridge Regression model, which here simply means a logistic regression model with L2 regularization. The model predicts whether a person survived the sinking based on their passenger class, sex, the number of their siblings/spouses aboard, the number of their parents/children aboard, and the fare they paid.
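To make the idea of L2 regularization concrete, here is a minimal sketch of a logistic regression loss with an added penalty on the squared weights. It is not the notebook's actual model: the random toy data, the feature count of 5, the helper name `regularized_loss`, and the strength `l2_lambda` are all assumptions chosen for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy stand-in data (made up): 8 samples, 5 numeric features, binary labels.
inputs = torch.randn(8, 5)
labels = torch.tensor([0., 1., 1., 0., 1., 0., 0., 1.]).unsqueeze(1)

# Logistic regression is a single linear layer; the sigmoid is folded
# into the loss function below.
model = nn.Linear(5, 1)
l2_lambda = 0.01  # hypothetical regularization strength


def regularized_loss(model, inputs, labels, l2_lambda):
    """Binary cross-entropy plus an L2 penalty on the weights (not the bias)."""
    logits = model(inputs)
    bce = F.binary_cross_entropy_with_logits(logits, labels)
    l2_penalty = (model.weight ** 2).sum()
    return bce + l2_lambda * l2_penalty


optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
initial_loss = regularized_loss(model, inputs, labels, l2_lambda).item()

# Plain full-batch gradient descent on the regularized loss.
for _ in range(100):
    optimizer.zero_grad()
    loss = regularized_loss(model, inputs, labels, l2_lambda)
    loss.backward()
    optimizer.step()

final_loss = regularized_loss(model, inputs, labels, l2_lambda).item()
print(initial_loss, final_loss)
```

The penalty term pulls the weights toward zero, which discourages the model from relying too heavily on any single feature. A similar effect can usually be obtained by passing `weight_decay` to the optimizer instead of adding the term by hand.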

First we import everything we need for loading and plotting the data and for building a model that makes predictions on it.

import pandas as pd
import numpy as np
import torch
import jovian
from torch.utils.data import DataLoader, TensorDataset, random_split
import matplotlib.pyplot as plt
import torch.nn as nn
import torch.nn.functional as F
import seaborn as sns

Data Exploration

Here we can see what the data actually looks like. The first column indicates whether the person survived, where 1 stands for survival and 0 for death. The remaining columns are candidate inputs for predicting survival. One of them will be dropped, however, as it does not hold information needed to predict survival. You can also see below that the dataset holds 887 passengers and 8 columns in total: 6 of them are used as input values and 1 (the Survived column) is the corresponding label.

dataframe = pd.read_csv("./data/titanic.csv")
dataframe.head(10)
len(dataframe.index)
887
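As a rough illustration of how the label column can be separated from the input columns, the sketch below uses a small made-up frame instead of the real file. Only the `Survived` column name comes from the text above. The other column names, the rows, the hypothetical `Name` column that gets dropped, and the helper `split_inputs_and_label` are assumptions for illustration.

```python
import pandas as pd


def split_inputs_and_label(df, label_col, drop_cols=()):
    """Return (inputs, labels) after removing the label and unused columns."""
    inputs = df.drop(columns=[label_col, *drop_cols])
    labels = df[label_col]
    return inputs, labels


# Toy stand-in for the real dataset; every value here is invented.
toy = pd.DataFrame({
    "Survived": [0, 1, 1, 0],
    "Pclass": [3, 1, 3, 2],
    "Name": ["A", "B", "C", "D"],  # hypothetical column with no predictive use
    "Sex": ["male", "female", "female", "male"],
    "Fare": [7.25, 71.28, 7.92, 13.0],
})

inputs, labels = split_inputs_and_label(toy, "Survived", drop_cols=["Name"])

print(toy.shape)             # (rows, columns) of the full frame
print(list(inputs.columns))  # columns left over as model inputs
print(labels.value_counts()) # how many passengers fall into each class
```

Applied to the real `dataframe`, the same pattern should leave 6 input columns and 1 label column out of the 8, provided the unused column is passed in `drop_cols`.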