Learn practical skills, build real-world projects, and advance your career

Customer Churn Prediction

# download the data
!wget https://github.com/x4nth055/pythoncode-tutorials/raw/master/machine-learning/customer-churn-detection/Churn_Modelling.csv
--2022-05-03 18:52:40-- https://github.com/x4nth055/pythoncode-tutorials/raw/master/machine-learning/customer-churn-detection/Churn_Modelling.csv Resolving github.com (github.com)... 140.82.112.4 Connecting to github.com (github.com)|140.82.112.4|:443... connected. HTTP request sent, awaiting response... 302 Found Location: https://raw.githubusercontent.com/x4nth055/pythoncode-tutorials/master/machine-learning/customer-churn-detection/Churn_Modelling.csv [following] --2022-05-03 18:52:40-- https://raw.githubusercontent.com/x4nth055/pythoncode-tutorials/master/machine-learning/customer-churn-detection/Churn_Modelling.csv Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ... Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 674857 (659K) [text/plain] Saving to: ‘Churn_Modelling.csv.1’ Churn_Modelling.csv 100%[===================>] 659.04K --.-KB/s in 0.04s 2022-05-03 18:52:41 (14.7 MB/s) - ‘Churn_Modelling.csv.1’ saved [674857/674857]

The Dataset

In the customer churn modeling dataset, we have 10000 rows (each representing a unique customer) with 15 columns: 14 features with one target feature (Exited). The data is composed of both numerical and categorical features:

The target column:

Exited — Whether the customer churned or not.

Numeric Features:

  • CustomerId: A unique ID of the customer.
  • CreditScore: The credit score of the customer,
  • Age: The age of the customer,
  • Tenure: The number of months the client has been with the firm.
  • Balance: Balance remaining in the customer account,
  • NumOfProducts: The number of products sold by the customer.
  • EstimatedSalary: The estimated salary of the customer.

Categorical Features:

  • Surname: The surname of the customer.
  • Geography: The country of the customer.
  • Gender: M/F
  • HasCrCard: Whether the customer has a credit card or not.
  • IsActiveMember: Whether the customer is active or not.
# install dependencies
!pip install ipython==7.22.0
!pip install joblib==1.0.1
!pip install lightgbm==3.3.1
!pip install matplotlib
!pip install numpy
!pip install pandas
!pip install scikit_learn==0.24.1
!pip install seaborn
Requirement already satisfied: ipython==7.22.0 in /usr/local/lib/python3.7/dist-packages (7.22.0) Requirement already satisfied: pexpect>4.3 in /usr/local/lib/python3.7/dist-packages (from ipython==7.22.0) (4.8.0) Requirement already satisfied: backcall in /usr/local/lib/python3.7/dist-packages (from ipython==7.22.0) (0.2.0) Requirement already satisfied: jedi>=0.16 in /usr/local/lib/python3.7/dist-packages (from ipython==7.22.0) (0.18.1) Requirement already satisfied: setuptools>=18.5 in /usr/local/lib/python3.7/dist-packages (from ipython==7.22.0) (57.4.0) Requirement already satisfied: prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0 in /usr/local/lib/python3.7/dist-packages (from ipython==7.22.0) (3.0.29) Requirement already satisfied: decorator in /usr/local/lib/python3.7/dist-packages (from ipython==7.22.0) (4.4.2) Requirement already satisfied: traitlets>=4.2 in /usr/local/lib/python3.7/dist-packages (from ipython==7.22.0) (5.1.1) Requirement already satisfied: pygments in /usr/local/lib/python3.7/dist-packages (from ipython==7.22.0) (2.6.1) Requirement already satisfied: pickleshare in /usr/local/lib/python3.7/dist-packages (from ipython==7.22.0) (0.7.5) Requirement already satisfied: parso<0.9.0,>=0.8.0 in /usr/local/lib/python3.7/dist-packages (from jedi>=0.16->ipython==7.22.0) (0.8.3) Requirement already satisfied: ptyprocess>=0.5 in /usr/local/lib/python3.7/dist-packages (from pexpect>4.3->ipython==7.22.0) (0.7.0) Requirement already satisfied: wcwidth in /usr/local/lib/python3.7/dist-packages (from prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0->ipython==7.22.0) (0.2.5) Requirement already satisfied: joblib==1.0.1 in /usr/local/lib/python3.7/dist-packages (1.0.1) Requirement already satisfied: lightgbm==3.3.1 in /usr/local/lib/python3.7/dist-packages (3.3.1) Requirement already satisfied: scikit-learn!=0.22.0 in /usr/local/lib/python3.7/dist-packages (from lightgbm==3.3.1) (0.24.1) Requirement already satisfied: scipy in /usr/local/lib/python3.7/dist-packages (from lightgbm==3.3.1) (1.4.1) Requirement already satisfied: wheel in /usr/local/lib/python3.7/dist-packages (from lightgbm==3.3.1) (0.37.1) Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from lightgbm==3.3.1) (1.21.6) Requirement already satisfied: joblib>=0.11 in /usr/local/lib/python3.7/dist-packages (from scikit-learn!=0.22.0->lightgbm==3.3.1) (1.0.1) Requirement already satisfied: threadpoolctl>=2.0.0 in /usr/local/lib/python3.7/dist-packages (from scikit-learn!=0.22.0->lightgbm==3.3.1) (3.1.0) Requirement already satisfied: matplotlib in /usr/local/lib/python3.7/dist-packages (3.2.2) Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib) (2.8.2) Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.7/dist-packages (from matplotlib) (0.11.0) Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib) (1.4.2) Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib) (3.0.8) Requirement already satisfied: numpy>=1.11 in /usr/local/lib/python3.7/dist-packages (from matplotlib) (1.21.6) Requirement already satisfied: typing-extensions in /usr/local/lib/python3.7/dist-packages (from kiwisolver>=1.0.1->matplotlib) (4.2.0) Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.1->matplotlib) (1.15.0) Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (1.21.6) Requirement already satisfied: pandas in /usr/local/lib/python3.7/dist-packages (1.3.5) Requirement already satisfied: numpy>=1.17.3 in /usr/local/lib/python3.7/dist-packages (from pandas) (1.21.6) Requirement already satisfied: pytz>=2017.3 in /usr/local/lib/python3.7/dist-packages (from pandas) (2022.1) Requirement already satisfied: python-dateutil>=2.7.3 in /usr/local/lib/python3.7/dist-packages (from pandas) (2.8.2) Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.7.3->pandas) (1.15.0) Requirement already satisfied: scikit_learn==0.24.1 in /usr/local/lib/python3.7/dist-packages (0.24.1) Requirement already satisfied: threadpoolctl>=2.0.0 in /usr/local/lib/python3.7/dist-packages (from scikit_learn==0.24.1) (3.1.0) Requirement already satisfied: joblib>=0.11 in /usr/local/lib/python3.7/dist-packages (from scikit_learn==0.24.1) (1.0.1) Requirement already satisfied: scipy>=0.19.1 in /usr/local/lib/python3.7/dist-packages (from scikit_learn==0.24.1) (1.4.1) Requirement already satisfied: numpy>=1.13.3 in /usr/local/lib/python3.7/dist-packages (from scikit_learn==0.24.1) (1.21.6) Requirement already satisfied: seaborn in /usr/local/lib/python3.7/dist-packages (0.11.2) Requirement already satisfied: matplotlib>=2.2 in /usr/local/lib/python3.7/dist-packages (from seaborn) (3.2.2) Requirement already satisfied: scipy>=1.0 in /usr/local/lib/python3.7/dist-packages (from seaborn) (1.4.1) Requirement already satisfied: pandas>=0.23 in /usr/local/lib/python3.7/dist-packages (from seaborn) (1.3.5) Requirement already satisfied: numpy>=1.15 in /usr/local/lib/python3.7/dist-packages (from seaborn) (1.21.6) Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=2.2->seaborn) (1.4.2) Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=2.2->seaborn) (0.11.0) Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=2.2->seaborn) (3.0.8) Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=2.2->seaborn) (2.8.2) Requirement already satisfied: typing-extensions in /usr/local/lib/python3.7/dist-packages (from kiwisolver>=1.0.1->matplotlib>=2.2->seaborn) (4.2.0) Requirement already satisfied: pytz>=2017.3 in /usr/local/lib/python3.7/dist-packages (from pandas>=0.23->seaborn) (2022.1) Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.1->matplotlib>=2.2->seaborn) (1.15.0)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeClassifier, export_graphviz
from sklearn.pipeline import Pipeline
from sklearn.model_selection import train_test_split
from lightgbm import LGBMClassifier
from sklearn.metrics import roc_auc_score, recall_score, confusion_matrix, classification_report 
import subprocess
import joblib
# Get multiple outputs in the same cell
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
# Ignore all warnings
import warnings
warnings.filterwarnings('ignore')
warnings.filterwarnings(action='ignore', category=DeprecationWarning)
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)