Learn practical skills, build real-world projects, and advance your career
Created 2 years ago
Customer Churn Prediction
# download the data
!wget https://github.com/x4nth055/pythoncode-tutorials/raw/master/machine-learning/customer-churn-detection/Churn_Modelling.csv
--2022-05-03 18:52:40-- https://github.com/x4nth055/pythoncode-tutorials/raw/master/machine-learning/customer-churn-detection/Churn_Modelling.csv
Resolving github.com (github.com)... 140.82.112.4
Connecting to github.com (github.com)|140.82.112.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/x4nth055/pythoncode-tutorials/master/machine-learning/customer-churn-detection/Churn_Modelling.csv [following]
--2022-05-03 18:52:40-- https://raw.githubusercontent.com/x4nth055/pythoncode-tutorials/master/machine-learning/customer-churn-detection/Churn_Modelling.csv
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 674857 (659K) [text/plain]
Saving to: ‘Churn_Modelling.csv.1’
Churn_Modelling.csv 100%[===================>] 659.04K --.-KB/s in 0.04s
2022-05-03 18:52:41 (14.7 MB/s) - ‘Churn_Modelling.csv.1’ saved [674857/674857]
The Dataset
In the customer churn modeling dataset, we have 10000 rows (each representing a unique customer) with 15 columns: 14 features with one target feature (Exited). The data is composed of both numerical and categorical features:
The target column:
Exited — Whether the customer churned or not.
Numeric Features:
- CustomerId: A unique ID of the customer.
- CreditScore: The credit score of the customer,
- Age: The age of the customer,
- Tenure: The number of months the client has been with the firm.
- Balance: Balance remaining in the customer account,
- NumOfProducts: The number of products sold by the customer.
- EstimatedSalary: The estimated salary of the customer.
Categorical Features:
- Surname: The surname of the customer.
- Geography: The country of the customer.
- Gender: M/F
- HasCrCard: Whether the customer has a credit card or not.
- IsActiveMember: Whether the customer is active or not.
# install dependencies
!pip install ipython==7.22.0
!pip install joblib==1.0.1
!pip install lightgbm==3.3.1
!pip install matplotlib
!pip install numpy
!pip install pandas
!pip install scikit_learn==0.24.1
!pip install seaborn
Requirement already satisfied: ipython==7.22.0 in /usr/local/lib/python3.7/dist-packages (7.22.0)
Requirement already satisfied: pexpect>4.3 in /usr/local/lib/python3.7/dist-packages (from ipython==7.22.0) (4.8.0)
Requirement already satisfied: backcall in /usr/local/lib/python3.7/dist-packages (from ipython==7.22.0) (0.2.0)
Requirement already satisfied: jedi>=0.16 in /usr/local/lib/python3.7/dist-packages (from ipython==7.22.0) (0.18.1)
Requirement already satisfied: setuptools>=18.5 in /usr/local/lib/python3.7/dist-packages (from ipython==7.22.0) (57.4.0)
Requirement already satisfied: prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0 in /usr/local/lib/python3.7/dist-packages (from ipython==7.22.0) (3.0.29)
Requirement already satisfied: decorator in /usr/local/lib/python3.7/dist-packages (from ipython==7.22.0) (4.4.2)
Requirement already satisfied: traitlets>=4.2 in /usr/local/lib/python3.7/dist-packages (from ipython==7.22.0) (5.1.1)
Requirement already satisfied: pygments in /usr/local/lib/python3.7/dist-packages (from ipython==7.22.0) (2.6.1)
Requirement already satisfied: pickleshare in /usr/local/lib/python3.7/dist-packages (from ipython==7.22.0) (0.7.5)
Requirement already satisfied: parso<0.9.0,>=0.8.0 in /usr/local/lib/python3.7/dist-packages (from jedi>=0.16->ipython==7.22.0) (0.8.3)
Requirement already satisfied: ptyprocess>=0.5 in /usr/local/lib/python3.7/dist-packages (from pexpect>4.3->ipython==7.22.0) (0.7.0)
Requirement already satisfied: wcwidth in /usr/local/lib/python3.7/dist-packages (from prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0->ipython==7.22.0) (0.2.5)
Requirement already satisfied: joblib==1.0.1 in /usr/local/lib/python3.7/dist-packages (1.0.1)
Requirement already satisfied: lightgbm==3.3.1 in /usr/local/lib/python3.7/dist-packages (3.3.1)
Requirement already satisfied: scikit-learn!=0.22.0 in /usr/local/lib/python3.7/dist-packages (from lightgbm==3.3.1) (0.24.1)
Requirement already satisfied: scipy in /usr/local/lib/python3.7/dist-packages (from lightgbm==3.3.1) (1.4.1)
Requirement already satisfied: wheel in /usr/local/lib/python3.7/dist-packages (from lightgbm==3.3.1) (0.37.1)
Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from lightgbm==3.3.1) (1.21.6)
Requirement already satisfied: joblib>=0.11 in /usr/local/lib/python3.7/dist-packages (from scikit-learn!=0.22.0->lightgbm==3.3.1) (1.0.1)
Requirement already satisfied: threadpoolctl>=2.0.0 in /usr/local/lib/python3.7/dist-packages (from scikit-learn!=0.22.0->lightgbm==3.3.1) (3.1.0)
Requirement already satisfied: matplotlib in /usr/local/lib/python3.7/dist-packages (3.2.2)
Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib) (2.8.2)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.7/dist-packages (from matplotlib) (0.11.0)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib) (1.4.2)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib) (3.0.8)
Requirement already satisfied: numpy>=1.11 in /usr/local/lib/python3.7/dist-packages (from matplotlib) (1.21.6)
Requirement already satisfied: typing-extensions in /usr/local/lib/python3.7/dist-packages (from kiwisolver>=1.0.1->matplotlib) (4.2.0)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.1->matplotlib) (1.15.0)
Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (1.21.6)
Requirement already satisfied: pandas in /usr/local/lib/python3.7/dist-packages (1.3.5)
Requirement already satisfied: numpy>=1.17.3 in /usr/local/lib/python3.7/dist-packages (from pandas) (1.21.6)
Requirement already satisfied: pytz>=2017.3 in /usr/local/lib/python3.7/dist-packages (from pandas) (2022.1)
Requirement already satisfied: python-dateutil>=2.7.3 in /usr/local/lib/python3.7/dist-packages (from pandas) (2.8.2)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.7.3->pandas) (1.15.0)
Requirement already satisfied: scikit_learn==0.24.1 in /usr/local/lib/python3.7/dist-packages (0.24.1)
Requirement already satisfied: threadpoolctl>=2.0.0 in /usr/local/lib/python3.7/dist-packages (from scikit_learn==0.24.1) (3.1.0)
Requirement already satisfied: joblib>=0.11 in /usr/local/lib/python3.7/dist-packages (from scikit_learn==0.24.1) (1.0.1)
Requirement already satisfied: scipy>=0.19.1 in /usr/local/lib/python3.7/dist-packages (from scikit_learn==0.24.1) (1.4.1)
Requirement already satisfied: numpy>=1.13.3 in /usr/local/lib/python3.7/dist-packages (from scikit_learn==0.24.1) (1.21.6)
Requirement already satisfied: seaborn in /usr/local/lib/python3.7/dist-packages (0.11.2)
Requirement already satisfied: matplotlib>=2.2 in /usr/local/lib/python3.7/dist-packages (from seaborn) (3.2.2)
Requirement already satisfied: scipy>=1.0 in /usr/local/lib/python3.7/dist-packages (from seaborn) (1.4.1)
Requirement already satisfied: pandas>=0.23 in /usr/local/lib/python3.7/dist-packages (from seaborn) (1.3.5)
Requirement already satisfied: numpy>=1.15 in /usr/local/lib/python3.7/dist-packages (from seaborn) (1.21.6)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=2.2->seaborn) (1.4.2)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=2.2->seaborn) (0.11.0)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=2.2->seaborn) (3.0.8)
Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=2.2->seaborn) (2.8.2)
Requirement already satisfied: typing-extensions in /usr/local/lib/python3.7/dist-packages (from kiwisolver>=1.0.1->matplotlib>=2.2->seaborn) (4.2.0)
Requirement already satisfied: pytz>=2017.3 in /usr/local/lib/python3.7/dist-packages (from pandas>=0.23->seaborn) (2022.1)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.1->matplotlib>=2.2->seaborn) (1.15.0)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeClassifier, export_graphviz
from sklearn.pipeline import Pipeline
from sklearn.model_selection import train_test_split
from lightgbm import LGBMClassifier
from sklearn.metrics import roc_auc_score, recall_score, confusion_matrix, classification_report
import subprocess
import joblib
# Get multiple outputs in the same cell
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
# Ignore all warnings
import warnings
warnings.filterwarnings('ignore')
warnings.filterwarnings(action='ignore', category=DeprecationWarning)
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)