
FIFA19 Players Data Analysis

FIFA19 is a football video game from EA Sports. The queries in this project are based on the FIFA19 player dataset, which is available on Kaggle, a hub for datasets. The dataset contains details of players and their stats for the year 2019. It can be used to explore success ratios, ratings, top players and similar questions.

In this project I have used NumPy, pandas, Matplotlib and Seaborn. This project is part of the course Data Analysis with Python: Zero to Pandas.

How to run the code

This is an executable Jupyter notebook hosted on Jovian.ml, a platform for sharing data science projects. You can run and experiment with the code in a couple of ways: using free online resources (recommended) or on your own computer.

Option 1: Running using free online resources (1-click, recommended)

The easiest way to start executing this notebook is to click the "Run" button at the top of this page, and select "Run on Binder". This will run the notebook on mybinder.org, a free online service for running Jupyter notebooks. You can also select "Run on Colab" or "Run on Kaggle".

Option 2: Running on your computer locally
  1. Install Conda by following these instructions. Add Conda binaries to your system PATH, so you can use the conda command on your terminal.

  2. Create a Conda environment and install the required libraries by running these commands on the terminal:

conda create -n zerotopandas -y python=3.8 
conda activate zerotopandas
pip install jovian jupyter numpy pandas matplotlib seaborn opendatasets --upgrade
  3. Press the "Clone" button above to copy the command for downloading the notebook, and run it on the terminal. This will create a new directory and download the notebook. The command will look something like this:
jovian clone notebook-owner/notebook-id
  4. Enter the newly created directory using cd directory-name and start the Jupyter notebook:
jupyter notebook

You can now access Jupyter's web interface by clicking the link that shows up on the terminal or by visiting http://localhost:8888 on your browser. Click on the notebook file (it has a .ipynb extension) to open it.

1. Downloading the Dataset

Datasets can be downloaded within Jupyter using the opendatasets Python Library.

In [1]:
!pip install jovian opendatasets --upgrade --quiet

Let's begin by downloading the data, and listing the files within the dataset.

In [2]:
dataset_url = 'https://www.kaggle.com/karangadiya/fifa19'
In [3]:
import opendatasets as od
od.download(dataset_url)
Please provide your Kaggle credentials to download this dataset. Learn more: http://bit.ly/kaggle-creds
Your Kaggle username: gauravbisht005
Your Kaggle Key: ········
Downloading fifa19.zip to ./fifa19
100%|██████████| 2.18M/2.18M [00:00<00:00, 103MB/s]

The dataset has been downloaded and extracted.
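
As a rough sketch of the download step, the snippet below wraps the call so that the dataset is only fetched when its folder does not exist yet. It assumes that opendatasets extracts a Kaggle dataset into a folder named after the last segment of the URL, which matches the ./fifa19 path in the output above. The helper names dataset_dir_from_url and download_if_missing are hypothetical and not part of the original notebook.

```python
import os


def dataset_dir_from_url(url, base_dir="."):
    """Return the folder where the dataset is expected to land.

    Assumption: the folder is named after the last segment of the URL,
    as in the './fifa19' path printed in the download output.
    """
    dataset_id = url.rstrip("/").split("/")[-1]
    return base_dir + "/" + dataset_id


def download_if_missing(url, base_dir="."):
    """Download the dataset unless its folder already exists (hypothetical helper)."""
    target = dataset_dir_from_url(url, base_dir)
    if not os.path.isdir(target):
        # Imported lazily so the path helper above works without opendatasets installed
        import opendatasets as od
        od.download(url, data_dir=base_dir)  # prompts for Kaggle credentials
    return target


dataset_url = "https://www.kaggle.com/karangadiya/fifa19"
print(dataset_dir_from_url(dataset_url))
```

Calling download_if_missing(dataset_url) in the notebook would then return the same path that is assigned to data_dir in the next cell.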

In [4]:
data_dir = './fifa19'
In [5]:
import os
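
Listing the files in the dataset folder might look like the sketch below. It runs on a temporary folder that stands in for ./fifa19, so the function name list_dataset_files and the demo folder are illustrative assumptions; only the file name data.csv comes from this notebook.

```python
import os
import tempfile


def list_dataset_files(data_dir):
    """Return the sorted names of the entries inside data_dir."""
    return sorted(os.listdir(data_dir))


# Demonstration on a temporary folder standing in for './fifa19'
with tempfile.TemporaryDirectory() as demo_dir:
    # Create an empty placeholder file with the same name as the real CSV
    open(os.path.join(demo_dir, "data.csv"), "w").close()
    demo_files = list_dataset_files(demo_dir)
    print(demo_files)
```

In the notebook itself, list_dataset_files(data_dir) would list whatever was extracted into ./fifa19.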

Let us save and upload our work to Jovian before continuing.

In [6]:
project_name = "FIFA19-Player-Data-Analysis"
In [7]:
!pip install jovian --upgrade -q
In [8]:
import jovian
In [9]:
[jovian] Attempting to save notebook..
[jovian] Please enter your API key ( from https://jovian.ml/ ):
API KEY: ········
[jovian] Updating notebook "gauravbisht005/fifa19-player-data-analysis" on https://jovian.ml/
[jovian] Uploading notebook..
[jovian] Capturing environment..
[jovian] Committed successfully! https://jovian.ml/gauravbisht005/fifa19-player-data-analysis

2. Data Preparation and Cleaning

Data cleaning is the process of finding inaccurate or incomplete records in a dataset and correcting them by replacing, modifying or removing those records, so that the dataset is ready for analysis.

In [10]:
import pandas as pd
In [11]:
fifa19_df = pd.read_csv(data_dir + "/data.csv")
In [12]:

2.1 Determining the number of attributes in the dataset

In [13]:
fifa19_df.columns
Index(['Unnamed: 0', 'ID', 'Name', 'Age', 'Photo', 'Nationality', 'Flag',
       'Overall', 'Potential', 'Club', 'Club Logo', 'Value', 'Wage', 'Special',
       'Preferred Foot', 'International Reputation', 'Weak Foot',
       'Skill Moves', 'Work Rate', 'Body Type', 'Real Face', 'Position',
       'Jersey Number', 'Joined', 'Loaned From', 'Contract Valid Until',
       'Height', 'Weight', 'LS', 'ST', 'RS', 'LW', 'LF', 'CF', 'RF', 'RW',
       'LAM', 'CAM', 'RAM', 'LM', 'LCM', 'CM', 'RCM', 'RM', 'LWB', 'LDM',
       'CDM', 'RDM', 'RWB', 'LB', 'LCB', 'CB', 'RCB', 'RB', 'Crossing',
       'Finishing', 'HeadingAccuracy', 'ShortPassing', 'Volleys', 'Dribbling',
       'Curve', 'FKAccuracy', 'LongPassing', 'BallControl', 'Acceleration',
       'SprintSpeed', 'Agility', 'Reactions', 'Balance', 'ShotPower',
       'Jumping', 'Stamina', 'Strength', 'LongShots', 'Aggression',
       'Interceptions', 'Positioning', 'Vision', 'Penalties', 'Composure',
       'Marking', 'StandingTackle', 'SlidingTackle', 'GKDiving', 'GKHandling',
       'GKKicking', 'GKPositioning', 'GKReflexes', 'Release Clause'],
      dtype='object')
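
To get the number of attributes rather than their names, the column index can be counted. The sketch below uses a small made-up frame that borrows three column names from the listing above; toy_df and its values are assumptions for illustration only.

```python
import pandas as pd

# Toy frame with a few of the column names listed above (values are made up)
toy_df = pd.DataFrame({
    "Name": ["Player A", "Player B"],
    "Age": [25, 31],
    "Overall": [80, 77],
})

n_rows, n_cols = toy_df.shape   # shape is (rows, columns)
print(n_cols)                   # same value as len(toy_df.columns)
print(list(toy_df.columns))
```

Applied to the listing above, len(fifa19_df.columns) counts 89 names, including the 'Unnamed: 0' column.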

2.2 Cleaning the dataset

In [14]:
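# Count the missing values in each column
missing_data = fifa19_df.isna().sum()

Note that pd.isna(fifa19_df.columns).sum() would only test the column labels, none of which are missing. The per-column counts in missing_data show whether the player records themselves have gaps.

In [15]:
fifa19_df.drop("Unnamed: 0", axis=1, inplace=True)

The dataset also contains an unnecessary attribute, "Unnamed: 0" (an unnamed column in the CSV file, most likely a leftover row index), which has been removed above.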
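
The difference between checking the column labels and checking the cell values can be seen on a small made-up table. This is a sketch under stated assumptions: csv_text and toy_df are invented for illustration, and only the "Unnamed: 0" label and the drop step mirror the notebook.

```python
import io
import pandas as pd

# Made-up CSV with an unnamed index column and one empty Club cell
csv_text = ",Name,Club\n0,Player A,Club X\n1,Player B,\n"
toy_df = pd.read_csv(io.StringIO(csv_text))

# Checks the column labels only: none of them is NaN
missing_labels = pd.isna(toy_df.columns).sum()

# Checks the cell values, column by column
missing_values = toy_df.isna().sum()

# Drop the leftover index column, as done for fifa19_df
toy_df = toy_df.drop(columns="Unnamed: 0")

print(missing_labels)
print(missing_values["Club"])
print(list(toy_df.columns))
```

The label check reports zero even though one Club cell is empty, which is why the per-column count is the more informative test.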

In [16]:
import jovian
In [17]:
[jovian] Attempting to save notebook..
[jovian] Updating notebook "gauravbisht005/fifa19-player-data-analysis" on https://jovian.ml/
[jovian] Uploading notebook..
[jovian] Capturing environment..
[jovian] Committed successfully! https://jovian.ml/gauravbisht005/fifa19-player-data-analysis

3. Exploratory Analysis and Visualization

In this section, I summarise the main characteristics of the dataset visually, using the Python libraries Matplotlib and Seaborn.

Let's begin by importing matplotlib.pyplot and seaborn.

In [18]:
import seaborn as sns
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline

matplotlib.rcParams['font.size'] = 14
matplotlib.rcParams['figure.figsize'] = (9, 5)
matplotlib.rcParams['figure.facecolor'] = '#00000000'

3.1 Heatmap of the correlation between the numeric attributes of football players

In [19]:
plt.figure(figsize=(25, 25))
# Correlation is only defined for numeric columns, so select those first
sns.heatmap(fifa19_df.select_dtypes(include='number').corr(), annot=True, fmt='.1f')
plt.title("Correlation between the numeric attributes of football players")
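
The matrix that the heatmap draws can be inspected on its own. The sketch below computes pairwise Pearson correlations for a made-up frame; toy_df and its values are assumptions, with column names borrowed from the dataset, and text columns are dropped before the calculation.

```python
import pandas as pd

# Made-up values; the column names are borrowed from the dataset
toy_df = pd.DataFrame({
    "Name": ["Player A", "Player B", "Player C"],
    "Age": [20, 25, 30],
    "Overall": [70, 75, 80],
    "Potential": [90, 85, 80],
})

# Keep numeric columns only, then compute pairwise Pearson correlations
corr = toy_df.select_dtypes(include="number").corr()

print(corr.round(1))
```

In this toy data Overall rises in step with Age while Potential falls, so the corresponding entries sit at the two extremes of the scale. Passing corr to sns.heatmap would colour each cell by these values.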