System Setup

The following Python libraries are required:

  • numpy
  • pandas
  • matplotlib
  • seaborn
  • wordcloud
  • emoji
  • jovian

Run the following command to install or upgrade all the listed Python libraries:

pip install numpy pandas matplotlib seaborn wordcloud emoji jovian --upgrade

To check whether you have all the required libraries, the imports below should run without any errors.

pip install numpy pandas matplotlib seaborn wordcloud emoji jovian --upgrade
Requirement already up-to-date: jovian in /opt/conda/lib/python3.8/site-packages (0.2.32)
Installing collected packages: numpy, pandas, matplotlib, seaborn, wordcloud, emoji
Successfully installed emoji-1.2.0 matplotlib-3.3.4 numpy-1.20.1 pandas-1.2.2 seaborn-0.11.1 wordcloud-1.8.1
Note: you may need to restart the kernel to use updated packages.

import re
import jovian
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from wordcloud import WordCloud, STOPWORDS
import emoji
from collections import Counter
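
If an import fails, it can help to see exactly which packages are missing. The following is a minimal sketch of such a check using only the standard library; the helper name `find_missing` and the `REQUIRED` list are illustrative and not part of the original notebook.

```python
import importlib.util

# Top-level module names for the libraries listed in System Setup.
REQUIRED = ["numpy", "pandas", "matplotlib", "seaborn", "wordcloud", "emoji", "jovian"]


def find_missing(packages):
    """Return the names in `packages` that cannot be found for import."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

Any name reported as missing can be passed to the pip command shown earlier.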

How to obtain WhatsApp chat data

  • Open WhatsApp
  • Open a group or individual chat
  • Click on the three-dot options button
  • Click on More
  • Click on Export chat
  • Click on Without media
  • Export via email, another messaging app, or similar
  • Download the file to your system, rename it to chat-data.txt, and put it in a folder

Without media: exports up to 40k messages
With media: exports up to 10k messages along with pictures/videos
Since I am doing chat data analysis, I went with the `without media` option.
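
Once the export is saved as chat-data.txt, a quick look at the raw lines shows which date and time format your export uses. This is a minimal sketch that assumes the file is UTF-8 encoded (possibly with a byte order mark); the function name `read_chat_lines` is hypothetical.

```python
def read_chat_lines(path="chat-data.txt"):
    """Read the exported chat and return its lines without trailing newlines.

    Assumes UTF-8; "utf-8-sig" also drops a leading byte order mark if present.
    """
    with open(path, encoding="utf-8-sig") as chat_file:
        return [line.rstrip("\n") for line in chat_file]
```

For example, `read_chat_lines("chat-data.txt")[:5]` returns the first five lines, which is usually enough to see how timestamps and sender names are laid out.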

Data Preprocessing

If you run into an empty df or format errors, use a custom regex and datetime format by referring to the links above, because the exports from WhatsApp are not standardized.
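
To illustrate what a custom regex and datetime format pair might look like, here is a minimal sketch. It assumes one particular export layout, `12/31/20, 10:15 PM - Alice: Happy new year!` (month/day/two-digit year, 12-hour clock); your export may differ, in which case both `LINE_PATTERN` and `DATETIME_FORMAT` need to change together. All names here are illustrative, not the notebook's own.

```python
import re
from datetime import datetime

# Assumed layout: "12/31/20, 10:15 PM - Alice: Happy new year!"
# Group 1: timestamp, group 2: sender (absent for system notices), group 3: text.
LINE_PATTERN = re.compile(
    r"^(\d{1,2}/\d{1,2}/\d{2}, \d{1,2}:\d{2} [AP]M) - (?:(.*?): )?(.*)$"
)
DATETIME_FORMAT = "%m/%d/%y, %I:%M %p"  # must agree with the regex above


def parse_chat(lines, pattern=LINE_PATTERN, datetime_format=DATETIME_FORMAT):
    """Turn raw export lines into a list of {date_time, user, message} dicts."""
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        match = pattern.match(line)
        if match:
            stamp, user, message = match.groups()
            rows.append({
                "date_time": datetime.strptime(stamp, datetime_format),
                "user": user,  # None for system notices such as "X added Y"
                "message": message,
            })
        elif rows:
            # No timestamp: treat the line as a continuation of the previous message.
            rows[-1]["message"] += "\n" + line
    return rows
```

The resulting list can be passed to `pd.DataFrame(rows)`. If that DataFrame comes out empty, the pattern most likely does not match your export's timestamp layout, so compare it against a few raw lines from the file.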