Updated 3 years ago
System Setup
The following Python libraries are required:
- numpy
- pandas
- matplotlib
- seaborn
- wordcloud
- emoji
- jovian
Run the following command to install or upgrade all of the listed Python libraries:
pip install numpy pandas matplotlib seaborn wordcloud emoji jovian --upgrade
To check that you have all the required libraries, the following cells should run without any errors:
pip install numpy pandas matplotlib seaborn wordcloud emoji jovian --upgrade
Collecting numpy
Downloading numpy-1.20.1-cp38-cp38-manylinux2010_x86_64.whl (15.4 MB)
|████████████████████████████████| 15.4 MB 97 kB/s eta 0:00:01
Collecting pandas
Downloading pandas-1.2.2-cp38-cp38-manylinux1_x86_64.whl (9.7 MB)
|████████████████████████████████| 9.7 MB 70.2 MB/s eta 0:00:01
Collecting matplotlib
Downloading matplotlib-3.3.4-cp38-cp38-manylinux1_x86_64.whl (11.6 MB)
|████████████████████████████████| 11.6 MB 78.4 MB/s eta 0:00:01
Collecting seaborn
Downloading seaborn-0.11.1-py3-none-any.whl (285 kB)
|████████████████████████████████| 285 kB 14.4 MB/s eta 0:00:01
Collecting wordcloud
Downloading wordcloud-1.8.1-cp38-cp38-manylinux1_x86_64.whl (371 kB)
|████████████████████████████████| 371 kB 83.4 MB/s eta 0:00:01
Collecting emoji
Downloading emoji-1.2.0-py3-none-any.whl (131 kB)
|████████████████████████████████| 131 kB 14.3 MB/s eta 0:00:01
Requirement already up-to-date: jovian in /opt/conda/lib/python3.8/site-packages (0.2.32)
Requirement already satisfied, skipping upgrade: python-dateutil>=2.7.3 in /opt/conda/lib/python3.8/site-packages (from pandas) (2.8.1)
Requirement already satisfied, skipping upgrade: pytz>=2017.3 in /opt/conda/lib/python3.8/site-packages (from pandas) (2020.1)
Requirement already satisfied, skipping upgrade: kiwisolver>=1.0.1 in /opt/conda/lib/python3.8/site-packages (from matplotlib) (1.2.0)
Requirement already satisfied, skipping upgrade: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.3 in /opt/conda/lib/python3.8/site-packages (from matplotlib) (2.4.7)
Requirement already satisfied, skipping upgrade: cycler>=0.10 in /opt/conda/lib/python3.8/site-packages (from matplotlib) (0.10.0)
Requirement already satisfied, skipping upgrade: pillow>=6.2.0 in /opt/conda/lib/python3.8/site-packages (from matplotlib) (8.0.0)
Requirement already satisfied, skipping upgrade: scipy>=1.0 in /opt/conda/lib/python3.8/site-packages (from seaborn) (1.5.2)
Requirement already satisfied, skipping upgrade: click in /opt/conda/lib/python3.8/site-packages (from jovian) (7.1.2)
Requirement already satisfied, skipping upgrade: pyyaml in /opt/conda/lib/python3.8/site-packages (from jovian) (5.3.1)
Requirement already satisfied, skipping upgrade: requests in /opt/conda/lib/python3.8/site-packages (from jovian) (2.24.0)
Requirement already satisfied, skipping upgrade: uuid in /opt/conda/lib/python3.8/site-packages (from jovian) (1.30)
Requirement already satisfied, skipping upgrade: six>=1.5 in /opt/conda/lib/python3.8/site-packages (from python-dateutil>=2.7.3->pandas) (1.15.0)
Requirement already satisfied, skipping upgrade: certifi>=2017.4.17 in /opt/conda/lib/python3.8/site-packages (from requests->jovian) (2020.6.20)
Requirement already satisfied, skipping upgrade: idna<3,>=2.5 in /opt/conda/lib/python3.8/site-packages (from requests->jovian) (2.10)
Requirement already satisfied, skipping upgrade: chardet<4,>=3.0.2 in /opt/conda/lib/python3.8/site-packages (from requests->jovian) (3.0.4)
Requirement already satisfied, skipping upgrade: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /opt/conda/lib/python3.8/site-packages (from requests->jovian) (1.25.11)
Installing collected packages: numpy, pandas, matplotlib, seaborn, wordcloud, emoji
Attempting uninstall: numpy
Found existing installation: numpy 1.19.2
Uninstalling numpy-1.19.2:
Successfully uninstalled numpy-1.19.2
Attempting uninstall: pandas
Found existing installation: pandas 1.1.3
Uninstalling pandas-1.1.3:
Successfully uninstalled pandas-1.1.3
Attempting uninstall: matplotlib
Found existing installation: matplotlib 3.3.2
Uninstalling matplotlib-3.3.2:
Successfully uninstalled matplotlib-3.3.2
Attempting uninstall: seaborn
Found existing installation: seaborn 0.11.0
Uninstalling seaborn-0.11.0:
Successfully uninstalled seaborn-0.11.0
Successfully installed emoji-1.2.0 matplotlib-3.3.4 numpy-1.20.1 pandas-1.2.2 seaborn-0.11.1 wordcloud-1.8.1
Note: you may need to restart the kernel to use updated packages.
import re
import jovian
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from wordcloud import WordCloud, STOPWORDS
import emoji
from collections import Counter
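If one of the imports above fails, it can help to see every missing library at once instead of fixing them one error at a time. The following is a minimal sketch of such a check. The helper name `missing_libraries` and the `REQUIRED` list are illustrative and not part of the original notebook; the sketch assumes each package is imported under the same name it is installed with, which holds for the seven libraries listed in this section.

```python
import importlib.util

# Import names of the libraries listed in this section (assumption: the
# import name matches the pip package name, which is true for these seven).
REQUIRED = ["numpy", "pandas", "matplotlib", "seaborn", "wordcloud", "emoji", "jovian"]


def missing_libraries(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    # find_spec returns None for a top-level module that is not installed;
    # it locates the module without importing it.
    return [name for name in names if importlib.util.find_spec(name) is None]


# An empty list means every library can be located.
print("Missing libraries:", missing_libraries())
```

Any name it reports can be passed back to the `pip install` command shown earlier.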
How to obtain WhatsApp chat data
- Open WhatsApp
- Open a group or individual chat
- Click on the three-dot options button
- Click on More
- Click on Export chat
- Click on Without media
- Export via email, another messaging app, etc.
- Download the file to your system, rename it to chat-data.txt, and put it in a folder
Without media: exports up to 40k messages
With media: exports up to 10k messages along with pictures/videos
Since I am doing chat data analysis, I went with the `without media` option.
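Before parsing anything, it is worth looking at the first few lines of the exported file to see what the timestamps and sender names actually look like. The following is a small sketch for that. The helper name `preview_chat` is illustrative, and the sketch assumes the export is a UTF-8 text file saved as chat-data.txt in the working directory.

```python
from itertools import islice


def preview_chat(path="chat-data.txt", n=5):
    """Return the first n lines of the exported chat, without newlines."""
    # Assumption: the export is UTF-8; "utf-8-sig" also drops a leading
    # byte-order mark if the file happens to start with one.
    with open(path, encoding="utf-8-sig") as f:
        return [line.rstrip("\n") for line in islice(f, n)]
```

Calling `preview_chat()` and printing the result shows the exact line layout, which is what the regex and datetime format in the preprocessing step have to match.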
Data Preprocessing
- Regex cheatsheet
- Regex test - live
- Datetime format
WhatsApp exports are not standardized, so if you run into an empty DataFrame or format errors, use a custom regex and datetime format by referring to the links above.
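To make the role of the regex and the datetime format concrete, here is a minimal parsing sketch. It is not the notebook's own implementation. It assumes one common line layout, `12/31/20, 9:15 PM - Alice: Hello`; the names `LINE_RE`, `DATETIME_FORMAT`, and `parse_chat` are illustrative, and both constants will likely need adjusting to match your own export.

```python
import re
from datetime import datetime

# Assumed layout: "<m/d/yy>, <h:mm AM/PM> - <rest of line>".
# Change the pattern and the format string together if your export differs.
LINE_RE = re.compile(r"^(\d{1,2}/\d{1,2}/\d{2}), (\d{1,2}:\d{2} [AP]M) - (.*)$")
DATETIME_FORMAT = "%m/%d/%y %I:%M %p"


def parse_chat(lines):
    """Turn raw export lines into dicts with datetime, author, and message."""
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        match = LINE_RE.match(line)
        if match:
            date, time, body = match.groups()
            # "Author: text" for user messages; system notices have no author.
            author, sep, message = body.partition(": ")
            if not sep:
                author, message = None, body
            rows.append({
                "datetime": datetime.strptime(f"{date} {time}", DATETIME_FORMAT),
                "author": author,
                "message": message,
            })
        elif rows:
            # No timestamp: treat the line as a continuation of the
            # previous (multi-line) message.
            rows[-1]["message"] += "\n" + line
    return rows
```

The returned list of dicts can be passed directly to `pd.DataFrame(...)`. If the result is empty, the pattern did not match any line, which is the "empty df" symptom described above; a `ValueError` from `strptime` means the regex matched but the datetime format does not.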