Introduction
Data visualization is the graphic representation of data. It involves producing images that communicate relationships among the represented data to viewers. Visualizing data is an essential part of data analysis and machine learning. We'll use the Python libraries Matplotlib and Seaborn to learn and apply some popular data visualization techniques. We'll use the words chart, plot, and graph interchangeably in this tutorial.
To begin, let's install and import the libraries. We'll use the matplotlib.pyplot module for basic plots like line and bar charts. It is often imported with the alias plt. We'll use the seaborn module for more advanced plots. It is commonly imported with the alias sns.
!pip install matplotlib seaborn --upgrade --quiet
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
Notice that we also include the special command %matplotlib inline to ensure that our plots are shown and embedded within the Jupyter notebook itself. Without this command, plots may sometimes show up in pop-up windows.
Line Chart
The line chart is one of the simplest and most widely used data visualization techniques. A line chart displays information as a series of data points or markers connected by straight lines. You can customize the shape, size, color, and other aesthetic elements of the lines and markers for better visual clarity.
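As a rough illustration of the customization options mentioned above, here is a minimal sketch of a line chart with styled lines and markers. The x and y values are hypothetical placeholders chosen only to exercise the API, and the specific style choices (circle markers, dashed line, the tab:blue color) are assumptions, not recommendations from this tutorial.

```python
import matplotlib.pyplot as plt

# Hypothetical placeholder values, used only to illustrate the plotting calls
x = [1, 2, 3, 4, 5]
y = [2.0, 2.5, 2.4, 3.1, 3.6]

fig, ax = plt.subplots()

# marker, linestyle, color, linewidth, and markersize control the look
# of the points and of the straight segments that connect them
lines = ax.plot(
    x, y,
    marker='o',        # draw a circle at each data point
    linestyle='--',    # connect the points with a dashed line
    color='tab:blue',
    linewidth=2,
    markersize=8,
)

ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_title('Line chart with custom markers (illustrative data)')
```

In a Jupyter notebook with %matplotlib inline, the figure should render below the cell. Changing the marker or linestyle arguments is a quick way to see how each option affects readability.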
Here's a Python list showing the yield of apples (tons per hectare) over six years in an imaginary country called Kanto.