import pandas as pd
pd.__version__
'1.0.5'
Reading a CSV file using Pandas
Pandas is typically used for working with tabular data (similar to the data stored in a spreadsheet). It provides helper functions to read data from various file formats such as CSV, Excel spreadsheets, HTML tables, JSON, SQL and more. Let's download a file, italy-covid-daywise.txt, which contains day-wise Covid-19 data for Italy in the following format:
date,new_cases,new_deaths,new_tests
2020-04-21,2256.0,454.0,28095.0
2020-04-22,2729.0,534.0,44248.0
2020-04-23,3370.0,437.0,37083.0
2020-04-24,2646.0,464.0,95273.0
2020-04-25,3021.0,420.0,38676.0
2020-04-26,2357.0,415.0,24113.0
2020-04-27,2324.0,260.0,26678.0
2020-04-28,1739.0,333.0,37554.0
...
This format of storing data is known as comma-separated values, or CSV.
CSVs: A comma-separated values (CSV) file is a delimited text file that uses a comma to separate values. Each line of the file is a data record. Each record consists of one or more fields, separated by commas. A CSV file typically stores tabular data (numbers and text) in plain text, in which case each line will have the same number of fields. (Wikipedia)
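To make the definition above concrete, here is a minimal sketch that parses a few of the sample rows shown earlier using Python's built-in csv module rather than Pandas. The variable names (csv_text, header, records, field_counts) are illustrative assumptions, and the text is held in memory instead of being read from a file.

```python
import csv
import io

# A few rows copied from the sample shown above, kept in memory for illustration.
csv_text = (
    "date,new_cases,new_deaths,new_tests\n"
    "2020-04-21,2256.0,454.0,28095.0\n"
    "2020-04-22,2729.0,534.0,44248.0\n"
)

# Each line is one record; csv.reader splits every record into its fields.
rows = list(csv.reader(io.StringIO(csv_text)))
header, records = rows[0], rows[1:]

# For tabular data, every line should have the same number of fields.
field_counts = {len(row) for row in rows}

print(header)        # column names from the first line
print(records[0])    # fields of the first data record (all strings at this point)
print(field_counts)  # a single value means the table is rectangular
```

Note that the csv module returns every field as a string; converting columns to numbers or dates is one of the things Pandas handles when it reads a CSV file.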
We'll download this file using the urlretrieve function from the urllib.request module.
from urllib.request import urlretrieve
urlretrieve('https://covid.ourworldindata.org/data/owid-covid-data.csv', 'world-covid.csv')
('world-covid.csv', <http.client.HTTPMessage at 0x202062f9fd0>)
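The output above is the tuple returned by urlretrieve: the local filename and an object holding the response headers. The following is a small sketch of that behaviour which runs without network access, assuming a temporary file referenced through a file:// URL stands in for the remote server; the names source, target, local_path and headers are illustrative only.

```python
import tempfile
from pathlib import Path
from urllib.request import urlretrieve

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for a remote file: a tiny CSV written to a temporary directory.
    source = Path(tmp) / "source.csv"
    source.write_text("date,new_cases\n2020-04-21,2256.0\n")

    target = Path(tmp) / "copy.csv"

    # urlretrieve copies the resource at the URL to the given filename and
    # returns a (filename, headers) tuple, as in the output shown above.
    local_path, headers = urlretrieve(source.as_uri(), str(target))

    copied_text = Path(local_path).read_text()

print(local_path)
print(copied_text)
```

Unpacking the tuple this way gives direct access to the saved file's path, which can then be passed to a reader function.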
Reading the CSV file locally with pd.read_csv