
Retail Store Business - Analysis


Introduction

About - Dataset

In this project, we analyze a dataset that contains records of sales orders from a retail store in the United States. The dataset consists of 9994 records with 13 attributes (columns or features).

Each record (row) represents one sales order from the retail store. Each of the 13 attributes (columns) holds one piece of information about that order, including shipment mode, type of customer, location of customer, product quantity, selling price, discount, and profit amount.

A sales order is a document generated by a seller for its internal use in processing a customer order.

About - Project

This is an Exploratory Data Analysis (EDA) project. EDA is the process of analyzing a dataset and summarizing its key insights and characteristics. It is one of the first steps in a data science project and helps build a deeper understanding of the data.

Dataset - Source

Please click here to download the dataset.

Download the Dataset

There are several options for getting the dataset into Jupyter:

  • Download the CSV manually and upload it via Jupyter's GUI

  • Use the urlretrieve function from the urllib.request module to download the CSV file from a raw URL

  • Use a helper library, e.g., opendatasets, which contains a collection of curated datasets and provides a helper function for direct download.

Initially, I downloaded the CSV file manually. Later, for convenience, I uploaded the same dataset to my GitHub profile so that it can be fetched directly with just a few lines of code (using the urllib.request.urlretrieve function).

Let's assign the raw GitHub URL of the dataset to a variable named dataset_url.

# import the urlretrieve function to download the dataset
from urllib.request import urlretrieve
# assign the dataset URL to a variable
dataset_url = "https://raw.githubusercontent.com/lafirm/datasets/main/SampleSuperstore.csv"
# download the file and save it as retail_store_dataset.csv
urlretrieve(dataset_url, 'retail_store_dataset.csv')
('retail_store_dataset.csv', <http.client.HTTPMessage at 0x7fccbc74f760>)

We downloaded the CSV file (dataset) using the urlretrieve function from the urllib.request module and saved it as retail_store_dataset.csv.
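Because urlretrieve fetches the file again every time the cell is run, one possible refinement is to skip the download when the file is already on disk. The sketch below is only an illustration and not part of the original notebook; the helper name download_if_missing is hypothetical.

```python
import os
from urllib.request import urlretrieve


def download_if_missing(url, filename):
    """Download `url` to `filename` unless that file already exists.

    Returns True if a download happened, False if the existing file was kept.
    (Hypothetical helper, shown only as a sketch.)
    """
    # Reuse the local copy when it is already present
    if os.path.exists(filename):
        return False
    # Otherwise fetch the file and save it under the given name
    urlretrieve(url, filename)
    return True
```

For this project, it could be called as download_if_missing(dataset_url, 'retail_store_dataset.csv'). Re-running the notebook would then reuse the existing file instead of downloading it again.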

Let's check whether the dataset was downloaded into the current working directory using the listdir() function from the os module.