Join the “Zero to Data Analyst” Bootcamp. Limited seats.

A 20-week program covering 7 courses, 12 assignments, 4 real-world projects, and 12 months of career support.

The following notebook fetches data from the csv file "DATA.csv" within the specified dates. Our objective is to search the file for given criterias , make a report and send the autogenerated email to the reciever. The python library used to perform the specified operations is Pandas.

```
Q. Below are a few questions that I have framed for the sales data. Please make sure you take the start and end dates from the user using an input in the mm/dd/yyyy format. All the below questions need to fetch data within those dates.
1.total sales amount within the period
2.total number of sales within the period
3.number of sales city wise within the period
4.number of sales by productline within the period
5.What is the average price of items sold. How many items were sold which costed above the average price and how many items 6.were sold which costed less than the average? (within the period)
7.How many orders were cancelled and how many were shipped?
8.Which product should be given focus to increase profitability. (it would ideally be the product which has a price above average price and is being sold the most)
9.which product should be eliminated. (the product which is cheap and is still not being sold enough ie has the minimum quantity)
```

In [1]:

```
import pandas as pd
import numpy as np
import datetime
from datetime import *
data = pd.read_csv("data.csv");
m1, d1, y1 = [int(x) for x in input("Enter starting date(MM/DD/YYYY) : ").split('/')]
b1 = datetime(y1,m1,d1)
m2, d2, y2 = [int(z) for z in input("Enter end date(MM/DD/YYYY) : ").split('/')]
b2 = datetime(y2,m2,d2)
arr = []
for i in range(data["ORDERDATE"].count()):
date_time_obj = datetime.strptime(data["ORDERDATE"][i], '%m/%d/%Y %H:%M')
arr.append(date_time_obj)
data["DATEID"] = arr
total_sales_amount = data[(data["DATEID"] >= b1)& (data["DATEID"] <= b2 )]["SALES"].sum()
number_of_sales = data[(data["DATEID"] >= b1)& (data["DATEID"] <= b2 )]["QUANTITYORDERED"].sum()
number_of_sales_citywise = data[(data["DATEID"] >= b1)& (data["DATEID"] <= b2 )].groupby("CITY").sum()["QUANTITYORDERED"]
number_of_sales_by_productline = data[(data["DATEID"] >= b1)& (data["DATEID"] <= b2 )].groupby("PRODUCTLINE").sum()["QUANTITYORDERED"]
avg_price_of_items_groupwise = data[(data["DATEID"] >= b1)& (data["DATEID"] <= b2 )].groupby("PRODUCTLINE")["PRICEEACH"].mean()
avg_price = data[(data["DATEID"] >= b1)& (data["DATEID"] <= b2 )]["PRICEEACH"].mean()
items_above_avg_price = data[(data["DATEID"] >= b1)& (data["DATEID"] <= b2 )&(data["PRICEEACH"] > avg_price)]["QUANTITYORDERED"].sum()
items_below_avg_price = data[(data["DATEID"] >= b1)& (data["DATEID"] <= b2 )&(data["PRICEEACH"] < avg_price)]["QUANTITYORDERED"].sum()
items_shipped = data[(data["DATEID"] >= b1)& (data["DATEID"] <= b2 )&(data["STATUS"] == "Shipped")]["QUANTITYORDERED"].sum()
items_cancelled = data[(data["DATEID"] >= b1)& (data["DATEID"] <= b2 )&(data["STATUS"] == "Cancelled")]["QUANTITYORDERED"].sum()
top_selling = data[(data["DATEID"] >= b1)& (data["DATEID"] <= b2 )& (data["PRICEEACH"] > avg_price)].groupby("PRODUCTLINE").sum()["QUANTITYORDERED"]
import smtplib
print("Enter the recievers email adress")
reciever = input()
server = smtplib.SMTP("smtp.gmail.com",587)
server.starttls()
server.login("taalib.zama@gmail.com","shbptggsihvsbdjh")
message = "Subject : DATASET REPORT \n\n\t\t\tREPORT\nStart_date : "+str(d1)+'/'+str(m1)+'/'+str(y1)+"\t\tEnd_date : "+str(d2)+'/'+str(m2)+'/'+str(y2)+"\n1.total sales amount within the period : "+str(total_sales_amount)+"\n2.total number of sales within the period : "+str(number_of_sales)+"\n3.number of sales city wise within the period\n"+str(number_of_sales_citywise)+"\n\n4.number of sales by productline within the period\n"+str(number_of_sales_by_productline)+"\n\n5.What is the average price of items sold\n"+str(avg_price_of_items_groupwise)+"\n\n6.How many items were sold which costed above the average price\n"+str(items_above_avg_price)+"\n\nhow many items were sold which costed less than the average? (within the period) "+str(items_above_avg_price)+"\n\n7.How many orders were cancelled\n"+str(items_cancelled)+"\nand how many were shipped?"+str(items_shipped)+"\n\n\n8.Which product should be given focus to increase profitability. (it would ideally be the product which has a price above average price and is being sold the most)which product should be eliminated. (the product which is cheap and is still not being sold enough ie has the minimum quantity)\n"+str(top_selling)
server.sendmail("taalib.zama@gmail.com",reciever,message)
```

```
Enter starting date(MM/DD/YYYY) : 5/3/2003
Enter end date(MM/DD/YYYY) : 3/4/2005
Enter the recievers email adress
asjragmallick0101@gmail.com
```

Out[1]:

`{}`

In [ ]:

` `