# Hi, this is Yokhesh. In this program, I use the Google Play Store apps dataset to perform some data cleaning,
# data visualisation, and a few useful calculations that help us better understand the data.
# Initially, we assign each dataset to a variable.
# The variable 'data' contains the info from googleplaystore_user_reviews.csv
# The variable 'data1' contains the info from googleplaystore.csv
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here are several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt # plotting

# Input data files are available in the "../input/" directory.
# For example, running this (by clicking run or pressing Shift+Enter) will list the files in the input directory

import os
print(os.listdir("../input"))
data = pd.read_csv("../input/googleplaystore_user_reviews.csv")
data1 = pd.read_csv("../input/googleplaystore.csv")


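# Drop every row of the apps table that has at least one missing value
data1 = data1.dropna()
# Preview the first five rows of the cleaned table
data1.head()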

# Any results you write to the current directory are saved as output.
data1.describe()
data1.info()
# Now, we will calculate the number of apps under each category and
# then draw a bar chart to visualize how the number of apps differs from one category to another.
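
# A minimal sketch of the per-category count and bar chart, not necessarily the exact approach used later on.
# Assumptions: the app category is stored in a column named 'Category' (the column name is not shown in the
# code above), and the helper names count_apps_per_category and plot_category_counts are hypothetical.
import pandas as pd
import matplotlib.pyplot as plt


def count_apps_per_category(df, column="Category"):
    # value_counts() returns one count per distinct category, sorted from most to least frequent
    return df[column].value_counts()


def plot_category_counts(counts):
    # Draw one bar per category; the bar height is the number of apps in that category
    fig, ax = plt.subplots(figsize=(12, 6))
    counts.plot(kind="bar", ax=ax)
    ax.set_xlabel("Category")
    ax.set_ylabel("Number of apps")
    ax.set_title("Number of apps per category")
    fig.tight_layout()
    return ax


# Example usage with the cleaned DataFrame loaded earlier:
# category_counts = count_apps_per_category(data1)
# plot_category_counts(category_counts)
# plt.show()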