
Solar Power Generation

Solar power is the conversion of the sun's radiation into electricity using solar photovoltaic cells. The conversion takes place inside the solar cell through the photovoltaic effect. According to many experts, the amount of solar energy reaching the earth is more than 10,000 times humanity's current energy consumption.

Also, if we could convert 100 percent of the solar energy reaching the earth in a single hour into electricity, it would be enough to power the entire planet for one year.
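The two figures quoted above are roughly consistent with each other, as a quick back-of-the-envelope check suggests. This is only a sketch: it takes the 10,000 times ratio at face value and assumes that both the incoming solar energy and human consumption are constant over the year.

```python
# Sanity check: how long would one hour of sunlight power the planet?
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years

# Ratio quoted in the text: solar energy reaching the earth vs. human consumption
solar_to_consumption_ratio = 10_000

# One hour of incoming solar energy covers `ratio` hours of consumption
hours_covered = 1 * solar_to_consumption_ratio
years_covered = hours_covered / HOURS_PER_YEAR

print(f"One hour of sunlight covers about {years_covered:.2f} years of consumption")
```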

I learned these techniques in "Data Analysis with Python: Zero to Pandas", a practical, beginner-friendly and coding-focused introduction to data analysis covering the basics of Python, Numpy, Pandas, data visualization and exploratory data analysis.

Downloading the Dataset

In [1]:
!pip install jovian opendatasets --upgrade --quiet
In [2]:
dataset_url = 'https://www.kaggle.com/anikannal/solar-power-generation-data' 
In [3]:
import opendatasets as od
od.download(dataset_url)
Please provide your Kaggle credentials to download this dataset. Learn more: http://bit.ly/kaggle-creds
Your Kaggle username: akanshasaini
Your Kaggle Key: ········
100%|██████████| 1.90M/1.90M [00:00<00:00, 159MB/s]
Downloading solar-power-generation-data.zip to ./solar-power-generation-data

The dataset has been downloaded and extracted.

In [4]:
data_dir = './solar-power-generation-data'
In [5]:
import os
os.listdir(data_dir)
Out[5]:
['Plant_1_Generation_Data.csv',
 'Plant_2_Weather_Sensor_Data.csv',
 'Plant_1_Weather_Sensor_Data.csv',
 'Plant_2_Generation_Data.csv']
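The listing shows one generation file and one weather sensor file for each plant. A small sketch of how the file names could be grouped by plant before loading them; the helper name `group_files_by_plant` and the regular expression are illustrative assumptions, not part of the notebook.

```python
import re
from collections import defaultdict

# File names as listed by os.listdir(data_dir) above
file_names = [
    'Plant_1_Generation_Data.csv',
    'Plant_2_Weather_Sensor_Data.csv',
    'Plant_1_Weather_Sensor_Data.csv',
    'Plant_2_Generation_Data.csv',
]

def group_files_by_plant(names):
    """Hypothetical helper: map plant number -> {kind: file name}."""
    pattern = re.compile(r'^Plant_(\d+)_(Generation|Weather_Sensor)_Data\.csv$')
    grouped = defaultdict(dict)
    for name in names:
        match = pattern.match(name)
        if match:  # skip files that do not follow the naming scheme
            plant, kind = int(match.group(1)), match.group(2)
            grouped[plant][kind] = name
    return dict(grouped)

files_by_plant = group_files_by_plant(file_names)
print(files_by_plant[1]['Generation'])
```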

Let us save and upload our work to Jovian before continuing.

In [6]:
project_name = "solar-power-generation"
In [7]:
!pip install jovian --upgrade -q
In [8]:
import jovian
In [10]:
jovian.commit(project=project_name)
[jovian] Attempting to save notebook..
[jovian] Updating notebook "akanshasaini888/solar-power-generation" on https://jovian.ml/
[jovian] Uploading notebook..
[jovian] Capturing environment..
[jovian] Committed successfully! https://jovian.ml/akanshasaini888/solar-power-generation

Data Preparation and Cleaning

In [11]:
!pip install jovian --upgrade --quiet


In [12]:
!pip install numpy seaborn pandas matplotlib
Requirement already satisfied: numpy in /srv/conda/envs/notebook/lib/python3.8/site-packages (1.19.1)
Requirement already satisfied: seaborn in /srv/conda/envs/notebook/lib/python3.8/site-packages (0.10.1)
Requirement already satisfied: pandas in /srv/conda/envs/notebook/lib/python3.8/site-packages (1.1.2)
Requirement already satisfied: matplotlib in /srv/conda/envs/notebook/lib/python3.8/site-packages (3.3.0)
Requirement already satisfied: scipy>=1.0.1 in /srv/conda/envs/notebook/lib/python3.8/site-packages (from seaborn) (1.5.2)
Requirement already satisfied: python-dateutil>=2.7.3 in /srv/conda/envs/notebook/lib/python3.8/site-packages (from pandas) (2.8.1)
Requirement already satisfied: pytz>=2017.2 in /srv/conda/envs/notebook/lib/python3.8/site-packages (from pandas) (2020.1)
Requirement already satisfied: kiwisolver>=1.0.1 in /srv/conda/envs/notebook/lib/python3.8/site-packages (from matplotlib) (1.2.0)
Requirement already satisfied: cycler>=0.10 in /srv/conda/envs/notebook/lib/python3.8/site-packages (from matplotlib) (0.10.0)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.3 in /srv/conda/envs/notebook/lib/python3.8/site-packages (from matplotlib) (2.4.7)
Requirement already satisfied: pillow>=6.2.0 in /srv/conda/envs/notebook/lib/python3.8/site-packages (from matplotlib) (7.2.0)
Requirement already satisfied: six>=1.5 in /srv/conda/envs/notebook/lib/python3.8/site-packages (from python-dateutil>=2.7.3->pandas) (1.15.0)
In [13]:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import jovian
%matplotlib inline
In [14]:
solar_raw_df = pd.read_csv(os.path.join(data_dir, "Plant_1_Generation_Data.csv"))

In [15]:
solar_raw_df

Out[15]:
In [16]:
solar_raw_df.info()


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 68778 entries, 0 to 68777
Data columns (total 7 columns):
 #   Column       Non-Null Count  Dtype
---  ------       --------------  -----
 0   DATE_TIME    68778 non-null  object
 1   PLANT_ID     68778 non-null  int64
 2   SOURCE_KEY   68778 non-null  object
 3   DC_POWER     68778 non-null  float64
 4   AC_POWER     68778 non-null  float64
 5   DAILY_YIELD  68778 non-null  float64
 6   TOTAL_YIELD  68778 non-null  float64
dtypes: float64(4), int64(1), object(2)
memory usage: 3.7+ MB
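The info() output reports DATE_TIME as object, meaning the timestamps are stored as plain strings. Time-based analysis usually needs a proper datetime column, so here is a minimal sketch of the conversion on a tiny hand-made frame. The sample values and the day-month-year format string are assumptions and should be checked against the actual file before use.

```python
import pandas as pd

# Hand-made sample rows; the timestamp layout is an assumption, not taken from the file
sample_df = pd.DataFrame({
    'DATE_TIME': ['15-05-2020 00:00', '15-05-2020 00:15', '16-05-2020 12:30'],
    'DAILY_YIELD': [0.0, 0.0, 3500.0],
})

# Parse the strings with an explicit format so day and month are not swapped
sample_df['DATE_TIME'] = pd.to_datetime(sample_df['DATE_TIME'], format='%d-%m-%Y %H:%M')

# Once parsed, the .dt accessor exposes date parts for grouping
sample_df['DATE'] = sample_df['DATE_TIME'].dt.date
print(sample_df.dtypes)
```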
In [17]:
solar_raw_df.describe()
Out[17]:
In [18]:
solar_raw_df.shape
Out[18]:
(68778, 7)
In [19]:
## Count the unique values in each column; nunique() ignores NaN values by default


solar_raw_df.nunique()
Out[19]:
DATE_TIME       3158
PLANT_ID           1
SOURCE_KEY        22
DC_POWER       32909
AC_POWER       32686
DAILY_YIELD    29900
TOTAL_YIELD    37267
dtype: int64
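These counts hint at gaps in the data: if all 22 SOURCE_KEY values had a row for each of the 3158 timestamps, there would be more rows than the 68778 reported by shape. The arithmetic below is a sketch that assumes each source reports at most once per timestamp, which the notebook does not verify.

```python
# Figures taken from solar_raw_df.shape and solar_raw_df.nunique() above
n_rows = 68778
n_timestamps = 3158
n_sources = 22

# Rows we would expect if every source reported at every timestamp
expected_rows = n_timestamps * n_sources
missing_rows = expected_rows - n_rows

print(f"Expected {expected_rows} rows, found {n_rows}: {missing_rows} readings short")
```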
In [20]:
## Check for null values with isnull(), then count them per column with sum()

solar_raw_df.isnull().sum()
Out[20]:
DATE_TIME      0
PLANT_ID       0
SOURCE_KEY     0
DC_POWER       0
AC_POWER       0
DAILY_YIELD    0
TOTAL_YIELD    0
dtype: int64
In [21]:
## Visual check of the same result: a uniform heatmap of the null mask means there are no null values

sns.heatmap(solar_raw_df.isnull())
Out[21]:
<AxesSubplot:>
Notebook Image
In [22]:
nf_df=solar_raw_df.copy()
In [23]:
nf_df
Out[23]:
In [24]:
## Drop the DC_POWER and AC_POWER columns from a copy, leaving the raw data untouched
nf_df = solar_raw_df.copy()

nf_df.drop(['DC_POWER', 'AC_POWER'], axis=1, inplace=True)

nf_df
Out[24]:
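The same column removal can also be written without inplace=True, in which case drop() returns a new DataFrame and leaves the original unchanged. A minimal sketch on a toy frame; the source names and values are made up for illustration.

```python
import pandas as pd

# Toy frame with the same column names as the generation data; values are made up
toy_df = pd.DataFrame({
    'SOURCE_KEY': ['inv_a', 'inv_b'],
    'DC_POWER': [100.0, 120.0],
    'AC_POWER': [9.8, 11.7],
    'DAILY_YIELD': [50.0, 60.0],
})

# drop() returns a new DataFrame by default, so no explicit copy() is needed
trimmed_df = toy_df.drop(columns=['DC_POWER', 'AC_POWER'])

print(list(trimmed_df.columns))
print(list(toy_df.columns))  # the original still has all four columns
```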
In [25]:
import jovian
In [26]:
jovian.commit()
[jovian] Attempting to save notebook..
[jovian] Updating notebook "akanshasaini888/solar-power-generation" on https://jovian.ml/
[jovian] Uploading notebook..
[jovian] Capturing environment..
[jovian] Committed successfully! https://jovian.ml/akanshasaini888/solar-power-generation

Exploratory Analysis and Visualization

In [27]:
import seaborn as sns
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline

sns.set_style('darkgrid')
matplotlib.rcParams['font.size'] = 14
matplotlib.rcParams['figure.figsize'] = (9, 5)
matplotlib.rcParams['figure.facecolor'] = '#00000000'


In [28]:
# With a single numeric variable, barplot draws one bar at the column mean
sns.barplot(x=solar_raw_df.DC_POWER)
Out[28]:
<AxesSubplot:xlabel='DC_POWER'>
Notebook Image
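A bar chart tends to say more when it compares a summary across categories. As a sketch of that idea, the snippet below computes the mean DC_POWER per SOURCE_KEY, which is the kind of table a per-inverter bar chart would be drawn from. The source names and power values are hypothetical toy data.

```python
import pandas as pd

# Toy readings for two hypothetical sources
toy_df = pd.DataFrame({
    'SOURCE_KEY': ['inv_a', 'inv_a', 'inv_b', 'inv_b'],
    'DC_POWER': [0.0, 200.0, 100.0, 500.0],
})

# Mean DC power per source, sorted so the strongest source comes first
mean_dc_by_source = (
    toy_df.groupby('SOURCE_KEY')['DC_POWER']
    .mean()
    .sort_values(ascending=False)
)
print(mean_dc_by_source)
```

The resulting Series could then be plotted, for example with its own `.plot(kind='bar')` method.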
In [29]:
nf_df['DAILY_YIELD'].value_counts()
Out[29]:
0.000000       18696
5803.000000       66
8435.000000       64
5965.000000       62
8273.000000       57
               ...  
5525.428571        1
1596.833333        1
2879.125000        1
2126.857143        1
76.250000          1
Name: DAILY_YIELD, Length: 29900, dtype: int64
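The counts show that 0.0 is by far the most common DAILY_YIELD value, with 18696 of the 68778 rows. A short sketch of how such a share can be computed with a boolean mask; the Series below is toy data, while the final ratio uses the counts reported above.

```python
import pandas as pd

# Toy DAILY_YIELD values; the real column has 68778 rows
daily_yield = pd.Series([0.0, 0.0, 5803.0, 76.25], name='DAILY_YIELD')

# Comparing to 0 gives a boolean Series; its mean is the fraction of True values
zero_share = (daily_yield == 0).mean()
print(f"{zero_share:.0%} of the toy readings are zero")

# The same ratio using the counts reported by the notebook
reported_share = 18696 / 68778
print(f"{reported_share:.1%} of the real readings are zero")
```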
In [30]:
# With a single numeric variable, barplot draws one bar at the column mean
sns.barplot(x=solar_raw_df.TOTAL_YIELD)
Out[30]:
<AxesSubplot:xlabel='TOTAL_YIELD'>
Notebook Image
In [31]:
sns.pairplot(nf_df);
Notebook Image
In [32]:
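# Restrict to numeric columns; newer pandas versions raise an error on string columns
nf_df.select_dtypes('number').corr()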

Out[32]:
In [33]:
plt.hist(nf_df.select_dtypes('number').corr());
Notebook Image
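A correlation matrix is more commonly read as a table or a heatmap than as a histogram. The sketch below builds one from toy data, keeping only numeric columns that actually vary, since a constant column such as the single-valued PLANT_ID has no defined correlation. All values and source names here are made up.

```python
import pandas as pd

toy_df = pd.DataFrame({
    'SOURCE_KEY': ['inv_a', 'inv_a', 'inv_b', 'inv_b'],
    'PLANT_ID': [1, 1, 1, 1],  # constant, like the single PLANT_ID in the real data
    'DAILY_YIELD': [0.0, 100.0, 200.0, 300.0],
    'TOTAL_YIELD': [1000.0, 1100.0, 1200.0, 1300.0],
})

# Keep numeric columns that actually vary; a constant column has no defined correlation
numeric_df = toy_df.select_dtypes('number')
varying_df = numeric_df.loc[:, numeric_df.nunique() > 1]

corr_matrix = varying_df.corr()
print(corr_matrix)
```

Passing a matrix like this to sns.heatmap, which the notebook already uses for the null check, would give a more readable picture of the relationships.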
In [34]:
import jovian
In [ ]:
jovian.commit()

Inferences and Conclusion

* In this notebook I have tried to analyze a solar power generation system.
* I chose this dataset from kaggle.com.
* While working on this project, I researched a lot of information about pandas and plots.
* While doing this project I realized that there is a lot more to learn, and I am excited to move forward in this journey of becoming a Data Analyst/Scientist.
* I referred to the pandas notebook and visualization notes from the Zero to Pandas course to finish this project, and I also did a lot of research to learn the concepts of pandas and numpy.

In [32]:
import jovian
In [ ]:
jovian.commit()