
Solar Power Generation Analysis Plant 2

The solar power generation and sensor data for two power plants was obtained from Kaggle.
The data was gathered at two solar power plants in India over a 34-day period. It consists of two pairs of files; each pair has one power generation dataset and one sensor readings dataset. The power generation datasets are gathered at the inverter level, where each inverter has multiple lines of solar panels attached to it. The sensor data is gathered at the plant level from a single array of sensors optimally placed at the plant.
This notebook carries out the analysis for Plant 2.

Task Details

1. Load the data from the CSV files
2. Explore each dataset - columns, counts, basic stats
3. Understand the domain context and explore underlying patterns in the data
4. Is there any missing data?
5. Pre-process the data on date and time.
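The loading, missing-data and date/time steps above could be sketched roughly as follows. This is only an illustration: the inline CSV is a tiny made-up sample standing in for the real generation file, and the column names (`DATE_TIME`, `SOURCE_KEY`, `DC_POWER`, `AC_POWER`, `DAILY_YIELD`) as well as the helper `load_and_preprocess` are assumptions, not taken from the task description.

```python
import io

import pandas as pd

# Tiny made-up sample standing in for the generation CSV.
# Column names are assumptions about the dataset layout.
SAMPLE_CSV = """DATE_TIME,SOURCE_KEY,DC_POWER,AC_POWER,DAILY_YIELD
2020-05-15 00:00:00,inv_A,0.0,0.0,0.0
2020-05-15 12:00:00,inv_A,900.0,880.0,2500.0
2020-05-15 12:00:00,inv_B,,790.0,2300.0
2020-05-16 12:00:00,inv_B,950.0,930.0,2700.0
"""


def load_and_preprocess(source):
    """Read a CSV and split its timestamp into separate DATE and TIME columns."""
    df = pd.read_csv(source)
    df["DATE_TIME"] = pd.to_datetime(df["DATE_TIME"])
    df["DATE"] = df["DATE_TIME"].dt.date
    df["TIME"] = df["DATE_TIME"].dt.time
    return df


# With the real data, a file path would be passed instead of the StringIO object.
gen = load_and_preprocess(io.StringIO(SAMPLE_CSV))

print(gen.shape)       # number of rows and columns
print(gen.dtypes)      # column types after date parsing
print(gen.describe())  # basic statistics

# Count missing values per column to answer "Is there any missing data?"
missing = gen.isna().sum()
print(missing)
```

Keeping `DATE` and `TIME` as separate columns makes it easy to group by day or by time of day later on.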

  Explore the data and answer the following questions:

  1. What is the mean value of daily yield?
  2. What is the total irradiation per day?
  3. What is the max ambient and module temperature?
  4. How many inverters are there for each plant?
  5. What is the maximum/minimum amount of DC/AC Power generated in a time interval/day?
  6. Which inverter (source_key) has produced maximum DC/AC power?
  7. Rank the inverters based on the DC/AC power they produce
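One possible way to approach the questions above with Pandas is sketched below. The two data frames hold made-up values for illustration only, and the column names are assumptions about the dataset. The mean daily yield here is the plain mean over all rows, which is just one reading of that question.

```python
import pandas as pd

# Made-up generation readings for two hypothetical inverters (illustration only).
gen = pd.DataFrame({
    "DATE_TIME": pd.to_datetime([
        "2020-05-15 12:00", "2020-05-15 12:00",
        "2020-05-16 12:00", "2020-05-16 12:00",
    ]),
    "SOURCE_KEY": ["inv_A", "inv_B", "inv_A", "inv_B"],
    "DC_POWER": [900.0, 800.0, 950.0, 700.0],
    "AC_POWER": [880.0, 780.0, 930.0, 690.0],
    "DAILY_YIELD": [2500.0, 2300.0, 2700.0, 2100.0],
})

# Made-up plant-level sensor readings (illustration only).
weather = pd.DataFrame({
    "DATE_TIME": pd.to_datetime([
        "2020-05-15 12:00", "2020-05-15 12:15", "2020-05-16 12:00",
    ]),
    "AMBIENT_TEMPERATURE": [30.0, 31.0, 29.0],
    "MODULE_TEMPERATURE": [45.0, 47.0, 44.0],
    "IRRADIATION": [0.5, 0.25, 0.75],
})

# 1. Mean value of daily yield (plain mean over all rows)
mean_daily_yield = gen["DAILY_YIELD"].mean()

# 2. Total irradiation per day
irradiation_per_day = weather.groupby(weather["DATE_TIME"].dt.date)["IRRADIATION"].sum()

# 3. Maximum ambient and module temperature
max_temps = weather[["AMBIENT_TEMPERATURE", "MODULE_TEMPERATURE"]].max()

# 4. Number of inverters
n_inverters = gen["SOURCE_KEY"].nunique()

# 5. Minimum and maximum DC/AC power in a single time interval
power_range = gen[["DC_POWER", "AC_POWER"]].agg(["min", "max"])

# 6. Inverter with the highest summed DC power
power_by_inverter = gen.groupby("SOURCE_KEY")[["DC_POWER", "AC_POWER"]].sum()
top_inverter = power_by_inverter["DC_POWER"].idxmax()

# 7. Rank inverters by summed DC power (1 = highest)
ranking = power_by_inverter["DC_POWER"].rank(ascending=False).astype(int)

print(mean_daily_yield, n_inverters, top_inverter)
print(irradiation_per_day)
print(max_temps)
print(power_range)
print(ranking)
```

Replacing `"DC_POWER"` with `"AC_POWER"` in steps 6 and 7 gives the same answers for AC power.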

Further tasks

  1. Create visualizations that help understand the data and underlying patterns.

  2. Start with graphs that explain the patterns for attributes independent of other variables.
     These will usually be tracked as changes of attributes against DATETIME, DATE, or TIME.
     Examples: How is DC or AC power changing as time goes by? How is irradiation changing as
     time goes by? How are ambient and module temperature changing as time goes by? How does
     yield change as time goes by? Explore plotting variables against different granularities
     of DATETIME and decide which is the best option for each variable.

  3. Plot two variables against each other to discover the degree of correlation between them.
     Try out different variable pairs: ambient and module temperature, DC and AC power,
     irradiation and module/ambient temperature, irradiation and DC/AC power. Can you find
     different ways of visualizing the above relationships?
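Before plotting, it can help to prepare the series at several time granularities and to compute pairwise correlations. The sketch below does that with Pandas and NumPy only. The readings are synthetic (a clipped sine curve standing in for daylight), and the column names are assumptions, so the perfect correlations it produces say nothing about the real plant.

```python
import numpy as np
import pandas as pd

# Synthetic 15-minute readings over two days (made-up values, assumed column names).
idx = pd.date_range("2020-05-15", periods=192, freq="15min")
hours = (idx.hour + idx.minute / 60).to_numpy()

# Simple daylight curve: zero at night, peaking at noon.
irradiation = np.clip(np.sin((hours - 6) / 12 * np.pi), 0, None)

df = pd.DataFrame(
    {
        "IRRADIATION": irradiation,
        "DC_POWER": 1000 * irradiation,
        "AMBIENT_TEMPERATURE": 25 + 5 * irradiation,
        "MODULE_TEMPERATURE": 25 + 20 * irradiation,
    },
    index=idx,
)

# Different granularities of the same variable against time
hourly = df["DC_POWER"].resample("60min").mean()              # hourly average
daily = df["DC_POWER"].resample("D").sum()                    # one value per day
by_time_of_day = df.groupby(df.index.time)["DC_POWER"].mean() # average daily profile

# Pairwise Pearson correlation between all variables
corr = df.corr()

print(hourly.head())
print(daily)
print(corr.round(2))
```

Each of these results can then be handed to Matplotlib or Seaborn, for example a line plot of `hourly` or a heatmap of `corr`.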

The following libraries will be used for data analysis and visualisation: Pandas, NumPy, Matplotlib and Seaborn.

About Course:

This project work is in partial fulfillment of the online certification course 'Zero to Pandas' (http://zerotopandas.com), which was provided by jovian.ml. The course started from the basic concepts of Python and NumPy, moved on to data analysis with Pandas and plotting tools like Matplotlib and Seaborn, and showed how to apply this knowledge to analyse real-world data. The video lectures were very good and focused on the course objectives. The instructor's presentations were neat, with smooth transitions between topics. In fact, this is one of the best online courses I have come across in terms of meeting its objectives, and it helped me a lot in building my foundation in data analysis.
Thanks to Aakash N S, team Jovian, and freeCodeCamp (http://freecodecamp.org).

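# Project name used when committing this notebook to Jovian
project_name = "Solar_Power_Generation_Analysis Plant 2"
# Install or upgrade the jovian client quietly (Jupyter shell command)
!pip install jovian --upgrade -q
import jovian
# Save a snapshot of the notebook under the project name
jovian.commit(project=project_name)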