Analysis of the Chernobyl Disaster's Influence on Air Contamination

Anderson Alves

September, 2020

This project was done as part of the course Data Analysis with Python: Zero to Pandas, taught by Aakash N. S. and hosted on Jovian.ml.

Context

The Chernobyl disaster began on April 26, 1986, when a reactor at the Chernobyl nuclear power plant caught fire. The fire lasted 10 days and resulted in an unprecedented release of radioactive material from a nuclear reactor. The power plant is located 100 km from Kiev (Ukraine), but the effects of the accident reached far beyond the country's borders.

The three most affected countries were Belarus, the Russian Federation, and Ukraine, but the accident's consequences were not limited to those territories. Several other European countries were affected as well, due to a massive atmospheric transfer of radioactive material. To this day, the real impact of the event remains a matter of controversy.

The assessment of radionuclide intake with food and drinking water was based primarily on several measurements of I-131, Cs-134, and Cs-137 performed all over Europe [1]. These are the radioisotopes reported in this dataset.

Dataset Content

This dataset presents the concentrations of Iodine-131 (I-131), Caesium-134 (Cs-134), and Caesium-137 (Cs-137) as aerosol particles, each measured at a specific location and date. The following information is given in each column:

country
country code
locality name
latitude (degrees.hundredths of degrees)
longitude (degrees.hundredths of degrees)
date (year/month/day)
hour of end of sampling (hours:minutes)
duration (hours.minutes)
I-131 concentration in Bq/m3 (aerosol particles)
Cs-134 concentration in Bq/m3 (aerosol particles)
Cs-137 concentration in Bq/m3 (aerosol particles)
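
The date and duration formats listed above need some care when parsed. The following is a minimal sketch, assuming that dates are stored with a two-digit year (for example "86/04/27") and that a duration such as 1.30 means 1 hour and 30 minutes. The sample rows and the helper `duration_to_hours` are hypothetical and only illustrate the idea.

```python
import pandas as pd

def duration_to_hours(value: float) -> float:
    """Convert an 'hours.minutes' value (e.g. 1.30 = 1 h 30 min) to decimal hours."""
    hours = int(value)
    minutes = round((value - hours) * 100)
    return hours + minutes / 60

# Hypothetical sample rows; the values are illustrative, not taken from the dataset.
sample = pd.DataFrame({
    "date": ["86/04/27", "86/05/02"],
    "duration": [24.0, 1.30],
})

# Assumption: two-digit year, so "86/04/27" is read as 27 April 1986.
sample["date"] = pd.to_datetime(sample["date"], format="%y/%m/%d")

# Convert 'hours.minutes' into decimal hours.
sample["duration_hours"] = sample["duration"].apply(duration_to_hours)
```

If the file turns out to store four-digit years, the format string would be "%Y/%m/%d" instead.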

Acknowledgements

The dataset was extracted from the REM data bank at the CEC Joint Research Centre Ispra. The data was downloaded from Kaggle and is also available from the JRC Directorate for Nuclear Safety and Security.

I - Data Preparation and Cleaning

In this first section, we clean the data by removing non-relevant columns and correcting or removing wrong values. We also prepare the dataset for the analysis in the following sections by handling missing and invalid values.

# import pandas and load the dataset file
import pandas as pd
main_df = pd.read_csv('chernobyl.csv')

# renaming columns and removing the ones that won't be used in the analysis
main_df.columns = ['country', 'country_code', 'locality', 'latitude', 'longitude', 'date', 'end_time', 'duration', 'iodine131', 'caesium134', 'caesium137']
main_df = main_df.drop(columns=['end_time', 'duration'])

# checking main dataframe
main_df.head()
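
One possible way to handle missing and invalid values, as described at the start of this section, is to coerce the columns to their expected types so that anything unparseable becomes NaN or NaT. The sketch below uses a small hypothetical DataFrame with the renamed columns. The "N" marker and the two-digit-year date format are assumptions made for illustration, not documented properties of the dataset.

```python
import pandas as pd

# Hypothetical rows using the renamed columns; "N" stands in for any
# non-numeric marker and is an assumption, not a documented code.
toy_df = pd.DataFrame({
    "locality": ["A", "B", "C"],
    "date": ["86/04/28", "86/04/29", "not a date"],
    "iodine131": ["1.2", "N", "0.5"],
    "caesium134": ["0.3", "0.1", None],
    "caesium137": ["0.6", "0.2", "0.4"],
})

isotopes = ["iodine131", "caesium134", "caesium137"]

# Coerce concentrations to floats; anything unparseable becomes NaN.
toy_df[isotopes] = toy_df[isotopes].apply(pd.to_numeric, errors="coerce")

# Coerce dates the same way; unparseable entries become NaT.
toy_df["date"] = pd.to_datetime(toy_df["date"], format="%y/%m/%d", errors="coerce")

# Drop rows without a valid date; NaN concentrations are kept for later handling.
clean_df = toy_df.dropna(subset=["date"]).reset_index(drop=True)
```

Keeping NaN concentrations rather than dropping those rows is a design choice: a sample may lack one isotope reading and still carry valid values for the other two.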