Lecture 6: Exploratory Data Analysis - A Case Study

You need to mark the lessons you have finished as complete. Once your assignments have been graded as a pass, they are marked complete automatically. Your progress percentage will then be displayed.

What if I only want to select certain rows?


I have downloaded two CSV files from Kaggle to my computer.
I can upload the two files to my Jupyter notebook in Binder, but it gets
tiring uploading them every time I open a new notebook to start my work.
Is there any way to avoid this?


You can do that with survey_df.loc[15:32]
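
To illustrate what a slice like this returns, here is a minimal sketch. The survey_df below is a made-up stand-in with a default 0..49 index (the real survey data is not reproduced here), and the column names are assumptions. Note that .loc slices by label and includes both endpoints, whereas .iloc slices by position and excludes the end.

```python
import pandas as pd

# Hypothetical stand-in for survey_df; the real columns will differ.
survey_df = pd.DataFrame({"Age": range(20, 70), "Country": ["X"] * 50})

# .loc slices by index label and includes BOTH endpoints: labels 15..32
by_label = survey_df.loc[15:32]

# .iloc slices by integer position and excludes the end: positions 15..31
by_position = survey_df.iloc[15:32]

print(len(by_label), len(by_position))
```

With a default integer index the two results differ by exactly one row, because only .loc keeps the row labelled 32.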

How much data pruning can be carried out without affecting the explainability of the ML results?

My dataset has over 64,000 rows, and I need to drop the last 500 rows with elements from column 2.
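
One possible reading of this question is that the last 500 rows should be removed entirely. Under that assumption, a rough sketch could look like the following; the DataFrame and its column names are invented for illustration, and the row count is rounded to 64,000.

```python
import pandas as pd

# Hypothetical data standing in for the real dataset.
df = pd.DataFrame({"col1": range(64000), "col2": range(64000)})

# Keep everything except the last 500 rows (purely positional).
trimmed = df.iloc[:-500]

# Equivalent alternative when the index labels are unique:
# take the labels of the last 500 rows and drop them.
trimmed_alt = df.drop(df.tail(500).index)

print(len(df), len(trimmed))
```

If only rows meeting some condition on the second column should go, a boolean filter on that column would be needed instead; the question does not say which is intended.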

Actually, I am not sure of the advantages of using loc.

There should be a few ways to do this. The simplest I could work out now is

If you add a condition to the col_to_select as well, I have noticed that the index is not reset after filtering, so rows 500:600 may no longer contain 100 rows.
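
A small sketch of this effect, using made-up data (the column name "value" and the even-number filter are assumptions chosen only to make the counts easy to check): after a boolean filter the original index labels are kept, so a label-based slice can return fewer rows than expected. Slicing by position with .iloc, or renumbering with reset_index, are two common ways around it.

```python
import pandas as pd

# Hypothetical data with the default index 0..1999.
df = pd.DataFrame({"value": range(2000)})

# Filter with a condition; surviving rows keep their ORIGINAL labels.
filtered = df[df["value"] % 2 == 0]

# Label-based slice: only the even labels between 500 and 600 still exist.
by_label = filtered.loc[500:600]

# Position-based slice: always the 501st to 600th remaining rows.
by_position = filtered.iloc[500:600]

# Alternatively, renumber the rows so labels match positions again.
renumbered = filtered.reset_index(drop=True)
by_new_label = renumbered.loc[500:599]  # .loc includes both endpoints

print(len(by_label), len(by_position), len(by_new_label))
```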

