Course Project on Exploratory Data Analysis - Discuss and Share Your Work

I also have the same question on uploading the file.



Please read the whole message in the pink region. There will be a warning somewhere in it. Warnings like this usually appear when a method is going to be deprecated in a future release, among other reasons.

If you add a semi-colon ‘;’ to the end of the last line of your code like this

sns.countplot(patients_copy_df.diagnosis) ;

then the warning will not show.

Try this; it may remove the warning in the pink region.
This normally happens when using the matplotlib and seaborn libraries.


You could add the following lines at the top of your notebook, where you import the libraries:

import warnings
warnings.filterwarnings('ignore')

The lines above will suppress not just this warning but every warning in the notebook.
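
If hiding every warning feels too broad, a narrower option is to silence just one category inside a `with` block. Below is a minimal sketch using only the standard `warnings` module. `noisy()` is a made-up stand-in for whatever plotting call raises the warning, and I am assuming the warning is a `FutureWarning`, which may not match your case.

```python
import warnings


def noisy():
    # Hypothetical stand-in for a library call that emits a deprecation-style warning.
    warnings.warn("this call style will change in a future release", FutureWarning)
    return "plot drawn"


# Option 1: ignore only FutureWarning, and only inside this block.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=FutureWarning)
    result = noisy()

# Option 2: record warnings instead of hiding them, to see what was raised.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    noisy()

print(result)
print([w.category.__name__ for w in caught])
```

Outside the `with` blocks the notebook's normal warning behaviour is restored, so other warnings still show up.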


I realize your post is 4 days old now and you most likely have an answer. But if not, give a try; they have quite a few datasets of manageable size.

Hi, even though I have written the following code, I still have to upload my CSV file every time I run Binder.

jovian.commit(project=project_name, environment=None, files=['Power generation India.csv'])
How can I save it in my notebook permanently?


I’ve just tested the files argument.


It seems this function replaces spaces with underscores. Users are persistent about using spaces in their filenames, which causes many problems and misunderstandings.


As mentioned above:

  1. Your file name shouldn’t contain spaces. Replace them with underscores (_ symbol).
  2. Make sure that you upload the files in the last call to jovian.commit(). Example:
jovian.commit(project='important', files=['file.csv'])
# some cells
# calculations
# graphs, plots, histograms
jovian.commit(project='important') # LAST COMMIT in the notebook

This is wrong because the second commit is called without the files argument, so that notebook version gets created without the file. To be safe, I would suggest doing this:

jovian.commit(project='important', files=['file.csv'])
# some cells
# calculations
# graphs, plots, histograms
jovian.commit(project='important', files=['file.csv']) # LAST COMMIT in the notebook
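
To make both points harder to get wrong, one option is to build the file list once and reuse it in every commit. This is a rough sketch, assuming the files live next to the notebook. `safe_name`, `rename_without_spaces` and `DATA_FILES` are names I made up for illustration, and the `jovian.commit` line is left commented out with only the arguments already used in this thread.

```python
from pathlib import Path


def safe_name(filename: str) -> str:
    """Return the file name with spaces replaced by underscores."""
    return filename.replace(" ", "_")


def rename_without_spaces(path: Path) -> Path:
    """Rename a file on disk so its name has no spaces; return the new path."""
    target = path.with_name(safe_name(path.name))
    if target != path:
        path.rename(target)
    return target


# Keep the list of data files in one place and pass the same list to every commit.
DATA_FILES = [safe_name("Power generation India.csv")]

# jovian.commit(project='important', files=DATA_FILES)  # same call in the first and the last commit
print(DATA_FILES)
```

With a single `DATA_FILES` list, the last commit cannot silently drop the file the way the first example does.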

Watch this video; it will help you.


Yes, exactly. I found that out after googling it. Anyway, thank you for your explanation.

Great work! I would recommend writing some explanations about the project in markdown cells.

Sharing is caring … 🙂

I took a look at the project. I think there is still room to improve and add more visualizations. Also, if I am not mistaken, the project should have a minimum of 5 visualizations?


Hi guys,

Please find my course project on automobile datasets! I have tried to implement most of the data analysis pieces here.

Feel free to ask questions or suggest changes. Your feedback is appreciated.


Hi Folks,
I am trying to set the values in one pandas DataFrame column based on the value of another column in the same row. I am using a for loop for this, as shown below. The DataFrame is called ‘df_new_employmenttype’. The column “Employment” already exists. I have added another column named “EmploymentType”, in which I am trying to put the value “Enthusiast” or “Professional” based on the value in the existing column “Employment”.
Is there any shortcut for this task?
My code with the for loop is below:

for i in range(0, df_new_employmenttype.shape[0]):  # go through each row
    emp = df_new_employmenttype.iloc[i, 0]  # variable for the existing column value
    if emp == 'Independent contractor, freelancer, or self-employed' or emp == 'Employed full-time' or emp == 'Employed part-time':
        df_new_employmenttype.iloc[i, 1] = 'Professional'
    if emp == 'Student' or emp == 'Not employed, but looking for work' or emp == 'Not employed, and not looking for work' or emp == 'Retired':
        df_new_employmenttype.iloc[i, 1] = 'Enthusiast'

To the global zerotopandas braintrust:

Does anyone have any tips on plotting a fixed categorical variable as a bar chart?
I just want to illustrate the count for each of the 2-3 possibilities that this variable stores. The variables are currently in string format.



I’m interested too. I don’t know for sure, but my intuition is that it may work the same way as uploading a data file. Perhaps some of you have tried this.


I may not have the answer for you, but I’m interested in your question.

How about replacing your code sns.countplot(patients_copy_df.diagnosis) with sns.countplot(x='diagnosis', data=patients_copy_df)? Will this resolve the issue?

Thanks in advance for testing this out for me.

Here is my solution to your question. It works, but I’m not sure it’s the best or right one.

Every time you commit, use your full commit statement again:
jovian.commit(project=project_name, environment=None, files=['Power generation India.csv'])

Or, at the very least, use the above commit statement the last time you commit before closing the notebook. For the commits in between, you can just do jovian.commit().


Ok!! Thank you 🙂

You don’t need a loop for row operations; pandas can apply a function to each value for you. To assign values to a column based on another column, you can call .apply() on the source column. Sorry, my English is bad, and I’m not very good at describing things. So please see my example below; I hope you get what I mean. Here is my edit of your code. Please check whether it works. Mind you, my code may not be error free.

def assignEmpType(emp):
    if emp in ('Independent contractor, freelancer, or self-employed', 'Employed full-time', 'Employed part-time'):
        emptype = 'Professional'
    elif emp in ('Student', 'Not employed, but looking for work', 'Not employed, and not looking for work', 'Retired'):
        emptype = 'Enthusiast'
    else:
        emptype = '??Err'  # this is just my practice to cover all possibilities to avoid 'unknown' outcome in future
    return emptype

df_new_employmenttype['EmploymentType'] = df_new_employmenttype['Employment'].apply(assignEmpType)
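
Another option that avoids writing a function is a dictionary lookup with `Series.map`. This is only a sketch on a small made-up frame (`df_demo` stands in for `df_new_employmenttype`), with the same '??Err' fallback for values that are not in the dictionary.

```python
import pandas as pd

# Lookup table from the existing Employment value to the new label.
EMPLOYMENT_TYPE = {
    'Independent contractor, freelancer, or self-employed': 'Professional',
    'Employed full-time': 'Professional',
    'Employed part-time': 'Professional',
    'Student': 'Enthusiast',
    'Not employed, but looking for work': 'Enthusiast',
    'Not employed, and not looking for work': 'Enthusiast',
    'Retired': 'Enthusiast',
}

# Small made-up frame standing in for df_new_employmenttype.
df_demo = pd.DataFrame(
    {'Employment': ['Student', 'Employed full-time', 'Retired', 'Other']}
)

# map() looks each value up in the dict; values missing from the dict become NaN,
# which fillna() then replaces with the fallback label.
df_demo['EmploymentType'] = (
    df_demo['Employment'].map(EMPLOYMENT_TYPE).fillna('??Err')
)
print(df_demo)
```

The dictionary keeps all the categories in one place, so adding a new employment value is a one-line change.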


The original dataset I was working with had 170 rows; however, some of the rows had ‘Nodata’ in all 11 columns for certain countries. I decided to drop these rows, but now the dataset has 147 rows. Will it now fail the acceptance criterion of 150 rows?
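
For anyone checking a similar drop, here is a minimal sketch of removing only the rows where every data column is 'Nodata'. The frame and column names (`df_demo`, `metric_1`, `metric_2`) are made up, and I am assuming the placeholder is the exact string 'Nodata'.

```python
import pandas as pd

# Made-up data: a country column plus two measurement columns.
df_demo = pd.DataFrame({
    'country': ['A', 'B', 'C'],
    'metric_1': ['1.5', 'Nodata', 'Nodata'],
    'metric_2': ['2.0', 'Nodata', '3.1'],
})
value_cols = ['metric_1', 'metric_2']

# Turn the 'Nodata' placeholder into missing values in the data columns only.
cleaned = df_demo.copy()
cleaned[value_cols] = cleaned[value_cols].mask(cleaned[value_cols].eq('Nodata'))

# Drop a row only when all of its data columns are missing.
cleaned = cleaned.dropna(subset=value_cols, how='all')

print(len(df_demo), 'rows before,', len(cleaned), 'rows after')
```

Rows with only some missing values are kept this way, which can help keep the row count up.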