How to deal with missing values (NaN)?
NaN just means the value is empty or missing, that's it.
I know this but my question is more related to statistics.
- A Simple Option: Drop Columns with Missing Values. If your data is in a DataFrame called original_data, you can drop the columns that contain missing values.
- A Better Option: Imputation. Imputation fills in the missing value with some number. …
- An Extension To Imputation.
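A rough pandas sketch of the three options above. The sample values and column names are made up for illustration, and the `_was_missing` indicator columns are only one possible reading of the "extension" idea, not necessarily what the original list had in mind.

```python
import numpy as np
import pandas as pd

# Hypothetical data; column names and values are invented for this example.
original_data = pd.DataFrame({
    "rooms": [3.0, np.nan, 4.0, 2.0],
    "area": [70.0, 85.0, np.nan, 50.0],
    "price": [200, 250, 310, 150],
})

# Simple option: drop every column that contains at least one NaN.
dropped = original_data.dropna(axis=1)

# Better option: impute each NaN with the mean of its own column.
imputed = original_data.fillna(original_data.mean())

# Extension (assumed interpretation): impute, but also keep a boolean
# column recording which values were missing in the first place.
extended = imputed.copy()
cols_with_nan = [c for c in original_data.columns if original_data[c].isna().any()]
for col in cols_with_nan:
    extended[col + "_was_missing"] = original_data[col].isna()

print(dropped.columns.tolist())
print(extended)
```

Dropping columns is the cheapest option but throws away every non-missing value in those columns too, which is why imputation is usually preferred when only a few entries are missing.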
Use df.dropna() if you want to eliminate the rows with missing values from the analysis entirely, or df.fillna(df.mean()) if you would rather replace the missing values with something like the column mean and keep the rows for analysis.
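Here is a small sketch of those two calls side by side, on a made-up DataFrame (the columns `a` and `b` are just placeholders):

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one NaN in each column.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, np.nan]})

# Drop every row that has at least one NaN.
rows_dropped = df.dropna()

# Keep all rows; replace each NaN with the mean of its column.
mean_filled = df.fillna(df.mean())

print(len(df), "rows before,", len(rows_dropped), "rows after dropna()")
print(mean_filled)
```

Note that neither call modifies df in place; both return a new DataFrame.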
So you want to ask whether NaN will be treated as 0? For example, what happens if we sum the values? Or, more importantly, what happens when we average the values, since the average also depends on the divisor (the number of values counted)? If you don't use something like np.nanmean, the NaN is taken into account: with plain np.mean it is not treated as 0, and the result is itself NaN.
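To make the difference concrete, here is a minimal sketch with a made-up array. It compares the plain NumPy functions, the NaN-aware ones, and what you would get if NaN really were treated as 0:

```python
import numpy as np
import pandas as pd

# Hypothetical values with one missing entry.
values = np.array([1.0, 2.0, np.nan, 3.0])

# Plain NumPy: NaN propagates, so both results are NaN.
plain_sum = np.sum(values)
plain_mean = np.mean(values)

# NaN-aware versions: the NaN is skipped, so the mean divides by 3.
nan_sum = np.nansum(values)
nan_mean = np.nanmean(values)

# Treating NaN as 0 instead: the sum is the same, but the mean divides by 4.
as_zero = np.where(np.isnan(values), 0.0, values)
zero_mean = np.mean(as_zero)

# pandas skips NaN by default (skipna=True), like np.nanmean.
pandas_mean = pd.Series(values).mean()

print(plain_sum, plain_mean, nan_sum, nan_mean, zero_mean, pandas_mean)
```

So skipping NaN and replacing it with 0 give the same sum but different averages, which is exactly the divisor issue raised above.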
If this was indeed a data entry error, we can use one of the following approaches for dealing with the missing or faulty value:
To deal with NaN values, choose the approach that suits your data:
- Replace it with
- Replace it with the average of the entire column
- Replace it with the average of the values on the previous & next date
- Discard the row entirely
Which approach you pick requires some context about the data and the problem.
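As a hedged illustration of the column-average, neighbouring-dates, and discard options from the list, here is a sketch on a made-up daily series. The dates and values are invented, and the shift-based neighbour average is just one way to do it:

```python
import numpy as np
import pandas as pd

# Hypothetical daily series with one missing entry on the third day.
dates = pd.date_range("2021-01-01", periods=5, freq="D")
s = pd.Series([10.0, 12.0, np.nan, 20.0, 30.0], index=dates)

# Replace with the average of the entire column (NaN is skipped by mean()).
by_column_mean = s.fillna(s.mean())

# Replace with the average of the values on the previous and next date.
# If a neighbour is also NaN, the result stays NaN.
neighbour_avg = (s.shift(1) + s.shift(-1)) / 2
by_neighbours = s.fillna(neighbour_avg)

# Discard the row entirely.
row_discarded = s.dropna()

print(by_column_mean.iloc[2], by_neighbours.iloc[2], len(row_discarded))
```

For data that changes smoothly over time, the neighbour average often stays closer to the surrounding trend than the overall mean does, but as noted above, the right choice depends on the data.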
hope this will help!!