This is a beginner-friendly notebook that performs exploratory data analysis using graph visualizations.

We use a linear regression model to predict car prices. We then calculate the error percentage using the mean absolute error (MAE) and try to improve it by changing which inputs we feed to the model (feature selection).
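
To make the idea of comparing inputs by their mean absolute error concrete, here is a minimal, dependency-free sketch. Everything in it is an assumption for illustration: the numbers are made up rather than taken from the dataset, the helper names (`fit_line`, `mean_absolute_error`, `mae_for_feature`) are hypothetical, and expressing the error as a percentage of the mean price is only one possible reading of "error percentage". It also scores the fit on the same rows it was trained on, which a real workflow would avoid by holding out a test set.

```python
# Toy illustration only: the values below are invented, not read from the CSV files.

def fit_line(xs, ys):
    """Ordinary least squares for a single feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mae_for_feature(xs, ys):
    """Fit a one-feature line and return its MAE on the same data (no hold-out)."""
    slope, intercept = fit_line(xs, ys)
    predictions = [slope * x + intercept for x in xs]
    return mean_absolute_error(ys, predictions)


# Hypothetical rows using column names that appear in the dataset.
year = [2010, 2012, 2014, 2016, 2018]
km_driven = [90000, 40000, 75000, 30000, 60000]
selling_price = [150000, 250000, 350000, 450000, 550000]

mae_year = mae_for_feature(year, selling_price)
mae_km = mae_for_feature(km_driven, selling_price)

# Assumed definition of "error percentage": MAE relative to the mean price.
mean_price = sum(selling_price) / len(selling_price)
error_pct_year = mae_year / mean_price * 100
error_pct_km = mae_km / mean_price * 100

print(f"MAE using year:      {mae_year:.2f} ({error_pct_year:.2f}% of mean price)")
print(f"MAE using km_driven: {mae_km:.2f} ({error_pct_km:.2f}% of mean price)")
```

In this toy data the price is an exact linear function of `year`, so that feature gives a lower error than `km_driven`. Comparing errors across candidate inputs in this way is the basic intuition behind feature selection.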

/kaggle/input/vehicle-dataset-from-cardekho/Car details v3.csv
/kaggle/input/vehicle-dataset-from-cardekho/CAR DETAILS FROM CAR DEKHO.csv
/kaggle/input/vehicle-dataset-from-cardekho/car data.csv
import pandas as pd

# Load each of the three CSV files into its own DataFrame
df_cardekho = pd.read_csv("/kaggle/input/vehicle-dataset-from-cardekho/CAR DETAILS FROM CAR DEKHO.csv")
df_cardata = pd.read_csv("/kaggle/input/vehicle-dataset-from-cardekho/car data.csv")
df_cardetails = pd.read_csv("/kaggle/input/vehicle-dataset-from-cardekho/Car details v3.csv")
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4340 entries, 0 to 4339
Data columns (total 8 columns):
 #   Column         Non-Null Count  Dtype
---  ------         --------------  -----
 0   name           4340 non-null   object
 1   year           4340 non-null   int64
 2   selling_price  4340 non-null   int64
 3   km_driven      4340 non-null   int64
 4   fuel           4340 non-null   object
 5   seller_type    4340 non-null   object
 6   transmission   4340 non-null   object
 7   owner          4340 non-null   object
dtypes: int64(3), object(5)
memory usage: 271.4+ KB

The data from CAR DETAILS FROM CAR DEKHO.csv (df_cardekho) has no null values: each of the 8 columns has 4340 non-null entries, matching the 4340 rows.
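
The same check can be made explicit by counting missing values per column. The sketch below uses a small, made-up DataFrame (the `sample` frame and its values are hypothetical, with one missing value added on purpose) so that it runs without the Kaggle files. In the notebook the same calls would presumably be applied to `df_cardekho`.

```python
import pandas as pd

# Hypothetical stand-in for df_cardekho, with one deliberately missing year.
sample = pd.DataFrame({
    "name": ["Car A", "Car B", "Car C"],
    "year": [2014, 2016, None],
    "selling_price": [350000, 450000, 550000],
})

# Number of missing values in each column.
null_counts = sample.isnull().sum()

# True if any column contains at least one missing value.
has_nulls = bool(null_counts.any())

print(null_counts)
print("Any nulls:", has_nulls)
```

A column whose count is zero here corresponds to a column whose non-null count equals the number of rows in the `info()` output.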