Deadline: Aug 14, 11:59 PM GMT
In the course project, you will apply the machine learning skills covered in this course by training an ML model on a real-world dataset. Follow these steps to complete your project:
Pick a large real-world dataset from Kaggle (see the "Recommended Datasets" section below) and download it using opendatasets. Your training set should contain at least 50,000 rows and 5 columns of data.
Read the dataset description, understand the problem statement, and describe the modeling objective clearly. You can also browse existing notebooks created by others for inspiration.
Perform exploratory data analysis, gather insights about the data, perform feature engineering, create a training-validation split, and prepare the data for modeling.
Train and evaluate different machine learning models, tune hyperparameters, and reduce overfitting to improve the model.
Report the final performance of your best model(s), show sample predictions, and save model weights. Summarize your work, share links to references, and suggest ideas for future work.
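As a rough illustration of the modeling steps above, here is a minimal sketch of a training-validation split, a small hyperparameter comparison, sample predictions, and saving the best model. Everything specific in it is an assumption made for illustration and is not prescribed by the project: the synthetic data from make_classification (a small stand-in for your Kaggle dataset), the random forest model, the max_depth values, accuracy as the metric, and the file name best_model.joblib.

```python
import os
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a real dataset (assumption: binary classification).
X, y = make_classification(
    n_samples=5000, n_features=8, n_informative=5, random_state=42
)

# Hold out 20% of the rows as a validation set.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Compare a few values of one hyperparameter. Limiting tree depth is one
# common way to reduce overfitting; None lets the trees grow fully.
results = {}
for max_depth in (2, 6, None):
    model = RandomForestClassifier(
        n_estimators=50, max_depth=max_depth, random_state=42
    )
    model.fit(X_train, y_train)
    results[max_depth] = {
        "model": model,
        "train_acc": accuracy_score(y_train, model.predict(X_train)),
        "val_acc": accuracy_score(y_val, model.predict(X_val)),
    }
    print(
        f"max_depth={max_depth}: "
        f"train={results[max_depth]['train_acc']:.3f}, "
        f"val={results[max_depth]['val_acc']:.3f}"
    )

# Pick the model with the best validation accuracy.
best_depth = max(results, key=lambda d: results[d]["val_acc"])
best_model = results[best_depth]["model"]
print(f"Best max_depth: {best_depth}")

# Show a few sample predictions next to the true labels.
sample_preds = best_model.predict(X_val[:5])
print("Predicted:", sample_preds.tolist())
print("Actual:   ", y_val[:5].tolist())

# Save the trained model and confirm it can be loaded back.
model_path = os.path.join(tempfile.mkdtemp(), "best_model.joblib")
joblib.dump(best_model, model_path)
restored = joblib.load(model_path)
```

A large gap between the training and validation scores is a sign of overfitting. In your own notebook, the same pattern applies to whichever models and metrics suit your problem statement.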
There is no starter notebook for the course project. Please use the "New" button on Jovian to create a new notebook, "Run on Colab" to execute it, and jovian.commit to record versions. Please review the "Evaluation Criteria" and "Recommended Datasets" sections below carefully before starting your project.
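Since there is no starter notebook, the first few cells of a new notebook might look something like the sketch below. It wraps the opendatasets download and the jovian.commit call in small helpers and adds a check for the dataset size requirement. The helper names (download_dataset, meets_size_requirement, record_version) are hypothetical, and the two small data frames are made up only to demonstrate the check. The opendatasets and jovian libraries are imported inside the helpers, so the size check runs even where they are not installed.

```python
import pandas as pd

# Size requirement from the project steps above.
MIN_ROWS, MIN_COLS = 50_000, 5


def download_dataset(dataset_url, data_dir="."):
    """Download a Kaggle dataset (asks for Kaggle credentials if needed)."""
    import opendatasets as od  # install with: pip install opendatasets

    od.download(dataset_url, data_dir=data_dir)


def meets_size_requirement(df):
    """Return True if the data has at least MIN_ROWS rows and MIN_COLS columns."""
    n_rows, n_cols = df.shape
    return n_rows >= MIN_ROWS and n_cols >= MIN_COLS


def record_version(project_name):
    """Record a version of the current notebook on Jovian."""
    import jovian  # install with: pip install jovian

    jovian.commit(project=project_name)


# Made-up data frames, used only to demonstrate the size check.
small = pd.DataFrame({"a": range(10), "b": range(10)})
large = pd.DataFrame({f"col{i}": range(60_000) for i in range(6)})

print(meets_size_requirement(small))  # False: too few rows and columns
print(meets_size_requirement(large))  # True: 60,000 rows and 6 columns
```

In a notebook, you would call download_dataset with the URL of the Kaggle dataset you picked, load the training file with pd.read_csv, run the size check on it, and call record_version with your own project name whenever you want to save a version.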
Your submission must satisfy the following criteria:
Here are some project ideas to choose from:
NOTE: It is not compulsory to use one of the above datasets. You can select a dataset from any online source of your choice.