Project - Train a Deep Learning Model from Scratch

Deep Learning with PyTorch: Zero to GANs

For the course project, you will pick a dataset of your choice and apply the concepts learned in this course to train deep learning models end-to-end with PyTorch, experimenting with different hyperparameters & metrics.

  1. Find a large dataset (2000+ samples) online (see "Where to Find Datasets" below)
  2. Understand and describe the modeling objective clearly
    1. What type of data is it? (images, text, audio, etc.)
    2. What type of problem is it? (regression, classification, generative modeling, etc.)
  3. Clean the data if required and perform exploratory analysis (plot graphs, ask questions)
  4. Modeling
    1. Define a model (network architecture)
    2. Pick some hyperparameters
    3. Train the model
    4. Make predictions on samples
    5. Evaluate on the test dataset
    6. Save the model weights
    7. Try different hyperparameters & regularization
  5. Conclusions - summarize your learning & identify opportunities for future work
  6. Publish and submit your Jupyter notebook
  7. (Optional) Write a blog post to describe your experiments and summarize your work. Use Medium or GitHub Pages.
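
As a rough illustration of the modeling steps (4.1 to 4.6), here is a minimal sketch in PyTorch. It is not a starter notebook: the random placeholder data, the layer sizes, the hyperparameter values, and the file name `project_model.pth` are all assumptions made for the example and should be replaced with choices that suit your own dataset and objective.

```python
import os
import tempfile

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset, random_split

torch.manual_seed(0)

# Placeholder data (assumption): 2000 samples, 20 features, 3 classes.
# Replace this with your own cleaned dataset.
features = torch.randn(2000, 20)
labels = torch.randint(0, 3, (2000,))
dataset = TensorDataset(features, labels)
train_ds, val_ds, test_ds = random_split(dataset, [1400, 300, 300])

# Step 4.2: pick some hyperparameters (illustrative values only)
batch_size = 64
learning_rate = 1e-3
num_epochs = 3
hidden_size = 32

train_dl = DataLoader(train_ds, batch_size=batch_size, shuffle=True)
val_dl = DataLoader(val_ds, batch_size=batch_size)
test_dl = DataLoader(test_ds, batch_size=batch_size)

# Step 4.1: define a model (a small feed-forward classifier)
model = nn.Sequential(
    nn.Linear(20, hidden_size),
    nn.ReLU(),
    nn.Linear(hidden_size, 3),
)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)


def evaluate(model, loader):
    """Return (mean loss, accuracy) over all batches in a data loader."""
    model.eval()
    total_loss, correct, count = 0.0, 0, 0
    with torch.no_grad():
        for xb, yb in loader:
            out = model(xb)
            total_loss += loss_fn(out, yb).item() * len(yb)
            correct += (out.argmax(dim=1) == yb).sum().item()
            count += len(yb)
    return total_loss / count, correct / count


# Step 4.3: train the model, tracking validation metrics after each epoch
history = []
for epoch in range(num_epochs):
    model.train()
    for xb, yb in train_dl:
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()
    val_loss, val_acc = evaluate(model, val_dl)
    history.append((val_loss, val_acc))
    print(f"epoch {epoch + 1}: val_loss={val_loss:.4f}, val_acc={val_acc:.4f}")

# Step 4.4: make a prediction on a single sample
model.eval()
with torch.no_grad():
    sample_x, sample_y = test_ds[0]
    predicted_class = model(sample_x.unsqueeze(0)).argmax(dim=1).item()
print("predicted:", predicted_class, "actual:", sample_y.item())

# Step 4.5: evaluate on the held-out test set
test_loss, test_acc = evaluate(model, test_dl)
print(f"test_loss={test_loss:.4f}, test_acc={test_acc:.4f}")

# Step 4.6: save the model weights (path is a placeholder)
weights_path = os.path.join(tempfile.gettempdir(), "project_model.pth")
torch.save(model.state_dict(), weights_path)
```

For step 4.7, one option is to rerun the same loop with different values of `learning_rate`, `batch_size`, or `hidden_size`, and to add regularization, for example an `nn.Dropout` layer in the model or the `weight_decay` argument of the optimizer, comparing the validation metrics recorded in `history` across runs. Because the placeholder labels are random, the accuracy printed here is not meaningful.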

NOTE: There is no starter notebook for the course project. Please create a new notebook on Google Colab, make it public, and submit the link to it at the bottom of this page.

Use the following sources to find interesting datasets:

Indian stocks data

Indian Air Quality Data

Indian Covid-19 Dataset

World Covid-19 Dataset

USA Covid-19 Dataset

Megapixels Dataset for Face Detection, GANs, Human Localization

Agriculture based dataset

India Digital Payments UPI

India Consumption of LPG

India Import/Export Crude Oil

US Unemployment Rate Data

India Road accident Data

Data science Jobs Data

H-1B Visa Data

Donald Trump’s Tweets

Hillary Clinton and Trump’s Tweets

Asteroid Dataset

Solar flares Data

Human face generation GANs

F-1 Race Data

Automobile Insurance

PUBG

CS GO

Dota 2

Cricket

Basketball

Football

Google Colab Public Notebook Link (Required)
Blog Link (Optional)
You can submit multiple times. Only your last submission will be evaluated.

Evaluation Criteria

  • The submitted Google Colab notebooks must be publicly accessible
  • The dataset should have at least 2000 samples (training + validation)
  • The trained model must yield good results on the chosen evaluation metric
  • The project must be documented properly using Markdown cells
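
The dataset-size criterion can be checked in the notebook before any training starts. The sketch below assumes a placeholder `TensorDataset` and an 80/20 training/validation split; the split ratio, the seed, and the name `MIN_SAMPLES` are illustrative choices, not course requirements.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Placeholder dataset (assumption): swap in your own Dataset object.
dataset = TensorDataset(torch.randn(2500, 8), torch.randint(0, 2, (2500,)))

MIN_SAMPLES = 2000  # minimum size (training + validation) from the criteria
num_samples = len(dataset)
if num_samples < MIN_SAMPLES:
    raise ValueError(f"Dataset has {num_samples} samples; need at least {MIN_SAMPLES}")

# Split into training and validation sets with a fixed seed so the
# split is reproducible across notebook runs.
val_size = int(0.2 * num_samples)
train_size = num_samples - val_size
generator = torch.Generator().manual_seed(42)
train_ds, val_ds = random_split(dataset, [train_size, val_size], generator=generator)

print(f"training samples: {len(train_ds)}, validation samples: {len(val_ds)}")
```

Printing the split sizes in a notebook cell, with a short Markdown note above it, also helps with the documentation criterion.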