
Deep Learning Workbook

Source: https://github.com/aakashns/deep-learning-workbook

This Jupyter notebook outlines a universal blueprint that can be used to attack and solve any machine learning problem. It is based on the workflow described in the book Deep Learning with Python.

Usage Instructions

  1. Set up your dev environment with Jupyter, Tensorflow & Keras (or any other ML framework). Follow this guide if you wish to use a GPU on AWS.

  2. Download the latest version of the workbook using the command:

wget https://raw.githubusercontent.com/aakashns/deep-learning-workbook/master/deep-learning-workbook.ipynb

  3. Change the file name, title and kernel as desired. This notebook was originally written with the kernel conda:tensorflow_p36 on the AWS Deep Learning AMI.

  4. Follow the steps described below, filling in the blanks (marked as TODO).

  5. Once you're done building the final model, you can delete the cells containing instructions (like this one).

Step 1: Define the Problem & Collect Data

Define the problem and assemble a dataset:

  • What will your input data be? What are you trying to predict?
  • What type of problem are you facing?
    • Binary classification
    • Multi-class classification
    • Scalar regression
    • Vector regression
    • Multi-class, multi-label classification
    • Clustering
    • Generation
    • Reinforcement learning

Be aware of the hypotheses you are making at this stage:

  • You are hypothesizing that your outputs can be predicted given your inputs
  • You are hypothesizing that your available data is sufficiently informative to learn the relationship between inputs and outputs.

Remember that machine learning can only be used to memorize patterns that are present in your training data: the model can only recognize what it has seen before.

Answer the following questions to define your problem:

Q: What are you trying to predict?
A: TODO

Q: What will your input data be?
A: TODO

Q: What type of problem are you facing?
A: TODO

Q: What is the size of your dataset?
A: TODO

Step 2: Pick Success Metrics

To achieve success, you must first define what success means for your problem, e.g.:

  • Accuracy
  • Precision-Recall
  • Customer retention rate
  • ROC AUC

Tip: Browse Kaggle to find examples of problems and evaluation metrics. For a list of metrics supported by Keras, visit https://keras.io/metrics/.
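
As a minimal sketch (assuming a recent tf.keras; the single-layer model is purely illustrative), the chosen metrics are declared when compiling the model, so they are reported during training and evaluation:

```python
from tensorflow import keras

# A hypothetical one-layer model, just to show where metrics are declared.
model = keras.Sequential([keras.layers.Dense(1, activation="sigmoid")])

# Metrics are passed to compile(); unlike the loss, they are reported but
# not directly optimized.
model.compile(optimizer="rmsprop",
              loss="binary_crossentropy",
              metrics=["accuracy", keras.metrics.AUC(name="roc_auc")])
```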

Answer these questions to define your success metrics:

Q: What is your metric for success?
A: TODO (e.g. accuracy)

Q: What value of your success metric are you aiming for?
A: TODO (e.g. 95 %)

Step 3: Pick an Evaluation Protocol and Prepare Your Data

There are three common evaluation protocols:

  1. Maintaining a hold-out validation set: the way to go when you have plenty of data.
  2. Doing K-fold cross validation: use this when you have too few samples for hold-out validation to be reliable (see the sketch after this list).
  3. Doing iterated K-fold validation: for highly accurate model evaluation when very little data is available.
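
A minimal sketch of protocol 2, K-fold cross validation (assuming scikit-learn and NumPy; `build_model`, `x`, and `y` are hypothetical names for a model-building function and the data arrays, and the model is assumed to be compiled with a single metric):

```python
import numpy as np
from sklearn.model_selection import KFold

scores = []
for train_idx, val_idx in KFold(n_splits=4, shuffle=True).split(x):
    model = build_model()                     # build a fresh model for every fold
    model.fit(x[train_idx], y[train_idx], epochs=10, batch_size=128, verbose=0)
    _, metric = model.evaluate(x[val_idx], y[val_idx], verbose=0)
    scores.append(metric)

print("Mean validation score:", np.mean(scores))
```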

Prepare your data based on the evaluation protocol (a preprocessing sketch follows this list):

  • The data should be formatted as tensors
  • The values taken by the tensors should typically be scaled to small values, e.g. in the range [-1,1] or [0,1]
  • If different features take values in different ranges, then the data should be normalized.
  • Some feature engineering might be required for small data problems.
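
A minimal data-preparation sketch under these guidelines (assuming `x` and `y` are NumPy arrays; the 80/20 split and feature-wise standardization are illustrative choices):

```python
import numpy as np

perm = np.random.permutation(len(x))          # shuffle before splitting
x, y = x[perm], y[perm]

num_val = int(0.2 * len(x))                   # hold out 20% for validation
x_train, x_val = x[:-num_val], x[-num_val:]
y_train, y_val = y[:-num_val], y[-num_val:]

# Normalize each feature using statistics computed on the training data only.
mean, std = x_train.mean(axis=0), x_train.std(axis=0)
x_train = (x_train - mean) / std
x_val = (x_val - mean) / std
```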

Answer these questions to determine your evaluation protocol and prepare your data:

Q: What approach are you going to follow for validation?
A: TODO e.g. K-fold cross validation

Q: Does your data require reformatting (into tensors), normalization or scaling?
A: TODO

Q: What is the training/validation/test split?
A: TODO e.g. 50-25-25

Q: Can/should the data be randomized before splitting?
A: TODO

Q: Can you come up with new features using existing ones to make the problem easier?
A: TODO

In [5]:
# Implement Step 3 (load, prepare & split the data)
In [ ]:
 
In [ ]:
 
In [ ]:
 

Step 4: Develop the First Model

The first goal is to develop a model that is capable of beating a dumb baseline. There are 3 key choices to be made:

  1. Last-layer activation function
  2. Loss function
  3. Optimization configuration (generally, rmsprop is good enough).

| Problem Type | Last-layer Activation | Loss Function |
| --- | --- | --- |
| Binary classification | `sigmoid` | `binary_crossentropy` |
| Multi-class, single-label classification | `softmax` | `categorical_crossentropy` |
| Multi-class, multi-label classification | `sigmoid` | `binary_crossentropy` |
| Regression to arbitrary values | None | `mse` |
| Regression to values in `[0,1]` | `sigmoid` | `mse` or `binary_crossentropy` |
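
For instance, a first model for a binary classification problem might look like the sketch below (assuming tf.keras; `x_train`, `y_train`, `x_val`, and `y_val` are the arrays prepared in Step 3, and the layer sizes are illustrative):

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(16, activation="relu"),
    layers.Dense(16, activation="relu"),
    layers.Dense(1, activation="sigmoid"),    # last-layer activation from the table
])
model.compile(optimizer="rmsprop",            # rmsprop is usually a safe default
              loss="binary_crossentropy",     # loss function from the table
              metrics=["accuracy"])

history = model.fit(x_train, y_train,
                    epochs=20, batch_size=128,
                    validation_data=(x_val, y_val))
```
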
In [1]:
# TODO: Implement Step 4 here
In [ ]:
 
In [ ]:
 
In [ ]:
 

Step 5: Develop a Model That Overfits

The final objective is to find the balance between:

  • Optimization and generalization
  • Under-fitting and over-fitting
  • Under-capacity and over-capacity

To figure out how big a model is required, you must develop a model that overfits, using one or more of the following approaches:

  • Add layers
  • Make layers bigger
  • Train for more epochs

Plot the values of the loss function and the success metrics on the training and validation datasets to identify where the model starts over-fitting.
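
A minimal plotting sketch (assuming matplotlib and the `history` object returned by `model.fit` in the previous step):

```python
import matplotlib.pyplot as plt

loss = history.history["loss"]
val_loss = history.history["val_loss"]
epochs = range(1, len(loss) + 1)

plt.plot(epochs, loss, "bo-", label="Training loss")
plt.plot(epochs, val_loss, "ro-", label="Validation loss")
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.legend()
plt.show()
# The epoch where validation loss starts rising while training loss keeps
# falling is roughly where over-fitting begins.
```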

In [3]:
# TODO: Implement Step 5 Here
In [ ]:
 
In [ ]:
 
In [ ]:
 

Step 6: Regularize the Model and Tune the Hyperparameters

This part will take the most time. You will repeatedly modify your model, train it, evaluate on your validation data, modify it again... until your model is as good as it can get.

Here are some approaches for improving the model (a regularization sketch follows this list):

  • Add dropout
  • Try different architectures, add or remove layers
  • Add L1 / L2 regularization
  • Try different hyperparameters to find the optimal configuration, e.g.:
    • No. of units per layer
    • Learning rate of the optimizer
  • Iterate on feature engineering: add new features, remove features that do not seem informative
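
As a sketch of the first three ideas (assuming tf.keras and the binary classification model from Step 4; the dropout rate and L2 factor are illustrative), dropout and L2 weight regularization can be added like this:

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

model = keras.Sequential([
    layers.Dense(16, activation="relu",
                 kernel_regularizer=regularizers.l2(0.001)),
    layers.Dropout(0.5),                      # randomly zero 50% of activations
    layers.Dense(16, activation="relu",
                 kernel_regularizer=regularizers.l2(0.001)),
    layers.Dropout(0.5),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="rmsprop",
              loss="binary_crossentropy",
              metrics=["accuracy"])
```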

Be mindful of the following: every time you use feedback from your validation process to tune your model, you are leaking information about your validation set into the model. This can cause the model to overfit the validation data.

Once you have developed a seemingly good enough model configuration, you can train your final production model on all data available (training and validation) and evaluate it one last time on the test set. Finally, you can save your model to disk, so that it can be reused later.
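
A minimal sketch of that last step (assuming tf.keras; `x_train_full`/`y_train_full` are the combined training and validation data, `best_epochs` is the epoch count chosen from the curves in Step 5, and the file name is arbitrary):

```python
# Retrain the tuned model on all non-test data, then run one final test evaluation.
model.fit(x_train_full, y_train_full, epochs=best_epochs, batch_size=128)
test_loss, test_metric = model.evaluate(x_test, y_test)
print("Test loss:", test_loss, "Test metric:", test_metric)

model.save("final_model.h5")                  # save to disk for later reuse
# model = keras.models.load_model("final_model.h5")
```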

In [6]:
# TODO: Implement Step 6 here
In [ ]: