
For my project, I'm doing the Default Final Project of CS 224N:

Question Answering on SQuAD 2.0


There are several reasons:

  • In the future (maybe as project no. 2?) I want to build a model that can generate question-answer pairs from an input paragraph.

    • By doing this Question Answering project, I think I will get a more intuitive understanding of NLP concepts
    • I hope I can re-use some of the modules from this project in the future
  • In this project, students are given a baseline model along with some other code that I don't often see in tutorials or courses, which includes:

    • logging experiments in TensorBoard
    • saving checkpoints along the way (this is HUGE for me, because I will be using the GPU in my local machine, so I can stop training and restart it later)
    • providing scripts for training and testing
    • basically showing good coding practice
    • I read the slides from the presentation Writing Code for NLP Research, and this project "checks" most of the boxes
  • I think the practices mentioned above will be useful for all my future projects.

  • It provides the SQuAD 2.0 leaderboard which, kind of like Kaggle, can guide me by telling me whether I'm going in a good direction.
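
To make the checkpointing point above concrete, here is a minimal sketch of the stop-and-resume idea. It is not the starter code's implementation: the function names, the pickle file format, and the toy "weight" update are all assumptions made for illustration. In a PyTorch project the saved state would typically be the model's and optimizer's `state_dict()`, written with `torch.save`.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, step, state):
    """Write to a temp file first, then rename it into place, so an
    interrupted save never leaves a half-written file at `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump({"step": step, "state": state}, f)
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return (step, state), or (0, None) if no checkpoint exists yet."""
    if not os.path.exists(path):
        return 0, None
    with open(path, "rb") as f:
        ckpt = pickle.load(f)
    return ckpt["step"], ckpt["state"]


def train(path, total_steps, save_every):
    """Toy training loop (hypothetical) that resumes from the last checkpoint."""
    step, state = load_checkpoint(path)
    if state is None:
        state = {"weight": 0.0}  # fresh start
    while step < total_steps:
        state["weight"] += 0.1  # stand-in for one optimizer update
        step += 1
        if step % save_every == 0:
            save_checkpoint(path, step, state)
    return step, state
```

If training is stopped between two saves, the next call to `train` picks up from the last saved step, so at most `save_every` steps of work are lost.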

What I already did:

  • I installed the environment for this task and downloaded the code.
  • The first obstacle was being able to train the baseline on my local machine. I got an out-of-memory error, but for RAM, not the GPU. The embeddings created using GloVe were too big for my PC (I have 16 GB of RAM). So I looked at the pre-processing step and found that changing the parameter responsible for "Max number of words in a paragraph" from 400 to 200 lets me train the baseline model on my machine. I plan to do a lot of experiments on my local machine; then, when I'm ready, I'll re-train the model in the cloud without these restrictions.
  • I read about the metrics used in SQuAD, and what is being judged.
  • I watched the lecture dedicated to Question Answering: Stanford CS224N: NLP with Deep Learning | Winter 2019 | Lecture 10 – Question Answering
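
The two SQuAD metrics mentioned above are Exact Match (EM) and F1. Below is a minimal sketch of how they can be computed for one prediction and one reference answer. It follows the normalization I understand the official evaluation to use (lowercase, strip punctuation, drop the articles a/an/the, collapse whitespace), but the function names are my own and this is not the official evaluation script.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, strip punctuation, drop English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, truth):
    """1 if the normalized strings are identical, else 0."""
    return int(normalize_answer(prediction) == normalize_answer(truth))


def f1_score(prediction, truth):
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    true_tokens = normalize_answer(truth).split()
    # No-answer case (SQuAD 2.0): full credit only if both sides are empty.
    if not pred_tokens or not true_tokens:
        return float(pred_tokens == true_tokens)
    common = Counter(pred_tokens) & Counter(true_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, the prediction "in Paris France" against the reference "Paris" gets EM = 0 but F1 = 0.5 (precision 1/3, recall 1). When a question has several reference answers, the score for that question is the maximum over the references.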


The baseline model is based on a model from 2017 (BiDAF), so there is a lot of room for improvement.

As for what to do after adapting BERT: there are several things I can do to improve the model, mentioned in the handout provided by the CS 224N staff:

Pre-trained Contextual Embeddings (PCE), aka ELMo & BERT

  • ELMo
  • BERT

Non-PCE Model Types

  • Character-level Embeddings
  • Self-attention
  • Transformers
  • Transformer-XL
  • Additional input features

More models and papers

  • Regularization
  • Sharing weights
  • Word vectors
  • Combining forward and backward states
  • Types of RNN
  • Model size and number of layers.
  • Optimization algorithms
  • Ensembling
  • Parameters - Experiment

If I manage to get a good model in 4 weeks, during the last week I'll try to fine-tune the model on a different dataset (transfer learning), or I'll try to build an API around it and deploy it on a server (not sure if building an API is doable in one week).

In [1]:
!pip install jovian -q
In [2]:
import jovian