Documentation and Storytelling with Data Science Projects

This tutorial attempts to answer the following questions:

  • What is documentation, and why is it important?
  • Is documenting a project really worth the time?
  • What is the right way to document a data science project?
  • What is storytelling, and how does it apply to data science projects?
  • How do you add images or screenshots to a data science project?
  • What is code quality, and why is it important?
  • How do you write high-quality code for a data science project?
  • How do you convert experimental code into functions?
  • What is the right way to end a project?
  • When is a project finished? There's always more to do!
  • What should you do if a project is taking too long?

Practice with your own project, or use this one: https://jovian.ai/aakashns/web-scraping-project-rough

Going From a Rough Notebook to a Final Notebook

It's common practice to start with a rough notebook where you can experiment with the code and get the project into a working state. Once you've implemented all the functionality, it's a good idea to start a fresh "final" notebook that removes the experimental and messy code and replaces it with clear explanations and high-quality code.

Step 1: Notebook Title, Introduction and Outline

  • Come up with an interesting title that informs and intrigues the reader
  • Introduce the topic and the dataset and explain why this is a project worth doing
  • Provide a step-by-step outline of the project to set the right expectations

Step 2: Section Headings, Descriptions & Images

  • Add section & sub-section Markdown headings matching the outline
  • Add a description at the beginning and summary at the end of each section
  • Add images, screenshots and illustrations wherever possible

Step 3: Reusable functions for each section

  • At the end of each section, create reusable function(s) to capture the functionality
  • Ensure that functions have proper inputs & outputs, and don't use global variables
  • Avoid carrying over too many variables or outputs from section to section
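
As a rough illustration of Step 3, suppose the rough notebook had a few cells that cleaned a list of scraped titles and counted their words using global variables. The sketch below shows one way that logic could be captured in functions with explicit inputs and outputs. The function names (`clean_titles`, `count_words`, `summarize_titles`) and the sample data are hypothetical and are not taken from the linked project.

```python
# Rough-notebook style (hypothetical): each cell reads and writes globals
#   cleaned = []
#   for t in raw_titles:
#       cleaned.append(t.strip())

def clean_titles(raw_titles):
    """Return a new list of titles with whitespace trimmed and blanks dropped."""
    cleaned = []
    for title in raw_titles:
        title = title.strip()
        if title:  # skip entries that are empty after trimming
            cleaned.append(title)
    return cleaned


def count_words(titles):
    """Map each title to the number of whitespace-separated words it contains."""
    return {title: len(title.split()) for title in titles}


def summarize_titles(raw_titles):
    """Run the whole section in one call: clean the titles, then count words."""
    titles = clean_titles(raw_titles)
    return count_words(titles)


# Usage: everything the function needs is passed in, nothing is read from globals
sample_titles = ["  Machine Learning ", "", "3D"]
summary = summarize_titles(sample_titles)
```

Because each function returns its result instead of modifying a shared variable, the next section only needs to keep `summary`, not every intermediate list.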

Step 4: Code quality and explanations for code cells

  • Use appropriate and memorable names for variables, functions and classes
  • Add comments within functions to explain what each line/block of code does
  • Add markdown cells before/after code cells explaining what the code does
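
To make Step 4 concrete, here is a small before-and-after sketch. The "before" version is shown only as a comment; the function name `top_repositories` and the tuple layout are assumptions made for this example, not code from the original project.

```python
# Before (hypothetical rough code): the names say nothing about intent
#   def f(x, n):
#       return sorted(x, key=lambda t: t[1], reverse=True)[:n]

def top_repositories(repositories, limit=3):
    """Return the `limit` repositories with the most stars.

    `repositories` is a list of (name, star_count) tuples.
    """
    # Sort by star count, highest first; sorted() returns a new list,
    # so the caller's list is left unchanged
    ranked = sorted(repositories, key=lambda repo: repo[1], reverse=True)
    # Keep only the first `limit` entries
    return ranked[:limit]


# Usage with made-up data
repositories = [("alpha", 5), ("beta", 9), ("gamma", 1)]
most_starred = top_repositories(repositories, limit=2)
```

The behavior is identical in both versions; only the names, docstring and comments changed, which is usually enough for a reader to follow the code without running it.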

Step 5: Summary, Future Work and References

  • Include a step-by-step summary of the project, and show all relevant code together if feasible
  • Include links to references, documentation, Stack Overflow answers, etc.
  • Include ideas for future work, and encourage the reader to try it out

Step 6: (Optional) Writing a Blog Post with Embedded Code

  • Use a platform like Medium.com for writing blog posts
  • Embed code cells and outputs from Jovian notebooks into the blog post
  • Submit your blog post to a publication such as Jovian or Towards Data Science for greater visibility

You can also record a video presenting your project. Check out this YouTube playlist for some examples.

import jovian
jovian.commit()
[jovian] Attempting to save notebook..