Updated 2 years ago
Documentation and Storytelling with Data Science Projects
This tutorial attempts to answer the following questions:
- What is documentation, and why is it important?
- Is documenting a project really worth the time?
- What is the right way to document a data science project?
- What is storytelling, and how does it apply to data science projects?
- How to add images or screenshots to a data science project?
- What is code quality, and why is it important?
- How to write high-quality code for a data science project?
- How to convert experimental code into functions?
- What is the right way to end a project?
- When is a project finished? There's always more to do!
- What to do if a project is taking too long?
Practice with your own project, or use this one: https://jovian.ai/aakashns/web-scraping-project-rough
Going From a Rough Notebook to a Final Notebook
It's common practice to start with a rough notebook where you can experiment with the code and get the project into a working state. Once you've implemented all the functionality, it's a good idea to start a fresh "final" notebook that removes the experimental and messy code and replaces it with clear explanations and high-quality code.
Step 1: Notebook Title, Introduction and Outline
- Come up with an interesting title that informs and intrigues the reader
- Introduce the topic and the dataset and explain why this is a project worth doing
- Provide a step-by-step outline of the project to set the right expectations
Step 2: Section Headings, Descriptions & Images
- Add section & sub-section Markdown headings matching the outline
- Add a description at the beginning and summary at the end of each section
- Add images, screenshots and illustrations wherever possible
Step 3: Reusable functions for each section
- At the end of each section, create reusable function(s) to capture the functionality
- Ensure that functions have proper inputs & outputs, and don't use global variables
- Avoid carrying over too many variables or outputs from section to section
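As a rough illustration of Step 3, the sketch below turns a few lines of experimental code into small functions with explicit inputs and outputs. The names `build_page_url` and `parse_star_count`, the example URL, and the "12.5k" star format are hypothetical and chosen only for illustration; they are not taken from the linked project.

```python
# Rough-notebook style (shown as comments): relies on variables from earlier cells
# base_url = 'https://example.com/topics'
# page = 3
# page_url = base_url + '?page=' + str(page)


def build_page_url(base_url, page_number):
    """Return the URL for one page of a paginated listing (hypothetical URL scheme)."""
    if page_number < 1:
        raise ValueError("page_number must be 1 or greater")
    # Everything the function needs arrives as an argument, not as a global
    return f"{base_url}?page={page_number}"


def parse_star_count(stars_text):
    """Convert a star count string such as '12.5k' or '870' into an integer."""
    stars_text = stars_text.strip().lower()
    if stars_text.endswith('k'):
        # '12.5k' means 12.5 thousand; round to avoid floating-point drift
        return int(round(float(stars_text[:-1]) * 1000))
    return int(stars_text)
```

Because each function depends only on its arguments, it can be tested in isolation and reused in later sections without carrying extra variables along, for example `build_page_url('https://example.com/topics', 3)`.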
Step 4: Code quality and explanations for code cells
- Use appropriate and memorable names for variables, functions and classes
- Add comments within functions to explain what each line/block of code does
- Add Markdown cells before/after code cells explaining what the code does
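One possible before-and-after for the naming and commenting guidelines in Step 4 is sketched below. Both functions do the same thing; the function names, the `(name, stars)` data shape, and the threshold of 100 are assumptions made up for this example.

```python
# Harder to follow: short names and no explanation
def f(l):
    return [x for x in l if x[1] > 100]


# Easier to follow: descriptive names, a docstring, and a comment
def filter_popular_repos(repos, min_stars=100):
    """Return the (name, stars) pairs with more than min_stars stars."""
    popular_repos = []
    for repo_name, star_count in repos:
        # Keep only repositories above the popularity threshold
        if star_count > min_stars:
            popular_repos.append((repo_name, star_count))
    return popular_repos
```

The second version is longer, but a reader can tell what it does from the name and docstring alone, and the threshold becomes a parameter instead of a hard-coded number.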
Step 5: Summary, Future Work and References
- Include a step-by-step summary of the project, and show all relevant code together if feasible
- Include links to references, documentation, Stack Overflow answers etc.
- Include ideas for future work, and encourage the reader to try it out
Step 6: (Optional) Writing a Blog Post with Embedded Code
- Use a platform like Medium.com for writing blog posts
- Embed code cells and outputs from Jovian notebooks into the blog post
- Submit your blog post to a publication like Jovian, Towards Data Science etc. for greater visibility
You can also record a video presenting your project. Check out this YouTube playlist for some examples.
import jovian
jovian.commit()
[jovian] Attempting to save notebook..