This tutorial series is a hands-on beginner-friendly introduction to deep learning using PyTorch, an open-source neural networks library. These tutorials take a practical and coding-focused approach. The best way to learn the material is to execute the code and experiment with it yourself. Check out the full series here:
This tutorial covers the following topics:
- Introduction to PyTorch tensors
- Tensor operations and gradients
- Interoperability between PyTorch and Numpy
If you're just getting started with data science and deep learning, then this tutorial series is for you. You just need to know the following:
- Basic programming with Python (variables, data types, loops, functions, etc.)
- Some high school mathematics (vectors, matrices, derivatives, and probability)
We'll cover any additional mathematical and theoretical concepts we need as we go along.
This tutorial is an executable Jupyter notebook hosted on Jovian (don't worry if these terms seem unfamiliar; we'll learn more about them soon). You can run this tutorial and experiment with the code examples in a couple of ways: using free online resources (recommended) or on your computer.
The easiest way to start executing the code is to click the Run button at the top of this page and select Run on Colab. Google Colab is a free online platform for running Jupyter notebooks using Google's cloud infrastructure. You can also select "Run on Binder" or "Run on Kaggle" if you face issues running the notebook on Google Colab.
To run the code on your computer locally, you'll need to set up Python, download the notebook and install the required libraries. We recommend using the Conda distribution of Python. Click the Run button at the top of this page, select the Run Locally option, and follow the instructions.
Jupyter Notebooks: This tutorial is a Jupyter notebook - a document made of cells. Each cell can contain code written in Python or explanations in plain English. You can execute code cells and view the results, e.g., numbers, messages, graphs, tables, files, etc. instantly within the notebook. Jupyter is a powerful platform for experimentation and analysis. Don't be afraid to mess around with the code & break things - you'll learn a lot by encountering and fixing errors. You can use the "Kernel > Restart & Clear Output" or "Edit > Clear Outputs" menu option to clear all outputs and start again from the top.
Before we begin, we need to install the required libraries. The installation of PyTorch may differ based on your operating system / cloud environment. You can find detailed installation instructions here: https://pytorch.org.
# Uncomment and run the appropriate command for your operating system, if required
# Linux / Binder
# !pip install numpy torch==1.7.0+cpu torchvision==0.8.1+cpu torchaudio==0.7.0 -f https://download.pytorch.org/whl/torch_stable.html
# Windows
# !pip install numpy torch==1.7.0+cpu torchvision==0.8.1+cpu torchaudio==0.7.0 -f https://download.pytorch.org/whl/torch_stable.html
# MacOS
# !pip install numpy torch torchvision torchaudio
Let's import the torch module to get started.
import torch
At its core, PyTorch is a library for processing tensors. A tensor is a number, vector, matrix, or any n-dimensional array. Let's create a tensor with a single number.
# Number
t1 = torch.tensor(4.)
t1
tensor(4.)
4. is a shorthand for 4.0. It is used to indicate to Python (and PyTorch) that you want to create a floating-point number. We can verify this by checking the dtype attribute of our tensor.
t1.dtype
torch.float32
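For contrast, here's a quick check (a sketch, not part of the original cells): creating a tensor from a plain integer yields an integer dtype by default.

# Integer inputs default to torch.int64; float inputs default to torch.float32
t1_int = torch.tensor(4)
t1_int.dtype

torch.int64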
Let's try creating more complex tensors.
# Vector
t2 = torch.tensor([1., 2, 3, 4])
t2
tensor([1., 2., 3., 4.])
# Matrix
t3 = torch.tensor([[5., 6],
                   [7, 8],
                   [9, 10]])
t3
tensor([[ 5.,  6.],
        [ 7.,  8.],
        [ 9., 10.]])
# 3-dimensional array
t4 = torch.tensor([
    [[11, 12, 13],
     [13, 14, 15]],
    [[15, 16, 17],
     [17, 18, 19.]]])
t4
tensor([[[11., 12., 13.],
         [13., 14., 15.]],

        [[15., 16., 17.],
         [17., 18., 19.]]])
Tensors can have any number of dimensions and different lengths along each dimension. We can inspect the length along each dimension using the .shape property of a tensor.
print(t1)
t1.shape
tensor(4.)
torch.Size([])
print(t2)
t2.shape
tensor([1., 2., 3., 4.])
torch.Size([4])
print(t3)
t3.shape
tensor([[ 5.,  6.],
        [ 7.,  8.],
        [ 9., 10.]])
torch.Size([3, 2])
print(t4)
t4.shape
tensor([[[11., 12., 13.],
         [13., 14., 15.]],

        [[15., 16., 17.],
         [17., 18., 19.]]])
torch.Size([2, 2, 3])
Note that it's not possible to create tensors with an improper shape.
# Matrix
t5 = torch.tensor([[5., 6, 11],
                   [7, 8],
                   [9, 10]])
t5
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-12-83912cf67c5e> in <module>
      1 # Matrix
----> 2 t5 = torch.tensor([[5., 6, 11],
      3                    [7, 8],
      4                    [9, 10]])
      5 t5

ValueError: expected sequence of length 3 at dim 1 (got 2)
A ValueError is thrown because the lengths of the rows [5., 6, 11] and [7, 8] don't match.
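If your data genuinely has rows of unequal length, one workaround (a sketch, not from the original notebook) is to pad the shorter rows to a common length before creating the tensor:

# Pad shorter rows with zeros so every row has the same length
rows = [[5., 6, 11], [7, 8], [9, 10]]
max_len = max(len(row) for row in rows)
padded = [row + [0.] * (max_len - len(row)) for row in rows]
torch.tensor(padded)

tensor([[ 5.,  6., 11.],
        [ 7.,  8.,  0.],
        [ 9., 10.,  0.]])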
We can combine tensors with the usual arithmetic operations. Let's look at an example:
# Create tensors.
x = torch.tensor(3.)
w = torch.tensor(4., requires_grad=True)
b = torch.tensor(5., requires_grad=True)
x, w, b
(tensor(3.), tensor(4., requires_grad=True), tensor(5., requires_grad=True))
We've created three tensors: x, w, and b, all numbers. w and b have an additional parameter requires_grad set to True. We'll see what it does in just a moment.
Let's create a new tensor y by combining these tensors.
# Arithmetic operations
y = w * x + b
y
tensor(17., grad_fn=<AddBackward0>)
As expected, y is a tensor with the value 3 * 4 + 5 = 17. What makes PyTorch unique is that we can automatically compute the derivative of y w.r.t. the tensors that have requires_grad set to True, i.e., w and b. This feature of PyTorch is called autograd (automatic gradients).
To compute the derivatives, we can invoke the .backward method on our result y.
# Compute derivatives
y.backward()
The derivatives of y with respect to the input tensors are stored in the .grad property of the respective tensors.
# Display gradients
print('dy/dx:', x.grad)
print('dy/dw:', w.grad)
print('dy/db:', b.grad)
dy/dx: None
dy/dw: tensor(3.)
dy/db: tensor(1.)
As expected, dy/dw has the same value as x, i.e., 3, and dy/db has the value 1. Note that x.grad is None because x doesn't have requires_grad set to True.
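To spell out the calculus behind those numbers: since y = w * x + b, the partial derivative of y with respect to w is x (i.e., 3), and the partial derivative with respect to b is 1. These are exactly the values autograd stored in w.grad and b.grad.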
The "grad" in w.grad
is short for gradient, which is another term for derivative. The term gradient is primarily used while dealing with vectors and matrices.
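Here's a small sketch (not in the original) showing the same machinery with a vector: y must still be a scalar to call .backward() without arguments, so we sum the element-wise products.

# Gradient w.r.t. a vector: each element of w_vec gets its own derivative
x_vec = torch.tensor([1., 2., 3.])
w_vec = torch.tensor([4., 5., 6.], requires_grad=True)
y_vec = (w_vec * x_vec).sum()
y_vec.backward()
w_vec.grad

tensor([1., 2., 3.])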
Apart from arithmetic operations, the torch module also contains many functions for creating and manipulating tensors. Let's look at some examples.
# Create a tensor with a fixed value for every element
t6 = torch.full((3, 2), 42)
t6
tensor([[42, 42],
        [42, 42],
        [42, 42]])
# Concatenate two tensors with compatible shapes
t7 = torch.cat((t3, t6))
t7
tensor([[ 5.,  6.],
        [ 7.,  8.],
        [ 9., 10.],
        [42., 42.],
        [42., 42.],
        [42., 42.]])
# Compute the sin of each element
t8 = torch.sin(t7)
t8
tensor([[-0.9589, -0.2794],
        [ 0.6570,  0.9894],
        [ 0.4121, -0.5440],
        [-0.9165, -0.9165],
        [-0.9165, -0.9165],
        [-0.9165, -0.9165]])
# Change the shape of a tensor
t9 = t8.reshape(3, 2, 2)
t9
tensor([[[-0.9589, -0.2794],
         [ 0.6570,  0.9894]],

        [[ 0.4121, -0.5440],
         [-0.9165, -0.9165]],

        [[-0.9165, -0.9165],
         [-0.9165, -0.9165]]])
You can learn more about tensor operations here: https://pytorch.org/docs/stable/torch.html. Experiment with some more tensor functions and operations using the empty cells below.
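To get you started, here are a few more functions you could try in those cells (a sketch; the random values will differ on every run):

# A few more torch functions to experiment with
a = torch.ones(2, 3)    # tensor filled with ones
b = torch.randn(2, 3)   # samples from a standard normal distribution
print(a + b)            # element-wise addition
print(b.t())            # transpose of a 2-dimensional tensor
print(a @ b.t())        # matrix multiplication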
Numpy is a popular open-source library used for mathematical and scientific computing in Python. It enables efficient operations on large multi-dimensional arrays and has a vast ecosystem of supporting libraries, including:
- Pandas for file I/O and data analysis
- Matplotlib for plotting and visualization
- OpenCV for image and video processing
If you're interested in learning more about Numpy and other data science libraries in Python, check out this tutorial series: https://jovian.ai/aakashns/python-numerical-computing-with-numpy .
Instead of reinventing the wheel, PyTorch interoperates well with Numpy to leverage its existing ecosystem of tools and libraries.
Here's how we create an array in Numpy:
import numpy as np
x = np.array([[1, 2], [3, 4.]])
x
array([[1., 2.],
       [3., 4.]])
We can convert a Numpy array to a PyTorch tensor using torch.from_numpy.
# Convert the numpy array to a torch tensor.
y = torch.from_numpy(x)
y
tensor([[1., 2.],
        [3., 4.]], dtype=torch.float64)
Let's verify that the Numpy array and the Torch tensor have matching data types.
x.dtype, y.dtype
(dtype('float64'), torch.float64)
We can convert a PyTorch tensor to a Numpy array using the .numpy method of a tensor.
# Convert a torch tensor to a numpy array
z = y.numpy()
z
array([[1., 2.],
       [3., 4.]])
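One detail worth knowing (shown here as a sketch, based on PyTorch's documented behavior): torch.from_numpy and .numpy don't copy the data; the tensor and the array share the same memory, so modifying one is visible in the other.

# The tensor and the array view the same underlying memory
arr = np.array([1., 2., 3.])
ten = torch.from_numpy(arr)
arr[0] = 100.
ten

tensor([100.,   2.,   3.], dtype=torch.float64)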
The interoperability between PyTorch and Numpy is essential because most datasets you'll work with will likely be read and preprocessed as Numpy arrays.
You might wonder why we need a library like PyTorch at all, since Numpy already provides data structures and utilities for working with multi-dimensional numeric data. There are two main reasons:
1. Autograd: The ability to automatically compute gradients for tensor operations is essential for training deep learning models.
2. GPU support: While working with massive datasets and large models, PyTorch tensor operations can be performed efficiently using a Graphics Processing Unit (GPU). Computations that might typically take hours can be completed within minutes using GPUs.
We'll leverage both these features of PyTorch extensively in this tutorial series.
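As a quick illustration of the second point, here's a minimal sketch of moving a tensor to a GPU; it assumes a CUDA-capable GPU is available and falls back to the CPU otherwise.

# Select a device and move a tensor to it
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
t = torch.randn(3, 4)
t_gpu = t.to(device)
t_gpu.device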
Whether you're running this Jupyter notebook online or on your computer, it's essential to save your work from time to time. You can continue working on a saved notebook later or share it with friends and colleagues to let them execute your code. Jovian offers an easy way of saving and sharing your Jupyter notebooks online.
First, you need to install the Jovian python library if it isn't already installed.
!pip install jovian --upgrade --quiet
import jovian
jovian.commit(project='01-pytorch-basics')
[jovian] Attempting to save notebook..
[jovian] Updating notebook "aakashns/01-pytorch-basics" on https://jovian.ai/
[jovian] Uploading notebook..
[jovian] Capturing environment..
[jovian] Committed successfully! https://jovian.ai/aakashns/01-pytorch-basics
The first time you run jovian.commit, you may be asked to provide an API Key to securely upload the notebook to your Jovian account. You can get the API key from your Jovian profile page after logging in / signing up.
jovian.commit uploads the notebook to your Jovian account, captures the Python environment, and creates a shareable link for your notebook, as shown above. You can use this link to share your work and let anyone (including you) run your notebooks and reproduce your work. Jovian also includes a powerful commenting interface, so you can discuss & comment on specific parts of your notebook.
You can do a lot more with the jovian Python library. Visit the documentation site to learn more: https://jovian.ai/docs/index.html
Try out this assignment to learn more about tensor operations in PyTorch: https://jovian.ai/aakashns/01-tensor-operations
This tutorial covered the following topics:
- Introduction to PyTorch tensors
- Tensor operations and gradients
- Interoperability between PyTorch and Numpy
You can learn more about PyTorch tensors here: https://pytorch.org/docs/stable/tensors.html.
The material in this series is inspired by:
With this, we complete our discussion of tensors and gradients in PyTorch, and we're ready to move on to the next topic: Gradient Descent & Linear Regression.
Try answering the following questions to test your understanding of the topics covered in this notebook:
1. How do you import the torch module?
2. What does the dtype property of a tensor represent?
3. Is it possible to create a PyTorch tensor with the values [[1, 2, 3], [4, 5]]? Why or why not?
4. What happens if you specify requires_grad=True while creating a tensor? Illustrate with an example.
5. What happens when you invoke the backward method of a tensor?
6. Give some examples of functions available in the torch module for creating tensors.
7. Give some examples of functions available in the torch module for performing mathematical operations on tensors.
8. How do you upload your notebooks to Jovian using jovian.commit?