
## Assignment 1 - All About torch.Tensor

#### Deep Learning with PyTorch: Zero to GANs

PyTorch is an open source machine learning library based on the Torch library, used for applications such as computer vision and natural language processing, primarily developed by Facebook's AI Research lab (FAIR). It is free and open-source software released under the Modified BSD license. Although the Python interface is more polished and the primary focus of development, PyTorch also has a C++ interface.

A number of pieces of Deep Learning software are built on top of PyTorch, including Uber's Pyro, HuggingFace's Transformers, and Catalyst.

PyTorch provides two high-level features:

• Tensor computing (like NumPy) with strong acceleration via graphics processing units (GPUs)
• Deep neural networks built on a tape-based automatic differentiation system

Currently, PyTorch competes with other renowned deep learning frameworks such as TensorFlow and Apache MXNet.

#### PyTorch tensors

PyTorch defines a class called Tensor (`torch.Tensor`) to store and operate on homogeneous multidimensional rectangular arrays of numbers. PyTorch Tensors are similar to NumPy Arrays, but can also be placed on a CUDA-capable Nvidia GPU. PyTorch supports various sub-types of Tensors.
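As a quick illustrative sketch (my addition, not part of the assignment text), tensors can be created much like NumPy arrays and converted back and forth; the GPU transfer is guarded, since a CUDA device may not be available:

```python
import numpy as np
import torch

# Create a tensor from a nested Python list and inspect it
t = torch.tensor([[1., 2.], [3., 4.]])
print(t.shape, t.dtype)  # torch.Size([2, 2]) torch.float32

# CPU tensors and NumPy arrays can share the same memory
a = t.numpy()
t2 = torch.from_numpy(a)

# Move to the GPU only when one is actually available
if torch.cuda.is_available():
    t = t.to("cuda")
```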

This notebook is an attempt to explore some of the PyTorch functions that operate on tensors.

The functions explained in this notebook are:

• `torch.trace(input) → Tensor`
• `torch.tril(input, diagonal=0, out=None) → Tensor`
• `torch.tril_indices(row, col, offset=0, dtype=torch.long, device='cpu', layout=torch.strided) → Tensor`
• `torch.addbmm(input, batch1, batch2, *, beta=1, alpha=1, out=None) → Tensor`
• `torch.dot(input, tensor) → Tensor`
In :
``````# Import torch and other required modules
import torch``````

### Function 1 - torch.trace(input) → Tensor

Returns the trace (i.e., sum of the elements of the diagonal) of the input 2-D matrix.

In :
``````# Example 1 - working
x = torch.arange(34., 43.).view(3, 3)
torch.trace(x)
``````
Out:
``tensor(114.)``
1. Creates a `3 * 3` tensor with values in the range 34 to 43 (43 excluded) and stores it in the variable `x`

2. Prints the trace of the tensor `x`
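A small sanity check (my addition, not from the assignment text): the trace should equal the sum of the diagonal elements returned by `Tensor.diagonal()`:

```python
import torch

x = torch.arange(34., 43.).view(3, 3)
# The diagonal elements are 34, 38 and 42, so the trace is 114
assert torch.trace(x) == x.diagonal().sum()
assert torch.trace(x).item() == 114.0
```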

In :
``````# Example 2 - working
torch.trace(torch.arange(404., 420.).view(4, 4))``````
Out:
``tensor(1646.)``

`One-liner` code of `torch.trace(input)` with values from 404 to 420 (420 excluded).

In :
``````# Example 3 - breaking (to illustrate when it breaks)
x = torch.arange(0., 0.).view(0, 0)
torch.trace(x)
# torch.trace(torch.arange(0., 0.).view(0, 0))``````
```--------------------------------------------------------------------------- RuntimeError Traceback (most recent call last) <ipython-input-4-eb39b5ba65d6> in <module> 1 # Example 3 - breaking (to illustrate when it breaks) 2 x = torch.arange(0., 0.).view(0 , 0) ----> 3 torch.trace(x) 4 # torch.trace(torch.arange(0., 0.).view(0 , 0)) RuntimeError: invalid argument 1: expected a matrix at /Users/distiller/project/conda/conda-bld/pytorch_1587428061935/work/aten/src/TH/generic/THTensorMoreMath.cpp:303```

The above example fails because we are passing a tensor of size `0 * 0`, which, in other words, is not a matrix; `torch.trace(input)` expects its input to be a 2-D matrix.

Also, I have left the `one-liner` version of the above example in a comment, because errors are harder to trace back in a `one-liner`.

Applications - Usually `torch.trace(input)` is used to find the trace of a tensor that is also a matrix, so all the applications of finding the trace of a matrix are applications of this function.


### Function 2 - torch.tril(input, diagonal=0, out=None) → Tensor

Returns the lower triangular part of the matrix (i.e., the elements on and below the diagonal); the remaining elements of the matrix are set to `0`.

The argument diagonal controls which diagonal to consider.

• If `diagonal = 0`, all elements on and below the main diagonal are retained.
• A `positive` value includes just as many diagonals above the main diagonal.
• A `negative` value excludes just as many diagonals below the main diagonal.
##### Parameters
• input (Tensor) – the input tensor.
• diagonal (int, optional) – the diagonal to consider
• out (Tensor, optional) – the output tensor.
In :
``````# Example 1 - working
x = torch.randn(3, 3)
torch.tril(x)
``````
Out:
``````tensor([[ 0.5205,  0.0000,  0.0000],
        [ 2.0953,  0.8648,  0.0000],
        [ 0.9816, -0.8634, -0.1185]])``````

All the elements on or below the main diagonal (i.e., the lower triangular part of the matrix) are printed as they are, and the remaining elements are set to `0`.

In :
``````# Example 2 - working
torch.tril(x, diagonal=-1)``````
Out:
``````tensor([[ 0.0000,  0.0000,  0.0000],
        [ 2.0953,  0.0000,  0.0000],
        [ 0.9816, -0.8634,  0.0000]])``````

Same as the above example, except that here the elements on the main diagonal are also set to `0`.

You can picture the main diagonal as the line `x = 0` of a co-ordinate system and `diagonal=-1` as the line `x = -1`: all the elements on the line and to the left of it are retained, while the others are set to `0`.
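To round out the picture with a positive offset, here is a small sketch (my addition) using a matrix of ones so the retained region is easy to see; `diagonal=1` keeps one extra diagonal above the main one:

```python
import torch

x = torch.ones(3, 3)
# diagonal=1 retains the main diagonal plus one diagonal above it
kept = torch.tril(x, diagonal=1)
print(kept)
# tensor([[1., 1., 0.],
#         [1., 1., 1.],
#         [1., 1., 1.]])
```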

In :
``````# Example 3 - breaking (to illustrate when it breaks)
torch.tril(x, diagonal=1, out=y)``````
```--------------------------------------------------------------------------- NameError Traceback (most recent call last) <ipython-input-7-8e4510f5e617> in <module> 1 # Example 3 - breaking (to illustrate when it breaks) ----> 2 torch.tril(x, diagonal=1, out=y) NameError: name 'y' is not defined```

The argument `out` must be given a pre-defined tensor as its value.

Applications - This function can be used to mask out parts of an input tensor relative to the main diagonal.

Note - Replacing the `l` in `torch.tril` with `u` (i.e., `torch.triu`) gives the same result but for the upper triangular part of the matrix.
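As a sketch of that note (my addition): `torch.triu` keeps the upper triangular part instead, and the two decompositions overlap only on the main diagonal:

```python
import torch

x = torch.arange(1., 10.).view(3, 3)
lower = torch.tril(x)
upper = torch.triu(x)

# tril + triu counts the main diagonal twice, so subtract it once
assert torch.equal(lower + upper - torch.diag(x.diagonal()), x)
```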

### Function 3 - torch.tril_indices(row, col, offset=0, dtype=torch.long, device='cpu', layout=torch.strided) → Tensor

Returns the indices of the lower triangular part of a `row * col` matrix in a `2 * N` Tensor, where the first row contains row coordinates of all indices and the second row contains column coordinates. Indices are ordered based on rows and then columns.

The argument `offset` is the same as the `diagonal` argument of

`torch.tril(input, diagonal=0, out=None) → Tensor`

NOTE

When running on CUDA, `row * col` must be less than `2^59` to prevent overflow during calculation.

Parameters

• row (int) – number of rows in the 2-D matrix.
• col (int) – number of columns in the 2-D matrix.
• offset (int) – diagonal offset from the main diagonal. Default: if not provided, 0.
• dtype (torch.dtype, optional) – the desired data type of returned tensor. Default: if None, torch.long.
• device (torch.device, optional) – the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
• layout (torch.layout, optional) – currently only support torch.strided.
In :
``````# Example 1 - working
y = torch.tril_indices(3, 3, 1)
y
``````
Out:
``````tensor([[0, 0, 1, 1, 1, 2, 2, 2],
        [0, 1, 0, 1, 2, 0, 1, 2]])``````

Read the above tensor vertically and you get the positions of the retained elements of the lower triangular part of the matrix.

For example, the first position retained is `(0,0)`, then `(0,1)`, then `(1,0)`, and so on.

We can see that the position `(0,2)` is not included. This is because we passed `offset=1`, which retains only one diagonal above the main diagonal.

In :
``````# Example 2 - working
torch.tril_indices(6, 4, 2, dtype=float)``````
Out:
``````tensor([[0., 0., 0., 1., 1., 1., 1., 2., 2., 2., 2., 3., 3., 3., 3., 4., 4., 4.,
         4., 5., 5., 5., 5.],
        [0., 1., 2., 0., 1., 2., 3., 0., 1., 2., 3., 0., 1., 2., 3., 0., 1., 2.,
         3., 0., 1., 2., 3.]], dtype=torch.float64)``````

Same as the previous example, just written as a `one-liner` with the `dtype` set to `float64`.

In :
``````# Example 3 - breaking (to illustrate when it breaks)
torch.tril_indices(4, 3, 2, float)``````
```--------------------------------------------------------------------------- TypeError Traceback (most recent call last) <ipython-input-10-27db1613a6d8> in <module> 1 # Example 3 - breaking (to illustrate when it breaks) ----> 2 torch.tril_indices(4, 3, 2, float) TypeError: tril_indices() takes from 2 to 3 positional arguments but 4 were given```

The error shown is self-explanatory: by default, `torch.tril_indices(row, col, offset=0, dtype=torch.long, device='cpu', layout=torch.strided) → Tensor` takes only 2 to 3 positional arguments (`row`, `col`, and optionally `offset`); arguments such as `dtype` must be passed as keyword arguments, so providing a fourth positional argument causes a `TypeError`.

Applications - This function can be used to find the indices of the retained elements, which makes it easy to access their values.
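A brief sketch (my addition) of that application: the two index rows can be used directly for advanced indexing to pull out the retained values:

```python
import torch

x = torch.arange(1., 10.).view(3, 3)   # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
idx = torch.tril_indices(3, 3)         # offset defaults to 0

# Index with the row coordinates and column coordinates simultaneously
vals = x[idx[0], idx[1]]
print(vals)  # tensor([1., 4., 5., 7., 8., 9.])
```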

Note - Replacing the `l` in `torch.tril_indices` with `u` (i.e., `torch.triu_indices`) gives the same result but for the upper triangular part of the matrix.

### Function 4 - torch.addbmm(input, batch1, batch2, *, beta=1, alpha=1, out=None) → Tensor

Performs a batch matrix-matrix product of matrices stored in batch1 and batch2, with a reduced add step (all matrix multiplications get accumulated along the first dimension). `input` is added to the final result.

batch1 and batch2 must be 3-D tensors each containing the same number of matrices.

If batch1 is a `(b×n×m)` tensor, batch2 is a `(b×m×p)` tensor, input must be broadcastable with a `(n×p)` tensor and `out` will be a `(n×p)` tensor.

``````out = β · input + α · ( ∑_{i=0}^{b-1} batch1_i @ batch2_i )``````

For inputs of type FloatTensor or DoubleTensor, arguments `beta` and `alpha` must be real numbers, otherwise they should be integers.

Parameters

• input (Tensor) – matrix to be added
• batch1 (Tensor) – the first batch of matrices to be multiplied
• batch2 (Tensor) – the second batch of matrices to be multiplied
• beta (Number, optional) – multiplier for `input (β)`
• alpha (Number, optional) – multiplier for `batch1 @ batch2 (α)`
• out (Tensor, optional) – the output tensor.
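To make the reduced-add formula concrete, here is a small check (my addition) that `torch.addbmm` matches `torch.bmm` followed by a sum over the batch dimension, with the `beta` and `alpha` scalings applied:

```python
import torch

torch.manual_seed(0)
ip = torch.randn(3, 6)
batch1 = torch.randn(20, 3, 5)
batch2 = torch.randn(20, 5, 6)

# out = beta * input + alpha * sum_i (batch1_i @ batch2_i)
expected = 2 * ip + 3 * torch.bmm(batch1, batch2).sum(dim=0)
result = torch.addbmm(ip, batch1, batch2, beta=2, alpha=3)
assert torch.allclose(result, expected, atol=1e-5)
```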
In :
``````# Example 1 - working
ip = torch.randn(3, 6)
batch1 = torch.randn(20, 3, 5)
batch2 = torch.randn(20, 5, 6)
torch.addbmm(ip, batch1, batch2)
``````
Out:
``````tensor([[  3.9890,  -5.1335,  -6.9768,  -4.6171,   0.8772, -10.7906],
        [ -1.0072, -21.3228,   5.4748, -21.9994,   2.9079,  -9.0580],
        [ 13.9873,   8.4095,   9.0454,  -1.1092,  -3.1338, -17.1364]])``````

`ip` is the input matrix; `batch1` and `batch2` contain batches of 20 matrices each, with dimensions `3 * 5` and `5 * 6` respectively. These matrices are filled with random values.

The function multiplies each matrix from `batch1` with the corresponding matrix from `batch2`, accumulates the products, adds `ip` to the result, and outputs it.

In :
``````# Example 2 - working
ip = torch.arange(1., 5.).view(2, 2)
batch1 = torch.randn(20, 2, 5)
batch2 = torch.randn(20, 5, 2)
torch.addbmm(ip, batch1, batch2, alpha=2, beta=2)``````
Out:
``````tensor([[  6.5363,   3.8822],
        [-12.8039,  24.2070]])``````

Same as the previous example, with two changes: the `ip` matrix now contains the values from 1 to 5 (5 excluded) as a `2 * 2` matrix, and the function is given `α = β = 2`.

In :
``````# Example 3 - breaking (to illustrate when it breaks)
ip = torch.arange(1., 5.)
batch1 = torch.randn(20, 2, 5)
batch2 = torch.randn(20, 5, 2)
torch.addbmm(ip, batch1, batch2, alpha=3, beta=2)``````
```--------------------------------------------------------------------------- RuntimeError Traceback (most recent call last) <ipython-input-13-63071ffa5567> in <module> 3 batch1 = torch.randn(20, 2, 5) 4 batch2 = torch.randn(20, 5, 2) ----> 5 torch.addbmm(ip, batch1, batch2, alpha=3, beta=2) RuntimeError: The expanded size of the tensor (2) must match the existing size (4) at non-singleton dimension 1. Target sizes: [2, 2]. Tensor sizes: ```

The `ip` tensor here is 1-D (a flat tensor of 4 elements), while the matrix obtained from the batched matrix multiplication of `batch1` and `batch2` is `2 * 2`.

Because `ip` cannot be broadcast to the `2 * 2` shape of the result it must be added to, the example throws an error.

In :
``````# Example 4 - breaking (to illustrate when it breaks)
ip = torch.arange(1., 5.).view(2, 2)
batch1 = torch.randn(20, 2, 5)
batch2 = torch.randn(20, 5, 6)
torch.addbmm(ip, batch1, batch2, alpha=3, beta=2)``````
```--------------------------------------------------------------------------- RuntimeError Traceback (most recent call last) <ipython-input-14-342b3b13e72d> in <module> 3 batch1 = torch.randn(20, 2, 5) 4 batch2 = torch.randn(20, 5, 6) ----> 5 torch.addbmm(ip, batch1, batch2, alpha=3, beta=2) RuntimeError: The expanded size of the tensor (6) must match the existing size (2) at non-singleton dimension 1. Target sizes: [2, 6]. Tensor sizes: [2, 2]```

The matrices in `batch 1` are of dimension `2 * 5` and those in `batch 2` are `5 * 6`, so the batched matrix multiplication itself is valid and produces a `2 * 6` result.

The error occurs because `ip` is `2 * 2` and cannot be broadcast to the `2 * 6` result it must be added to.

In :
``````# Example 5 - breaking (to illustrate when it breaks)
ip = torch.arange(1., 5.).view(2, 2)
batch1 = torch.randn(20, 2, 5)
batch2 = torch.randn(120, 5, 2)
torch.addbmm(ip, batch1, batch2, alpha=3, beta=2)``````
```--------------------------------------------------------------------------- RuntimeError Traceback (most recent call last) <ipython-input-15-698a4e4ed66e> in <module> 3 batch1 = torch.randn(20, 2, 5) 4 batch2 = torch.randn(120, 5, 2) ----> 5 torch.addbmm(ip, batch1, batch2, alpha=3, beta=2) RuntimeError: invalid argument 2: equal number of batches expected, got 20, 120 at /Users/distiller/project/conda/conda-bld/pytorch_1587428061935/work/aten/src/TH/generic/THTensorMath.cpp:342```

The number of matrices in each `batch` is not the same, so the batched matrix multiplication is not possible, and hence we see the error.

Application - This is a fairly specialized function: it is used when you need to add a matrix to the accumulated result of the matrix multiplications of two batches of matrices.

### Function 5 - torch.dot(input, tensor) → Tensor

Computes the dot product (inner product) of two tensors.

NOTE

`torch.dot` does not broadcast: both arguments must be 1-D tensors with the same number of elements.

In :
``````# Example 1 - working
x = torch.tensor([3, 5])
y = torch.tensor([2, 7])
torch.dot(x, y)
``````
Out:
``tensor(41)``

Calculates the dot product of the 2 input tensors `x` and `y`: `3*2 + 5*7 = 41`.

In :
``````# Example 2 - working
torch.dot(torch.tensor([12, 15]), torch.tensor([20, 10]))``````
Out:
``tensor(390)``

`One-liner` for calculating dot product of 2 tensors.

In :
``````# Example 3 - breaking (to illustrate when it breaks)
torch.dot(torch.tensor([7, 3, 5]), torch.tensor([9, 3]))``````
```--------------------------------------------------------------------------- RuntimeError Traceback (most recent call last) <ipython-input-18-0e09f76c4688> in <module> 1 # Example 3 - breaking (to illustrate when it breaks) ----> 2 torch.dot(torch.tensor([7, 3, 5]), torch.tensor([9, 3])) RuntimeError: inconsistent tensor size, expected tensor  and src  to have the same number of elements, but got 3 and 2 elements respectively```

The error shown is pretty clear in itself: both tensors need to be of the same size to calculate their dot product.

Applications - Used to calculate dot product of 2 tensors.
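As a final sketch (my addition), the dot product is just an element-wise multiplication followed by a sum, which is easy to verify:

```python
import torch

x = torch.tensor([3., 5.])
y = torch.tensor([2., 7.])

# 3*2 + 5*7 = 6 + 35 = 41
assert torch.dot(x, y) == (x * y).sum()
assert torch.dot(x, y).item() == 41.0
```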

### Conclusion

5 of the many different functions available in PyTorch were covered in this notebook. 4 of them were related to matrices and the last one was related to the dot product.

``!pip install jovian --upgrade --quiet``
``import jovian``
``jovian.commit()``
```[jovian] Attempting to save notebook.. ```