Learn practical skills, build real-world projects, and advance your career

Zero to GANs: Image Classification with PyTorch

In this Zero to GANs deep learning project, we will use PyTorch to implement image classification. Let us start the project with a brief introduction to deep learning and the concepts we will use, before moving on to the Intel Image dataset, which we will classify.

The steps we follow in this project are as follows (a rough code sketch of steps 3 to 5 appears after the list):

  1. Pick a dataset.
  2. Download the dataset.
  3. Import the dataset using PyTorch.
  4. Explore the dataset.
  5. Prepare the dataset for training.
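
The following is a rough sketch of steps 3 to 5, assuming the Intel Image dataset has already been downloaded and extracted into local folders named ./data/seg_train and ./data/seg_test (the paths, image size, and batch size are assumptions for illustration, not fixed choices of the project):

```python
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Resize every image to a fixed size and convert it to a tensor
transform = transforms.Compose([
    transforms.Resize((150, 150)),
    transforms.ToTensor(),
])

# Import the dataset using PyTorch (one subfolder per class)
train_ds = datasets.ImageFolder("./data/seg_train", transform=transform)
test_ds = datasets.ImageFolder("./data/seg_test", transform=transform)

# Explore the dataset
print(len(train_ds), len(test_ds))
print(train_ds.classes)

# Prepare the dataset for training
train_dl = DataLoader(train_ds, batch_size=64, shuffle=True)
```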

Introduction to Deep Learning

Deep learning is a subset of machine learning that uses multi-layer neural networks, loosely inspired by the biological structure of the human brain, in which neurons in a layer receive input data, process it, and send the output to the following layer. These networks can consist of thousands of interconnected nodes (neurons), typically organized in layers. Each node is connected to several nodes in the previous layer, from which it receives its input data, and to several nodes in the following layer, to which it sends its output once it has been processed.
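
For instance, a minimal multi-layer network can be expressed in PyTorch as follows (the layer sizes are arbitrary and only illustrate the layered structure described above):

```python
import torch.nn as nn

# Each Linear layer connects every node to every node in the previous layer
model = nn.Sequential(
    nn.Linear(784, 256),  # input layer -> first hidden layer
    nn.ReLU(),
    nn.Linear(256, 64),   # first hidden layer -> second hidden layer
    nn.ReLU(),
    nn.Linear(64, 10),    # second hidden layer -> output layer
)
```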

Deep learning's popularity is largely due to its accuracy: for complex data problems such as natural language processing (NLP), it has reached higher accuracy levels than any earlier family of algorithms. Its performance has reached the point where machines can outperform humans on certain tasks, such as fraud detection.

The following plot shows the performance of deep learning against other algorithms in terms of the quantity of data:
[Figure: performance of deep learning compared to other algorithms as the quantity of data increases.]

Applications of Deep Learning

  • Self-driving vehicles: Several companies, such as Google, have been working on the development of partially or totally self-driving vehicles that learn to drive by using digital sensors to identify the objects around them.

  • Medical diagnosis: Deep learning is impacting this industry by improving the diagnosis accuracy of terminal diseases such as brain and breast cancer. This is done by classifying X-rays (or any other diagnostic imagery mechanisms) of new patients, based on labeled X-rays from previous patients that did or did not have cancer.

  • Voice assistants: This may be one of the most popular applications nowadays, due to the proliferation of different voice-activated intelligent assistants, such as Apple's Siri, Google Home, and Amazon's Alexa.

  • Automatic text generation: This means generating new text based on an input sentence. This is popularly used in email writing, where the email provider suggests the next couple of words to the user, based on the text that's already been written.

  • Advertising: In the commercial world, deep learning is helping to increase the return on investment of advertising campaigns by targeting the right audiences and by creating more effective ads. One example of this is the generation of content in order to produce up-to-date and informative blogs that help to engage current customers and attract new ones.

  • Price forecasting: For beginners, this is a typical example of what can be achieved through the use of machine learning algorithms. Price forecasting consists of training a model based on real data. For instance, in the field of real estate, this would consist of feeding a model with property characteristics and their final price in order to be able to predict the prices of future entries based solely on property characteristics.

Introduction to PyTorch

PyTorch is an open source library developed mainly by Facebook's artificial intelligence research group as a Python version of Torch.

PyTorch was first released to the public in January 2017. It uses the power of GPUs to speed up the computation of tensors, which accelerates the training times of complex models.

The library has a C++ backend built on Torch's deep learning framework, which allows much faster computations than native Python libraries while still offering many deep learning features. Its frontend is in Python, which has helped it gain popularity by enabling data scientists new to the library to construct complex neural networks, and it is possible to use PyTorch alongside other popular Python packages.
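
For example, PyTorch tensors convert to and from NumPy arrays (a small sketch, assuming NumPy is installed):

```python
import numpy as np
import torch

a = np.array([1.0, 2.0, 3.0])
t = torch.from_numpy(a)  # NumPy array -> PyTorch tensor (shares the same memory)
b = t.numpy()            # CPU tensor -> NumPy array
```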

Although PyTorch is fairly new, it has gained popularity quickly because it was developed with feedback from many experts in the field, which has helped make it a genuinely useful library for its users.

GPUs in PyTorch

GPUs were originally developed to speed up computations in graphics rendering, especially for video games. However, they have become increasingly popular thanks to their ability to speed up computations in many other fields, including deep learning.

There are several platforms that allow the allocation of variables to the GPUs of a machine, with the Compute Unified Device Architecture (CUDA) being one of the most commonly used. CUDA is a computing platform developed by Nvidia that speeds up compute-intensive programs by using GPUs to perform the computations.

In PyTorch, the allocation of variables to CUDA can be done through the use of the torch.cuda package, as shown in the following code snippet:

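A minimal sketch of such a snippet (the tensor shape and variable name are just illustrative):

```python
import torch

# Create a tensor of random integers in the range [0, 10)
x = torch.randint(0, 10, (3, 3))

# Allocate the tensor to CUDA so the GPU handles computations involving it
x = x.to("cuda")
```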

Here, the first line of code creates a tensor filled with random integers (between 0 and 10). The second line of code allocates that tensor to CUDA so that all computations involving that tensor are handled by the GPU instead of the CPU. To allocate a variable back to the CPU, use the following code snippet:

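Again as a sketch, reusing the same tensor x:

```python
# Move the tensor back to the CPU
x = x.to("cpu")
```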

When solving a deep learning problem with CUDA, it is good practice to allocate both the model holding the network architecture and the input data to the GPU. This will ensure that all computations carried out during the training process are handled by the GPU.
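
A sketch of this practice (the model architecture and batch variables here are placeholders rather than the network we will build for the project):

```python
import torch
import torch.nn as nn

# Use the GPU if one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# An arbitrary example model, moved to the chosen device
model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
model = model.to(device)

# Inside the training loop, each batch is moved to the same device, e.g.:
# images, labels = images.to(device), labels.to(device)
```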

Nevertheless, this allocation is only possible if the machine has a GPU available and PyTorch has been installed with CUDA support. To verify whether we are able to allocate our variables in CUDA, we use the following code snippet:

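The check is a single call to torch.cuda.is_available():

```python
import torch

# Returns True if a CUDA-capable GPU is available to PyTorch
torch.cuda.is_available()
```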

If the output from the preceding line of code is True, we are all set to start allocating our variables in CUDA.