This content originally appeared on DEV Community and was authored by Steven Mathew
Here is a breakdown of running a simple convolutional neural network on Google Colab with its T4 GPU (free version).
We will be using PyTorch to train this Convolutional Neural Network.
Basics First:
A Convolutional Neural Network (CNN) is a type of artificial neural network specifically designed for processing structured grid data, such as images.
It has the following:
Neurons: Basic units that process information.
Layers: Groups of neurons stacked together. Information passes through these layers and is transformed at each step.
Convolution: This is like sliding a small window (called a filter or kernel) over an image and looking for specific patterns.
Filters: These are the small windows that detect specific features like edges, textures, or shapes in the images.
Pooling: This reduces the size of the data (image) while keeping the important information.
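To make the convolution and pooling ideas above concrete, here is a minimal sketch. The 5x5 image and the 2x2 edge filter are made-up values chosen for illustration; in a real CNN the filter values are learned during training rather than written by hand.

```python
import torch
import torch.nn.functional as F

# A tiny 5x5 grayscale "image": dark on the left, bright on the right.
image = torch.tensor([
    [0., 0., 1., 1., 1.],
    [0., 0., 1., 1., 1.],
    [0., 0., 1., 1., 1.],
    [0., 0., 1., 1., 1.],
    [0., 0., 1., 1., 1.],
])

# A hand-written 2x2 filter that responds to a dark-to-bright vertical edge.
kernel = torch.tensor([
    [-1., 1.],
    [-1., 1.],
])

# conv2d expects the input as (batch, channels, height, width)
# and the filter as (out_channels, in_channels, kernel_h, kernel_w).
feature_map = F.conv2d(image.view(1, 1, 5, 5), kernel.view(1, 1, 2, 2))
print(feature_map[0, 0])  # 4x4 map; only the column at the edge is non-zero

# 2x2 max pooling keeps the strongest response in each 2x2 block.
pooled = F.max_pool2d(feature_map, kernel_size=2)
print(pooled[0, 0])  # 2x2 map
```

The filter slides over every 2x2 window of the image, so the 5x5 input becomes a 4x4 feature map, and pooling then halves it to 2x2 while keeping the edge response.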
After going through multiple convolutional and pooling layers, the information is flattened into one long vector and then passed into fully connected layers.
These layers are the classic parts of a neural network that ultimately determine the outcome, such as recognizing the content of an image (for example, whether it's a cat, dog, or car).
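The pipeline described above (convolution and pooling, then flattening, then fully connected layers) could be sketched in PyTorch roughly as follows. The class name `TinyCNN`, the 1x28x28 input size, the channel counts, and the 10 output classes are all assumptions made for illustration, not the network this article goes on to train.

```python
import torch
import torch.nn as nn


class TinyCNN(nn.Module):
    """Hypothetical minimal CNN for 1x28x28 inputs and 10 classes."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Convolution + pooling stages extract features.
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),   # 1x28x28 -> 8x28x28
            nn.ReLU(),
            nn.MaxPool2d(2),                             # -> 8x14x14
            nn.Conv2d(8, 16, kernel_size=3, padding=1),  # -> 16x14x14
            nn.ReLU(),
            nn.MaxPool2d(2),                             # -> 16x7x7
        )
        # Flatten to a vector, then a fully connected layer decides the class.
        self.classifier = nn.Sequential(
            nn.Flatten(),                                # -> 16 * 7 * 7 = 784
            nn.Linear(16 * 7 * 7, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))


model = TinyCNN()
batch = torch.randn(4, 1, 28, 28)  # 4 random stand-in images
logits = model(batch)
print(logits.shape)  # torch.Size([4, 10])
```

Each row of `logits` holds one score per class; the highest score is the predicted class for that image.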
STEPS:
2) Check If GPU is present and available for our use:
`import torch

print("PyTorch version:", torch.__version__)
if torch.cuda.is_available():
    print("GPU is available for PyTorch!")
else:
    print("No GPU found for PyTorch.")`
This lets you know whether a GPU is available for use.
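Once you know whether a GPU is present, a common pattern is to pick a device once and move tensors (and later the model) onto it. A minimal sketch, where the variable names are illustrative:

```python
import torch

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Using device:", device)

if device.type == "cuda":
    # On the free Colab tier this typically reports a Tesla T4.
    print("GPU name:", torch.cuda.get_device_name(0))

# Tensors are created on the CPU by default; .to(device) moves them.
x = torch.ones(2, 3).to(device)
print(x.device)
```

Writing the code against `device` rather than hard-coding `"cuda"` means the same notebook still runs when Colab assigns a CPU-only runtime.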
3) We will load libraries:
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader
Explanation:
import torch: Main PyTorch library
torch.nn: Contains neural network components
torch.optim: Optimization algorithms
datasets, transforms (from torchvision): Loading standard datasets and applying image transformations
DataLoader: Efficiently loads data in batches
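To see how these pieces fit together, here is a small sketch that runs one pass over a batch loader. It uses random tensors in place of a real dataset so that nothing needs to be downloaded; the 1x28x28 image size, the 10 classes, and the single-layer model are assumptions for illustration only.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Stand-in data: 10 fake grayscale images with random labels.
images = torch.randn(10, 1, 28, 28)
labels = torch.randint(0, 10, (10,))
loader = DataLoader(TensorDataset(images, labels), batch_size=4, shuffle=True)

# torch.nn supplies the layers and the loss; torch.optim supplies the optimizer.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

batch_sizes = []
for batch_images, batch_labels in loader:
    optimizer.zero_grad()                                # clear old gradients
    loss = criterion(model(batch_images), batch_labels)  # forward pass
    loss.backward()                                      # compute gradients
    optimizer.step()                                     # update weights
    batch_sizes.append(batch_images.shape[0])

print(batch_sizes)  # [4, 4, 2]
```

With 10 samples and `batch_size=4`, the loader yields two full batches and one final batch of 2.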
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])
Explanation:
transforms.Compose: Composing several transforms together.
But what are composing and transformation?
Transforms: These are operations applied to images, such as resizing, cropping, rotating, normalizing, converting to tensors, etc.
Compose: This is a method to combine several of these transform operations into a single operation.
transforms.ToTensor(): Converts a PIL Image or NumPy array to a PyTorch tensor and scales the image pixel values to the range [0, 1].
What in the heaven does that mean???
Converts a PIL Image or NumPy array to a PyTorch tensor:
PIL Image: This is an image format provided by the Python Imaging Library (PIL), often used for loading and processing images.
NumPy array: This is a format provided by the NumPy library, often used for numerical operations in Python.
PyTorch tensor: This is the *data format used in PyTorch* for all operations, particularly useful for GPU acceleration.
Example:
PIL Image -> Tensor
A PIL image might look like this: <PIL.Image.Image image mode=RGB size=256x256 at 0x7F8B9C4CBB80>
After applying transforms.ToTensor(), the image becomes a PyTorch tensor with shape torch.Size([3, 256, 256])
The tensor shape [3, 256, 256] indicates that the image has 3 color channels (Red, Green, Blue), and each channel is 256x256 pixels.
Numpy Array -> Tensor
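A NumPy array might look like this: array([[[255, 0, 0], ..., [0, 0, 255]]], dtype=uint8)
After applying transforms.ToTensor(), the array becomes a PyTorch tensor with scaled pixel values. The axes are also reordered: the array is expected in (height, width, channels) order, and the tensor comes out in (channels, height, width) order. The scaling to [0, 1] is applied when the array has dtype uint8.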
Scales the image pixel values to the range [0, 1]:
Image pixel values in a PIL Image or NumPy array typically range from 0 to 255 for each color channel (Red, Green, Blue).
Example:
Before Scaling:
Pixel values in a typical image range from 0 to 255.
For example, a pixel value of 255 represents full intensity (white) and 0 represents no intensity (black).
After Scaling:
Each pixel value is divided by 255, so the new range is [0, 1].
For example, a pixel value of 255 becomes 1.0, and a pixel value of 0 remains 0.0.
transforms.ToTensor() performs this division for you, so no manual scaling is needed.
This scaling matters for neural networks because inputs in a small, consistent range help stabilize training and improve convergence.
EXAMPLE CODE:
`from PIL import Image
import torchvision.transforms as transforms

# Load a PIL image
pil_image = Image.open('path_to_image.jpg')

# Convert to PyTorch tensor and scale pixel values
transform = transforms.ToTensor()
tensor_image = transform(pil_image)

print(tensor_image.shape)
print(tensor_image.min(), tensor_image.max())`
OUTPUT
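torch.Size([3, height, width]), where height and width are the dimensions of the loaded image (3 channels assumes an RGB image)
tensor(0.) tensor(1.) for an image that contains both pure black and pure white pixels; in general, the minimum and maximum fall somewhere within [0, 1]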
Steven Mathew | Sciencx (2024-06-17T12:16:22+00:00) A Simple Convolutional neural network (CNN) on Google Colab?. Retrieved from https://www.scien.cx/2024/06/17/a-simple-convolutional-neural-network-cnn-on-google-colab/