The building blocks of DL 2
Simplest Neural Network - linear layer with activation function
In the previous post I've explained what is the most important concept in neural networks - technique that allows us to incrementally find minimum of a function. This is called Gradient Descent algorithm!
In this post I will build on this concept and show you how to create a basic linear model to predict what is on a medical image!
Download data
As in the first post showing how to build medical image recognition with pure statistics, we need to download data first. For some basic description of how the data looks like, see that first post first ;)
! git clone https://github.com/apolanco3225/Medical-MNIST-Classification.git
! rm -rf ./medical_mnist
! mv Medical-MNIST-Classification/resized/ ./medical_mnist
! rm -rf Medical-MNIST-Classification
from pathlib import Path
PATH = Path("medical_mnist/")
We have much more powerful tools now, so we will deal with all 6 classes now, but first need to prepare data:
- all data needs to be numerical
- it needs to be in arrays
- it needs to be labeled
classes = [cls.name for cls in PATH.iterdir()]
classes
images = {}
for cls in classes:
images[cls] = list((PATH/cls).iterdir())
from PIL import Image
import torch
from torchvision.transforms import ToTensor
image_tensors = {}
for cls in classes:
image_tensors[cls] = torch.stack( # converts iterable of tensors into higher dimention single tensor
[
ToTensor()( # converts images to tensors
Image.open(path)
).view(-1, 64 * 64).squeeze().float()/255 # reshape tensor from 64x64 to vector tensor of size 4096 and convert values
for path in (PATH/cls).iterdir()]
)
so let's see what we got there
for cls in classes:
class_shape = image_tensors[cls].shape
print(f"{cls} has {class_shape[0]} images of a size {class_shape[1:]}")
x_train = torch.cat([image_tensors[cls] for cls in classes], dim=0)
y_train = torch.cat([torch.tensor([index] * image_tensors[cls].shape[0]) for index, cls in enumerate(classes)])
shuffle the dataset. This is important as if we don't do this, images from the classes that we train first will be effectively not as "fresh" in the memmory of the network
permutations = torch.randperm(x_train.shape[0])
x_train = x_train[permutations]
y_train = y_train[permutations]
create validation set that is 20% of the training set - this is important to asses the performance of the model. Validation set doesn't take part in training, so model is not biased towards those images (it cannot remember those exact images from training). This is essential to see how well model generalizes on examples it has not seen.
valid_pct = 0.2
valid_index = int(x_train.shape[0] * valid_pct)
valid_index
x_valid = x_train[:valid_index]
y_valid = y_train[:valid_index]
x_train = x_train[valid_index:]
y_train = y_train[valid_index:]
x_train.shape, y_train.shape, x_valid.shape, y_valid.shape
Now we need to build the model. Let's use the most basic building blocks of the neural network - linear layer (linear_layer
function) and nonlinearity (softmax
function).
def softmax(x):
return x - x.exp().sum(-1).unsqueeze(-1)
def linear_layer(x):
return x @ weights + bias
the model is a simple function composition (results of linear layer and fed into nonlinear layer (softmax
)
def model(x):
return softmax(linear_layer(x))
For the Gradient Descent to work we also need to specify the loss function - this is crucial, as this is the function on which we compute gradients for our parameters. Just a quick recap: Gradient Descent algorithm finds out the values to change function parameters so the function values decreese.
In your case we minimize loss_func
. Parameters of this function (passed in a form of preds
) are in the model
: wegiths
and bias
. Gradient Descent will give us values to change each of those parameters so we minimize the loss_func
def loss_func(preds, targets):
return -preds[range(targets.shape[0]), targets].mean()
def accuracy(preds, targets):
return (torch.argmax(preds, dim=-1) == targets).float().mean()
And here is the Gradient Descent loop - see comments in the code for details
%%time
# number of training examples
n = x_train.shape[0]
# batch size - this is necessary as we won't be able to fit all
# the examples into the memmory, so we need to do the computations in batches
bs = 64
# how many epochs to train for
epochs = 15
weights = torch.zeros((64 * 64, 10), requires_grad=True) # define weights matrix
bias = torch.zeros(10, requires_grad=True) # and bias term
# in each of those epochs algorithm sees all the images. So in this case
# we see all the images 15 times
for epoch in range(epochs):
# here is the loop for batches: in each batch we:
# - see 64 images
# - compute predictions based on the model
# - compute the loss
# - compute gradients and update parameters (wegiths and bias)
for i in range((n - 1) // bs + 1):
# select images for this batch
start_i = i * bs
end_i = start_i + bs
xb = x_train[start_i:end_i]
yb = y_train[start_i:end_i]
# compute predictions
preds = model(xb)
# compute loss
loss = loss_func(preds, yb)
# compute gradients (this is done for us by PyTorch with this backwards function!)
loss.backward()
# this block is necessary, so computations we do below, are not taken into account when
# computing next gradients
with torch.no_grad():
# update parameters
weights -= weights.grad
bias -= bias.grad
# zero out the gradients so they are ready for the next batch (otherwise they accumulate values)
weights.grad.zero_()
bias.grad.zero_()
# eventually after each epoch (seeing all the images) we print out how we did
print(f"Epoch {epoch} accuracy: {accuracy(model(x_valid),y_valid)}%, loss: {loss_func(model(x_valid), y_valid)}")