Hole in the Wall — computer vision-based method for defect detection

Jun 7, 2021

Foreword

Leveraging AI to improve work efficiency has become one of manufacturing 4.0’s deliverables as we explore more ways to advance automation, control manufacturing processes remotely and enable a remote workforce. In the last several years, visual inspection technologies have been helping manufacturers ease operational pressures. Computer vision is being used in quality inspection for electronic parts, retail goods and even food products.

Project Objectives

After completing the NYP specialist diploma in business analytics and big data, I developed a strong interest in deep learning. Recently I took my first step in computer vision using PyTorch, one of the most popular deep learning frameworks. In this personal project, I would like to demonstrate the potential applications of computer vision by developing an artificial neural network (ANN) model to detect concrete cracks. Detecting such cracks plays a major role in building inspection, where it helps determine the health of a building.

Work Accomplished

Data Preparation

I will be using a dataset from www.kaggle.com that contains images of various concrete surfaces with and without cracks. The image data are organized into one folder per class, Negative (without crack) and Positive (with crack), for image classification. The dataset is split 80/20 into a training set and a testing set using SubsetRandomSampler before the ML model is trained. The code also converts each image into a PyTorch tensor. A tensor is a container that can hold data in N dimensions.

import numpy as np
import torch
from torch.utils.data.sampler import SubsetRandomSampler
from torchvision import datasets, transforms

train_dir = 'data/train'

# ToTensor converts a PIL Image in the range [0, 255] to a torch.FloatTensor
# of shape (C x H x W) in the range [0.0, 1.0]
train_data = datasets.ImageFolder(train_dir, transform=transforms.ToTensor())

# Split into 80/20 train and test datasets
train_test_split = 0.2
count_train = len(train_data)
indexes = list(range(count_train))
split = int(np.floor(train_test_split * count_train))

# SubsetRandomSampler takes the indices of the data as input;
# the samplers are then passed to the dataloaders
np.random.seed(1337)
np.random.shuffle(indexes)
train_index, test_index = indexes[split:], indexes[:split]
train_sampler = SubsetRandomSampler(train_index)
test_sampler = SubsetRandomSampler(test_index)

trainloader = torch.utils.data.DataLoader(train_data, sampler=train_sampler, batch_size=32)
testloader = torch.utils.data.DataLoader(train_data, sampler=test_sampler, batch_size=32)
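To make the split logic easier to check in isolation, here is a minimal sketch of the same shuffle-and-slice idea using only the Python standard library. The helper name `split_indices` and the dataset size of 1000 are assumptions made for illustration. Because it uses `random` instead of NumPy, the shuffled order will differ from the code above, but the sizes and the disjointness of the two index sets follow the same rule.

```python
import random

def split_indices(count, test_fraction, seed=1337):
    """Shuffle range(count) and split it into (train, test) index lists."""
    indexes = list(range(count))
    random.Random(seed).shuffle(indexes)
    split = int(count * test_fraction)  # rounds down, like int(np.floor(...))
    return indexes[split:], indexes[:split]

# Hypothetical dataset size, chosen only for illustration.
train_index, test_index = split_indices(count=1000, test_fraction=0.2)

print(len(train_index), len(test_index))        # 800 200
print(set(train_index).isdisjoint(test_index))  # True
```

Since both dataloaders read from the same ImageFolder dataset, the two index lists must not overlap. Otherwise test images would leak into training.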

Modelling

I am going to use a machine learning method called transfer learning, where a model developed for one task is reused as the starting point for a model on a second task. I load a pretrained ResNet-50 model. ResNet-50 is a convolutional neural network that is 50 layers deep, and the pretrained network was trained on more than a million images from the ImageNet database. I will replace the final output layer of the ResNet-50 model and define the activation functions, loss function and optimizer. Only these new final layers are trained, using the concrete surface images, for my own image classification problem.

from torch import nn, optim
from torchvision import models

# Check for GPU availability and load a pretrained model
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = models.resnet50(pretrained=True)

# Freeze the pre-trained layers so we don't backprop through them during training
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer of ResNet-50 for transfer learning
model.fc = nn.Sequential(nn.Linear(2048, 512),
                         nn.ReLU(),
                         nn.Dropout(0.2),
                         nn.Linear(512, 2),  # two classes: Negative and Positive
                         nn.LogSoftmax(dim=1))

# Create the criterion (the loss function), then pick an optimizer
# (Adam in this case) and a learning rate
criterion = nn.NLLLoss()
optimizer = optim.Adam(model.fc.parameters(), lr=0.003)
model.to(device)
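The pairing of a LogSoftmax output layer with NLLLoss can be illustrated by hand. The sketch below reproduces the arithmetic for a single sample in plain Python; the raw scores are invented for illustration, and the helper functions are simplified stand-ins, not the PyTorch implementations.

```python
import math

def log_softmax(logits):
    """Return log(softmax(logits)), shifting by the max for numerical stability."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]

def nll_loss(log_probs, label):
    """Negative log-likelihood of the correct class for a single sample."""
    return -log_probs[label]

# Hypothetical raw scores for two classes (index 0 = Negative, 1 = Positive).
logits = [2.0, 0.0]
log_probs = log_softmax(logits)
probs = [math.exp(lp) for lp in log_probs]

print([round(p, 3) for p in probs])      # [0.881, 0.119]
print(round(nll_loss(log_probs, 0), 3))  # 0.127
print(round(nll_loss(log_probs, 1), 3))  # 2.127
```

The loss is small when the model assigns a high probability to the correct class and grows quickly when it does not. That difference is the signal the optimizer uses to update the new final layers.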

When training the model, batches of images are fed to the network by calling model(), the loss is computed, and the optimizer applies gradient descent using the gradients from back-propagation to minimize the loss (error). In the same piece of code below, the ML model is also evaluated on the test set every 10 batches while training is running.

The code also calculates the training loss, the test loss and the test accuracy. The model achieved low losses and good accuracy.

# Load the batches of images and run the feed-forward loop.
# Then calculate the loss and use the optimizer to apply
# gradient descent via back-propagation.
epochs = 1
steps = 0
running_loss = 0
print_every = 10
train_losses, test_losses = [], []

for epoch in range(epochs):
    for inputs, labels in trainloader:
        steps += 1
        
        # get the inputs
        inputs, labels = inputs.to(device), labels.to(device)
        
        # zero the parameter gradients
        optimizer.zero_grad()
        
        #feed input to the network
        logps = model(inputs)
        
        # compute the loss between the predicted log-probabilities and the labels
        loss = criterion(logps, labels)
        
        # compute the gradient of the loss with respect to the trainable parameters
        # and store it in the .grad attribute of each of those parameters
        loss.backward()
        
        # update the parameters to reduce the loss (error)
        optimizer.step()
        
        # print statistics
        running_loss += loss.item()
        if steps % print_every == 0:
            test_loss = 0
            accuracy = 0

            # switch to eval/test mode
            model.eval()
            
            # Turn off gradients to speed up this part
            with torch.no_grad():
                for inputs, labels in testloader:
                    inputs, labels = inputs.to(device),labels.to(device)
                    logps = model(inputs)
                    batch_loss = criterion(logps, labels)
                    test_loss += batch_loss.item()
                    
                    ps = torch.exp(logps)
                    top_p, top_class = ps.topk(1, dim=1)
                    equals = top_class == labels.view(*top_class.shape)
                    accuracy += torch.mean(equals.type(torch.FloatTensor)).item()
            # running_loss covers the last print_every batches, so average over those
            train_losses.append(running_loss/print_every)
            test_losses.append(test_loss/len(testloader))                    
            print("Train loss: " + str(round(running_loss/print_every,3)) + '..' + "Test loss: " + str(round(test_loss/len(testloader),3)) + '..' + "Test accuracy: " + str(round(accuracy/len(testloader),3)))
            
            running_loss = 0
            
            #turn back to training mode after eval step:
            model.train()
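The accuracy bookkeeping in the evaluation step can be followed with a small worked example. This sketch mirrors the same steps in plain Python: exponentiate the log-probabilities, take the top class, compare with the labels, then average the per-batch accuracies. The two batches of log-probabilities are made up for illustration, and `batch_accuracy` is a hypothetical helper, not part of PyTorch.

```python
import math

def batch_accuracy(log_probs_batch, labels):
    """Fraction of samples whose highest-probability class equals the label."""
    correct = 0
    for log_probs, label in zip(log_probs_batch, labels):
        probs = [math.exp(lp) for lp in log_probs]  # like torch.exp(logps)
        top_class = probs.index(max(probs))         # like ps.topk(1, dim=1)
        correct += int(top_class == label)
    return correct / len(labels)

# Two hypothetical batches of log-probabilities for (Negative, Positive).
batches = [
    ([[math.log(0.9), math.log(0.1)], [math.log(0.3), math.log(0.7)]], [0, 1]),
    ([[math.log(0.6), math.log(0.4)], [math.log(0.2), math.log(0.8)]], [1, 1]),
]

accuracy = sum(batch_accuracy(lp, y) for lp, y in batches)
test_accuracy = accuracy / len(batches)
print(test_accuracy)  # 0.75
```

Averaging per-batch accuracies matches the overall accuracy only when all batches have the same size. If the last test batch is smaller, the reported figure is a close approximation, not an exact count.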

Once training is done, the model is saved as “cracksmodel.pth” for later predictions.

torch.save(model, 'cracksmodel.pth')

Evaluation

Now I am going to introduce a new set of images that the trained ML model (‘cracksmodel.pth’) has not seen. The user-defined function “get_random_images” picks 10 random images from the data/eval folder, then “predict_image” transforms each image and uses the trained model to make a prediction.

The results are looking good. Here’s one example of such predictions on 10 concrete surface images.

import matplotlib.pyplot as plt
from torch.utils.data.sampler import SubsetRandomSampler

val_dir = 'data/eval'
val_data = datasets.ImageFolder(val_dir, transform=transforms.ToTensor())
classes = val_data.classes
to_pil = transforms.ToPILImage()

# Make sure dropout and batch norm run in inference mode
model.eval()

def predict_image(image):
    # Convert the PIL image to a tensor and add a batch dimension
    test_transforms = transforms.Compose([transforms.ToTensor(),])
    image_tensor = test_transforms(image).float()
    image_tensor = image_tensor.unsqueeze_(0)
    input = image_tensor.to(device)
    # No gradients are needed for prediction
    with torch.no_grad():
        output = model(input)
    index = output.cpu().numpy().argmax()
    return index

def get_random_images(num):
    # Pick num random images (and their labels) from the evaluation set
    val_indexes = list(range(len(val_data)))
    np.random.shuffle(val_indexes)
    val_idx = val_indexes[:num]
    sampler = SubsetRandomSampler(val_idx)
    loader = torch.utils.data.DataLoader(val_data, sampler=sampler, batch_size=num)
    dataiter = iter(loader)
    images, labels = next(dataiter)
    return images, labels

images, labels = get_random_images(10)
fig = plt.figure(figsize=(20, 10))
for i in range(len(images)):
    image = to_pil(images[i])
    index = predict_image(image)
    sub = fig.add_subplot(1, len(images), i+1)
    res = int(labels[i]) == index
    # The title shows the predicted class and whether it matches the true label
    sub.set_title(str(classes[index]) + ":" + str(res))
    plt.axis('off')
    plt.imshow(image)
plt.show()
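The class names shown in the plot titles come from the folder names, and ImageFolder assigns label indices to the class folders in sorted order. The sketch below imitates that rule with the standard library on a throwaway directory. The helper `folder_classes` is a hypothetical stand-in written for illustration, not the torchvision code itself.

```python
import os
import tempfile

def folder_classes(root):
    """Map each sub-folder of root to an index, in sorted (alphabetical) order."""
    classes = sorted(entry.name for entry in os.scandir(root) if entry.is_dir())
    class_to_idx = {name: i for i, name in enumerate(classes)}
    return classes, class_to_idx

# Build a temporary layout with one folder per class, as in data/eval.
with tempfile.TemporaryDirectory() as root:
    for name in ("Positive", "Negative"):
        os.mkdir(os.path.join(root, name))
    classes, class_to_idx = folder_classes(root)

print(classes)       # ['Negative', 'Positive']
print(class_to_idx)  # {'Negative': 0, 'Positive': 1}
```

Under this rule, a predicted index of 0 corresponds to Negative (no crack) and 1 to Positive (crack), provided the evaluation folder uses the same class folder names as the training folder.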

Conclusion

Computer vision can inspect and analyze thousands of products and identify subtle defects in minutes. It can quickly surpass human capabilities. However, implementing computer vision technology is not an easy task. Setting up and deploying AI for inspection requires reliable IT infrastructure, and there is usually only a limited amount of sample data available for training models. For example, I used 2000 images to train a model for this concrete crack detection problem. In addition, training ML models can be challenging and requires a specialized skill set.