This example shows how to use the codegen command to generate code for an image classification application that uses deep learning on Intel® processors. The generated code uses the Intel Math Kernel Library for Deep Neural Networks (MKL-DNN). This example consists of two parts:
The first part shows how to generate a MEX function that accepts a batch of images as input.
The second part shows how to generate an executable that accepts a batch of images as input.
Intel processor with support for Intel Advanced Vector Extensions 2 (Intel AVX2) instructions
Intel Math Kernel Library for Deep Neural Networks (MKL-DNN)
Environment variables for the compilers and libraries. For information on the supported versions of compilers, see Supported Compilers. For setting up the environment variables, see Prerequisites for Deep Learning with MATLAB Coder.
This example is supported on Linux®, Windows®, and Mac® platforms. It is not supported for MATLAB Online.
Download a sample video file.
if ~exist('./object_class.avi', 'file')
url = 'https://www.mathworks.com/supportfiles/gpucoder/media/object_class.avi.zip';
websave('object_class.avi.zip',url);
unzip('object_class.avi.zip');
end
resnet_predict Function
This example uses the DAG network ResNet-50 to show image classification on Intel desktops. A pretrained ResNet-50 model for MATLAB is available as part of the support package Deep Learning Toolbox Model for ResNet-50 Network.
The resnet_predict function loads the ResNet-50 network into a persistent network object and then performs prediction on the input. Subsequent calls to the function reuse the persistent network object.
type resnet_predict
% Copyright 2020 The MathWorks, Inc.
function out = resnet_predict(in)
%#codegen
% A persistent object mynet is used to load the series network object.
% At the first call to this function, the persistent object is constructed and
% setup. When the function is called subsequent times, the same object is reused
% to call predict on inputs, avoiding reconstructing and reloading the
% network object.
persistent mynet;
if isempty(mynet)
% Call the function resnet50 that returns a DAG network
% for ResNet-50 model.
mynet = coder.loadDeepLearningNetwork('resnet50','resnet');
end
% pass in input
out = mynet.predict(in);
To generate a MEX function for the resnet_predict function, use codegen with a deep learning configuration object for the MKL-DNN library. Attach the deep learning configuration object to the MEX code generation configuration object that you pass to codegen. Run the codegen command and specify the input as a 4-D array of size [224,224,3,batchSize]. The first three dimensions correspond to the input layer size of the ResNet-50 network, and the fourth is the number of images in the batch.
batchSize = 5;
cfg = coder.config('mex');
cfg.TargetLang = 'C++';
cfg.DeepLearningConfig = coder.DeepLearningConfig('mkldnn');
codegen -config cfg resnet_predict -args {ones(224,224,3,batchSize,'single')} -report
Code generation successful: To view the report, open('codegen\mex\resnet_predict\html\report.mldatx').
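The -args option above passes an example value, from which codegen infers the class and size of the input. As a minimal sketch, and assuming MATLAB Coder is installed, the same fixed-size input could instead be described with a coder.typeof type object. The variable name inputType is illustrative and does not appear in the original example, and the codegen call is left as a comment because it needs the full toolchain described in the prerequisites.

```matlab
% Sketch only: describe the codegen input by type instead of by example value.
batchSize = 5;

% Fixed-size single-precision array of size 224-by-224-by-3-by-batchSize.
inputType = coder.typeof(single(0), [224 224 3 batchSize]);

disp(inputType.ClassName)    % class that codegen would assume for the input
disp(inputType.SizeVector)   % size that codegen would assume for the input

% The type object could then take the place of the example value, for instance:
% codegen -config cfg resnet_predict -args {inputType} -report
```

Either form fixes the batch size at code generation time, so the generated MEX function accepts only inputs of that exact size.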
This step assumes that the object_class.avi video file is already downloaded. Create the VideoReader object and read frames by using the read function. Because batchSize is set to 5, read five frames. Resize the batch of input images to the size that the ResNet-50 network expects.
videoReader = VideoReader('object_class.avi');
imBatch = read(videoReader,[1 5]);
imBatch = imresize(imBatch, [224,224]);
Call the generated resnet_predict_mex function, which outputs classification results for the inputs that you provide.
predict_scores = resnet_predict_mex(single(imBatch));
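The returned predict_scores matrix has one row per image and one column per class, which is why the post-processing in this example transposes it before sorting. The following sketch ranks a small synthetic score matrix the same way. The values, the six-class width, and the names scoresToy, topK, topScores, and topClasses are made up for illustration; the real output is batchSize-by-1000.

```matlab
% Synthetic stand-in for predict_scores: 2 images (rows) by 6 classes (columns).
scoresToy = single([0.05 0.40 0.10 0.25 0.15 0.05;
                    0.30 0.05 0.35 0.10 0.05 0.15]);
topK = 3;

% Transpose so that each column holds the scores of one image, then sort
% every column in descending order. indx holds the original class indices.
[val, indx] = sort(transpose(scoresToy), 'descend');

topScores  = val(1:topK, :) * 100;   % percentages, topK-by-numImages
topClasses = indx(1:topK, :);        % class indices, topK-by-numImages

disp(topClasses)
disp(topScores)
```

Column i of topClasses lists the best-scoring class indices for image i, which is the form that a label lookup such as classnames(indx(1:5,i)) relies on.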
Get the top five probability scores and their labels for each image in the batch.
[val,indx] = sort(transpose(predict_scores), 'descend');
scores = val(1:5,:)*100;
net = resnet50;
classnames = net.Layers(end).ClassNames;
for i = 1:batchSize
labels = classnames(indx(1:5,i));
disp(['Top 5 predictions on image, ', num2str(i)]);
for j=1:5
disp([labels{j},' ',num2str(scores(j,i), '%2.2f'),'%'])
end
end
For predictions on the first image, map the top five prediction scores to words in the synset dictionary.
fid = fopen('synsetWords.txt');
synsetOut = textscan(fid,'%s', 'delimiter', '\n');
synsetOut = synsetOut{1};
fclose(fid);
[val,indx] = sort(transpose(predict_scores), 'descend');
scores = val(1:5,1)*100;
top5labels = synsetOut(indx(1:5,1));
Display the top five classification labels on the image.
outputImage = zeros(224,400,3, 'uint8');
for k = 1:3
outputImage(:,177:end,k) = imBatch(:,:,k,1);
end
scol = 1;
srow = 1;
outputImage = insertText(outputImage, [scol, srow], 'Classification with ResNet-50', 'TextColor', 'w','FontSize',20, 'BoxColor', 'black');
srow = srow + 30;
for k = 1:5
outputImage = insertText(outputImage, [scol, srow], [top5labels{k},' ',num2str(scores(k), '%2.2f'),'%'], 'TextColor', 'w','FontSize',15, 'BoxColor', 'black');
srow = srow + 25;
end
imshow(outputImage);
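The display code above builds a 224-by-400 canvas, places the 224-pixel-wide frame in columns 177 to 400, and leaves the first 176 columns free for the text overlay. The sketch below reproduces only that layout on a random image, so it needs no toolboxes. The names frame, canvas, and textWidth are illustrative, and the single assignment across all channels is a substitute for the per-channel loop in the example.

```matlab
% Random stand-in for one resized video frame (224-by-224 RGB, uint8).
frame = randi([0 255], 224, 224, 3, 'uint8');

% Black canvas that is wider than the frame, as in the example.
canvas = zeros(224, 400, 3, 'uint8');

% Width of the strip on the left that stays free for the text overlay.
textWidth = size(canvas, 2) - size(frame, 2);

% Copy the frame into the right-hand part of the canvas, all channels at once.
canvas(:, textWidth+1:end, :) = frame;

fprintf('Text strip: columns 1 to %d, image: columns %d to %d\n', ...
    textWidth, textWidth + 1, size(canvas, 2));
```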

Clear the persistent network object from memory.
clear mex;
resnet_predict_exe Entry-Point Function
To generate an executable from MATLAB code, define a new entry-point function resnet_predict_exe. This function is similar to the previous entry-point function resnet_predict but also includes code for preprocessing and postprocessing. The API that resnet_predict_exe uses is platform independent. This function accepts a video and the batch size as input arguments. These arguments are compile-time constants.
type resnet_predict_exe
% Copyright 2020 The MathWorks, Inc.
function resnet_predict_exe(inputVideo,batchSize)
%#codegen
% A persistent object mynet is used to load the series network object.
% At the first call to this function, the persistent object is constructed and
% setup. When the function is called subsequent times, the same object is reused
% to call predict on inputs, avoiding reconstructing and reloading the
% network object.
persistent mynet;
if isempty(mynet)
% Call the function resnet50 that returns a DAG network
% for ResNet-50 model.
mynet = coder.loadDeepLearningNetwork('resnet50','resnet');
end
% Create video reader and video player objects %
videoReader = VideoReader(inputVideo);
depVideoPlayer = vision.DeployableVideoPlayer;
% Read the classification label names %
synsetOut = readImageClassLabels('synsetWords.txt');
i=1;
% Read frames until end of video file %
while ~(i+batchSize > (videoReader.NumFrames+1))
% Read and resize batch of frames as specified by input argument%
reSizedImagesBatch = readImageInputBatch(videoReader,batchSize,i);
% run predict on resized input images %
predict_scores = mynet.predict(reSizedImagesBatch);
% overlay the prediction scores on images and display %
overlayResultsOnImages(predict_scores,synsetOut,reSizedImagesBatch,batchSize,depVideoPlayer)
i = i+ batchSize;
end
release(depVideoPlayer);
end
function synsetOut = readImageClassLabels(classLabelsFile)
% Read the classification label names from the file
%
% Inputs :
% classLabelsFile - supplied by user
%
% Outputs :
% synsetOut - cell array filled with 1000 image class labels
synsetOut = cell(1000,1);
fid = fopen(classLabelsFile);
for i = 1:1000
synsetOut{i} = fgetl(fid);
end
fclose(fid);
end
function reSizedImagesBatch = readImageInputBatch(videoReader,batchSize,i)
% Read and resize batch of frames as specified by input argument%
%
% Inputs :
% videoReader - Object used for reading the images from video file
% batchSize - Number of images in batch to process. Supplied by user
% i - index to track frames read from video file
%
% Outputs :
% reSizedImagesBatch - Batch of images resized to 224x224x3xbatchsize
img = read(videoReader,[i (i+batchSize-1)]);
reSizedImagesBatch = coder.nullcopy(ones(224,224,3,batchSize,'like',img));
resizeTo = coder.const([224,224]);
reSizedImagesBatch(:,:,:,:) = imresize(img,resizeTo);
end
function overlayResultsOnImages(predict_scores,synsetOut,reSizedImagesBatch,batchSize,depVideoPlayer)
% Overlay the top five prediction results on each image and display them
%
% Inputs :
% predict_scores - classification results for given network
% synsetOut - cell array filled with 1000 image class labels
% reSizedImagesBatch - Batch of images resized to 224x224x3xbatchsize
% batchSize - Number of images in batch to process. Supplied by user
% depVideoPlayer - Object for displaying results
%
% Outputs :
% Predicted results overlayed on input images
% sort the predicted scores %
[val,indx] = sort(transpose(predict_scores), 'descend');
for j = 1:batchSize
scores = val(1:5,j)*100;
outputImage = zeros(224,400,3, 'uint8');
for k = 1:3
outputImage(:,177:end,k) = reSizedImagesBatch(:,:,k,j);
end
% Overlay the results on image %
scol = 1;
srow = 1;
outputImage = insertText(outputImage, [scol, srow], 'Classification with ResNet-50', 'TextColor', [255 255 255],'FontSize',20, 'BoxColor', [0 0 0]);
srow = srow + 30;
for k = 1:5
scoreStr = sprintf('%2.2f',scores(k));
outputImage = insertText(outputImage, [scol, srow], [synsetOut{indx(k,j)},' ',scoreStr,'%'], 'TextColor', [255 255 255],'FontSize',15, 'BoxColor', [0 0 0]);
srow = srow + 25;
end
depVideoPlayer(outputImage);
end
end
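The main loop in the listing above advances through the video in whole batches and stops as soon as a full batch no longer fits, so any trailing frames are not classified. The sketch below runs the same loop condition without a video file. The frame count of 23 and the names numFrames, batchStarts, and framesSkipped are hypothetical and chosen only to show the effect.

```matlab
numFrames = 23;   % hypothetical frame count, stands in for videoReader.NumFrames
batchSize = 5;

i = 1;
batchStarts = [];

% Same stopping rule as in resnet_predict_exe: continue only while a full
% batch of batchSize frames starting at frame i still fits in the video.
while ~(i + batchSize > (numFrames + 1))
    batchStarts(end+1) = i; %#ok<AGROW>
    i = i + batchSize;
end

% Frames after the last complete batch are never read.
framesSkipped = numFrames - (i - 1);

fprintf('Batches start at frames: %s\n', mat2str(batchStarts));
fprintf('Frames left unprocessed: %d\n', framesSkipped);
```

With these values, four batches are processed and the last three frames are skipped. Choosing a batch size that divides the frame count evenly avoids the skipped tail.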
resnet_predict_exe Function
The function resnet_predict_exe contains four subsections that perform these actions:
Read the classification labels from supplied input text file
Read the input batch of images and resize them as needed by the network
Run inference on input image batch
Overlay the results on the images
For more information about each of these steps, see the subsequent sections.
readImageClassLabels Function
This function accepts the synsetWords.txt file as an input argument. It reads the classification labels and populates a cell array.
function synsetOut = readImageClassLabels(classLabelsFile)
% Read the classification label names from the file
%
% Inputs :
% classLabelsFile - supplied by user
%
% Outputs :
% synsetOut - cell array filled with 1000 image class labels
synsetOut = cell(1000,1);
fid = fopen(classLabelsFile);
for i = 1:1000
synsetOut{i} = fgetl(fid);
end
fclose(fid);
end
readImageInputBatch Function
This function reads and resizes the images from the video input file that is passed to the function as an input argument. It reads the specified input images and resizes them to 224x224x3 which is the size the resnet50 network expects.
function reSizedImagesBatch = readImageInputBatch(videoReader,batchSize,i)
% Read and resize batch of frames as specified by input argument%
%
% Inputs :
% videoReader - Object used for reading the images from video file
% batchSize - Number of images in batch to process. Supplied by user
% i - index to track frames read from video file
%
% Outputs :
% reSizedImagesBatch - Batch of images resized to 224x224x3xbatchsize
img = read(videoReader,[i (i+batchSize-1)]);
reSizedImagesBatch = coder.nullcopy(ones(224,224,3,batchSize,'like',img));
resizeTo = coder.const([224,224]);
reSizedImagesBatch(:,:,:,:) = imresize(img,resizeTo);
end
mynet.predict Function
This function accepts the resized batch of images as input and returns the prediction results.
% run predict on resized input images %
predict_scores = mynet.predict(reSizedImagesBatch);
overlayResultsOnImages Function
This function accepts the prediction results and sorts them in descending order. It overlays these results on the input images and displays them.
function overlayResultsOnImages(predict_scores,synsetOut,reSizedImagesBatch,batchSize,depVideoPlayer)
% Overlay the top five prediction results on each image and display them
%
% Inputs :
% predict_scores - classification results for given network
% synsetOut - cell array filled with 1000 image class labels
% reSizedImagesBatch - Batch of images resized to 224x224x3xbatchsize
% batchSize - Number of images in batch to process. Supplied by user
% depVideoPlayer - Object for displaying results
%
% Outputs :
% Predicted results overlayed on input images
% sort the predicted scores %
[val,indx] = sort(transpose(predict_scores), 'descend');
for j = 1:batchSize
scores = val(1:5,j)*100;
outputImage = zeros(224,400,3, 'uint8');
for k = 1:3
outputImage(:,177:end,k) = reSizedImagesBatch(:,:,k,j);
end
% Overlay the results on image %
scol = 1;
srow = 1;
outputImage = insertText(outputImage, [scol, srow], 'Classification with ResNet-50', 'TextColor', [255 255 255],'FontSize',20, 'BoxColor', [0 0 0]);
srow = srow + 30;
for k = 1:5
scoreStr = sprintf('%2.2f',scores(k));
outputImage = insertText(outputImage, [scol, srow], [synsetOut{indx(k,j)},' ',scoreStr,'%'], 'TextColor', [255 255 255],'FontSize',15, 'BoxColor', [0 0 0]);
srow = srow + 25;
end
depVideoPlayer(outputImage);
end
end
Create a code configuration object for generating an executable. Attach a deep learning configuration object to it. Set the batchSize and inputVideoFile variables.
To use the generated example C++ main function instead of writing a custom one, set the GenerateExampleMain parameter to 'GenerateCodeAndCompile'. Also, disable cfg.EnableOpenMP so that the executable has no OpenMP library dependencies when you run it from the desktop terminal.
cfg = coder.config('exe');
cfg.TargetLang = 'C++';
cfg.DeepLearningConfig = coder.DeepLearningConfig('mkldnn');
batchSize = 5;
inputVideoFile = 'object_class.avi';
cfg.GenerateExampleMain = 'GenerateCodeAndCompile';
cfg.EnableOpenMP = 0;
Run the codegen command to build the executable. Run the generated executable resnet_predict_exe either at the MATLAB command line or at the desktop terminal.
codegen -config cfg resnet_predict_exe -args {coder.Constant(inputVideoFile), coder.Constant(batchSize)} -report
system('./resnet_predict_exe')
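The system call above uses a Linux and Mac style path. Because the example is also supported on Windows, the following sketch shows one way the launch command could be chosen per platform. It assumes that the Windows build produces resnet_predict_exe.exe in the current folder. The variable names and the file check are illustrative additions and are not part of the original example.

```matlab
exeName = 'resnet_predict_exe';

if ispc
    % Assumption: the Windows build adds an .exe extension.
    exeFile = [exeName '.exe'];
    exeCmd  = exeFile;
else
    % Linux and Mac: run the binary from the current folder.
    exeFile = exeName;
    exeCmd  = ['./' exeName];
end

if isfile(exeFile)
    % system returns a nonzero status if the executable reports failure.
    status = system(exeCmd);
    if status ~= 0
        warning('Executable exited with status %d.', status);
    end
else
    warning('Executable %s not found in the current folder.', exeFile);
end
```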
codegen | coder.DeepLearningConfig | coder.loadDeepLearningNetwork | coder.MklDNNConfig