Using DeepLabv3

TensorFlow DeepLabv3 model

A specific version of the TensorFlow DeepLabv3 model has been tested: deeplabv3_mnv2_pascal_train_aug_2018_01_29.tar.gz. This version of DeepLabv3 uses MobileNet-v2 as its backbone and was pretrained on the Pascal VOC 2012 dataset.

Download the model.

wget http://download.tensorflow.org/models/deeplabv3_mnv2_pascal_train_aug_2018_01_29.tar.gz

After downloading the model, extract its contents to a directory.

tar xzvf deeplabv3_mnv2_pascal_train_aug_2018_01_29.tar.gz

Convert the model using the snpe-tensorflow-to-dlc converter.

snpe-tensorflow-to-dlc --input_network deeplabv3_mnv2_pascal_train_aug/frozen_inference_graph.pb --input_dim sub_7 1,513,513,3 --out_node ArgMax --output_path deeplabv3.dlc

The output layer for the model is:

  • ArgMax

The output buffer name is:

  • (Segmentation Map) ArgMax:0.raw

Preprocessing Input Images

Qualcomm® Neural Processing SDK does not support the preprocessing performed inside the DeepLabv3 model, so preprocessing must be done offline before images are fed to the network. In the preprocessing phase, images must be resized to 513x513x3 and the pixel values normalized to the range -1 to 1.

The following steps need to be performed on all input images in this exact order:

  1. Calculate the resize ratio and target size of the image using the following:

    resize_ratio = 513.0 / max(width, height)
    target_size = (int(resize_ratio * width), int(resize_ratio * height))
    
  2. Resize the image to target_size using an anti-alias resampling filter. This makes the longer dimension of the image exactly 513 pixels, while the other dimension becomes smaller than 513.

  3. Pad the smaller dimension with the mean value of 128 to produce an image of 513x513x3.

  4. Convert the image to type float32.

  5. Multiply the image elementwise by 0.00784313771874 (the float32 value of 2/255).

  6. Elementwise subtract 1.0 from the image.
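The six steps above can be sketched in Python. This is an illustrative sketch, not part of the SDK: it assumes Pillow and NumPy are available, that the padding is applied to the bottom and right edges, and the function and file names are hypothetical.

```python
# Preprocessing sketch for DeepLabv3 input (illustrative; assumes Pillow and
# NumPy; padding placement on the bottom/right is an assumption).
import numpy as np
from PIL import Image

INPUT_SIZE = 513
MEAN_PIXEL = 128  # pad value from step 3

def preprocess(image):
    """Resize, pad, and normalize a PIL image to a 513x513x3 float32 array."""
    width, height = image.size
    # Step 1: resize ratio so the longer side becomes 513
    resize_ratio = float(INPUT_SIZE) / max(width, height)
    target_size = (int(resize_ratio * width), int(resize_ratio * height))
    # Step 2: anti-aliased resize (LANCZOS is Pillow's anti-alias filter)
    resized = image.convert("RGB").resize(target_size, Image.LANCZOS)
    # Step 3: pad the shorter dimension with the mean value 128
    padded = np.full((INPUT_SIZE, INPUT_SIZE, 3), MEAN_PIXEL, dtype=np.uint8)
    padded[: target_size[1], : target_size[0], :] = np.asarray(resized)
    # Steps 4-6: float32, scale by 2/255, subtract 1 -> values in [-1, 1]
    return padded.astype(np.float32) * 0.00784313771874 - 1.0

# Save the result as a raw input buffer (filename is illustrative):
# preprocess(Image.open("input.jpg")).tofile("input.raw")
```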

Running the model in Qualcomm® Neural Processing SDK

The following are limitations and suggestions for running the DLC model in Qualcomm® Neural Processing SDK:

  • Some operations in the model are supported only on the CPU runtime. To run the model on a different runtime, such as GPU or DSP, CPU fallback must be enabled in the runtime list (see the Snpe_SNPEBuilder_SetRuntimeProcessorOrder() description in the Qualcomm® Neural Processing SDK API). If using the snpe-net-run tool, use the --runtime_order option.

Postprocessing Output Segmentation Maps

Running DeepLabv3 with Qualcomm® Neural Processing SDK produces an output segmentation map of size 513x513x1, where every element is an integer that represents a class (e.g. 0=background, etc.). However, this output still contains the padding applied in the preprocessing step. The padding must be cropped out and the segmentation map resized to the original image size.

The following steps should be taken in order to get the same dimensions as the original image:

  1. Crop off the padding that was applied to the shorter dimension in the pre-processing step. The ratio of the dimensions of the segmentation map should now be the same as the original image.

  2. Resize the segmentation map to the height and width of the original image.
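The two postprocessing steps can be sketched in Python as follows. This assumes NumPy and Pillow, that the padding was applied to the bottom and right edges during preprocessing, and that the raw buffer's element type matches the SDK's output; the names are illustrative.

```python
# Postprocessing sketch: recover a full-resolution segmentation map from the
# ArgMax:0.raw output buffer (illustrative; assumes NumPy and Pillow, and
# bottom/right padding placement from preprocessing).
import numpy as np
from PIL import Image

INPUT_SIZE = 513

def postprocess(seg_map, orig_width, orig_height):
    """Crop the padding from a 513x513 segmentation map and resize it back."""
    # Step 1: crop off the padding added during preprocessing
    resize_ratio = float(INPUT_SIZE) / max(orig_width, orig_height)
    valid_w = int(resize_ratio * orig_width)
    valid_h = int(resize_ratio * orig_height)
    cropped = seg_map[:valid_h, :valid_w]
    # Step 2: nearest-neighbor resize preserves the integer class labels
    img = Image.fromarray(cropped.astype(np.uint8))
    return np.asarray(img.resize((orig_width, orig_height), Image.NEAREST))

# Loading the raw buffer (the element dtype is an assumption; check your
# SDK version's output format before relying on float32 here):
# raw = np.fromfile("ArgMax:0.raw", dtype=np.float32)
# labels = postprocess(raw.reshape(INPUT_SIZE, INPUT_SIZE), width, height)
```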