BMNet Compiler

Introduction

The BMNET library converts neural networks defined in CAFFE into target instructions. It works like a compiler that translates a high-level language into machine instructions, and it likewise has three phases: the front end, the optimizer and the back end. The front end parses the source and extracts the network prototxt and weights. The optimizer applies a broad variety of transformations to try to improve the code’s running time. The back end (also known as the code generator) then maps the code onto the target instruction set. On the BM1880 platform we add a new feature, INT8 computation, which can provide better performance such as faster inference. INT8 computation needs a calibration table to modify the network parameters; refer to section 2 for how to generate a network’s calibration table. By following this document, you can convert a network from FP32 to INT8 without significant accuracy loss.
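
To make the role of a calibration threshold more concrete, the sketch below shows a generic symmetric INT8 quantization driven by one threshold per blob. It is an illustration only: the scale formula (127 / threshold), the rounding mode and the function names are assumptions, not the method BMNET actually uses.

```python
def quantize_int8(values, threshold):
    """Map floats in [-threshold, threshold] to integers in [-127, 127].

    Generic symmetric quantization; NOT taken from the BMNET sources.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    scale = 127.0 / threshold
    quantized = []
    for v in values:
        q = int(round(v * scale))
        # Values beyond the threshold saturate instead of wrapping around.
        quantized.append(max(-127, min(127, q)))
    return quantized


def dequantize_int8(quantized, threshold):
    """Approximate inverse of quantize_int8."""
    scale = threshold / 127.0
    return [q * scale for q in quantized]


if __name__ == "__main__":
    activations = [0.0, 1.5, -3.0, 40.0]
    q = quantize_int8(activations, threshold=32.0)
    print(q)
    print(dequantize_int8(q, threshold=32.0))
```

A threshold that is too small saturates large activations (40.0 above is clipped), while one that is too large wastes resolution on small values, which is why the threshold is measured on representative data.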

General Description

We provide multiple utility tools to convert CAFFE models into machine instructions. These instructions, together with the model’s weights, are packed into a file named bmodel (model file for BITMAIN targets), which can be executed directly on a BITMAIN board. BMNet implements many common layers; the full list of built-in layers is given below, and more layers are under development:

Activation

BatchNorm

Concat


Convolution

Eltwise

Flatten

InnerProduct

Join

LRN

Normalize

Permute

Pooling

PReLU

PriorBox

Reorg

Reshape

Scale

Split

Upsample

If all layers of your network model are supported in BMNet, you can compile the network conveniently from the command line; otherwise, refer to Chapter 3 to add customized layers.

Programming model

The BMNET library offers a set of APIs and tools to convert a caffemodel into machine instructions, which are saved in a bmodel file. The bmodel file also keeps further information about the network model, such as network name, target name, shape, weight, etc.

Calibration Caffe Tool Guide

Introduction

● The tool uses caffe as the inference framework and collects the required statistics.

● Given a caffe prototxt and an fp32 caffemodel, it can automatically generate both a calibration table, which contains the calibration info, and an int8 caffemodel.

Supported network list

resnet18|resnet50|alexnet_winograd|unet|custom_model|alexnet|yolov3|yolov2|mobilenet|vgg16|googlenet|resnet50_winograd|lenet|det1|densenet|det2|resnet101|SSD_300x300|det3|googlenet_v3_winograd|googlenet_v3|resnet152|mobilenet_v2

Get calibration caffe toolkit

You can download the toolkit from:

wget https://sophon-file.bitmain.com.cn/sophon-prod/drive/19/02/09/16/BM1880-Calibration-Tools.zip

The toolkit contains the files below; CustomModel and Resnet50 are provided only as examples.

calibration-caffe-tools
├── BMNet_1880_api.pdf
├── build
│   └── calibration_math.so
├── calibration_caffe.bin
├── Calibration Tool Guide.pdf
├── CustomModel
│   ├── custom.caffemodel
│   ├── custom.prototxt
│   ├── input.txt
│   ├── test2.jpg
│   └── test.jpg
├── custom_model.json
└── Resnet50
    ├── deploy.prototxt
    ├── ILSVRC2012_val
    │   └── input.txt
    ├── resnet50_input_1_3_224_224.bin
    └── ResNet-50-model.caffemodel

Usage

  • Put the prototxt and fp32 caffemodel in the output_path directory.

  • The calibration table and int8 caffemodel are generated in output_path. Run the tool as follows:

./calibration_caffe.bin [net_name] [output_path] --iteration [iteration batch size]

  • The parameters are as follows:

    ○ [net_name]: net name

    ○ [output_path]: output path of prototxt and int8 caffemodel.

    ○ [iteration batch size]: calibration batch size.
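
When calibration is scripted, the command line above can be assembled programmatically. The following is a minimal sketch; the helper names are hypothetical, and only the three arguments documented above are used.

```python
import subprocess


def calibration_command(net_name, output_path, iteration,
                        tool="./calibration_caffe.bin"):
    """Return the argument list for calibration_caffe.bin."""
    iteration = int(iteration)
    if iteration <= 0:
        raise ValueError("iteration must be a positive integer")
    return [tool, net_name, output_path, "--iteration", str(iteration)]


def run_calibration(net_name, output_path, iteration):
    """Run the tool; raises CalledProcessError if it exits with a non-zero status."""
    subprocess.run(calibration_command(net_name, output_path, iteration),
                   check=True)


if __name__ == "__main__":
    # Only prints the command; call run_calibration() to execute it.
    print(" ".join(calibration_command("resnet50", "./Resnet50", 10)))
```

Passing the arguments as a list avoids shell quoting problems when output_path contains spaces.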

Input data layer

  • You can define the calibration data in the caffe prototxt. See the example in deploy.prototxt.

  • The parameters are as follows:

    • [data_list]: The file that describes where to find the images. Each line is the path to an image.

    • [h]: input data height

    • [w]: input data width

    • [color_format]: Specify which color format you want to use. Only RGB/BGR are supported.

    • [r/g/b_mean]: RGB mean value. If > 0, the mean value is subtracted from every image.

    • [scale]: The scale value of the data. The data is multiplied by it after the r/g/b mean value is subtracted. The default value is 1.

    • [mirror]: Specify whether the data needs to be mirrored. The default value is 0.

      • 0: no need to mirror

      • 1: vertical and horizontal

      • 2: vertical

      • 3: horizontal

    • [transpose]: Specify the data transpose axes. The default is [2,0,1] (equal to [c,h,w] order).

    • [padding]: If true, padding is added when resizing to the target size to keep the original aspect ratio. The default is false.

    • [debug]: Save each data item in the specified format in the “debug” folder in the working directory. Note that it is saved before the transpose is done. The default is None.

      • npy: Save as numpy array

      • image: Save as image
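
The order of operations described by these parameters (subtract the mean, multiply by the scale, then transpose) can be sketched in plain Python. This is an illustration under assumptions, not the data layer's code: the function name is hypothetical, the image is assumed to be already resized and stored as nested lists indexed [h][w][c], and resizing, mirroring, padding and color conversion are left out.

```python
def preprocess(image_hwc, means=(0.0, 0.0, 0.0), scale=1.0):
    """Apply mean subtraction and scaling, then reorder [h][w][c] -> [c][h][w].

    The reordering corresponds to the default transpose [2,0,1].
    """
    height = len(image_hwc)
    width = len(image_hwc[0])
    channels = len(means)
    out = [[[0.0] * width for _ in range(height)] for _ in range(channels)]
    for h in range(height):
        for w in range(width):
            for c in range(channels):
                # The scale is applied after the per-channel mean is subtracted.
                out[c][h][w] = (image_hwc[h][w][c] - means[c]) * scale
    return out


if __name__ == "__main__":
    tiny = [[[10, 20, 30], [40, 50, 60]]]  # 1 x 2 image with 3 channels
    print(preprocess(tiny, means=(1, 2, 3), scale=2.0))
```

The channel order of `means` has to match the chosen color_format, since the layer accepts both RGB and BGR.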

Custom model support

If you want to calibrate your own model, define it in ‘custom_model.json’. The parameters are as follows:

  • [name]: net name

  • [in_prototxt]: input caffe prototxt file

  • [in_caffemodel]: input caffe model

The output files will be ‘bmnet_your_net_name_calibration_table.pb2’ and ‘bmnet_your_net_name_int8.caffemodel’, where your_net_name is your net name.
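
As a rough illustration, a ‘custom_model.json’ could be written from Python as below. The flat JSON layout with exactly these three keys is an assumption based on the parameter list above; compare it with the custom_model.json shipped in the toolkit before relying on it. The helper names are hypothetical.

```python
import json
import os
import tempfile


def write_custom_model(path, name, in_prototxt, in_caffemodel):
    """Write a custom model description (assumed flat key/value layout)."""
    config = {
        "name": name,
        "in_prototxt": in_prototxt,
        "in_caffemodel": in_caffemodel,
    }
    with open(path, "w") as f:
        json.dump(config, f, indent=2)
    return config


def expected_outputs(name):
    """File names the guide says the calibration produces for a net name."""
    return ("bmnet_%s_calibration_table.pb2" % name,
            "bmnet_%s_int8.caffemodel" % name)


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "custom_model.json")
    write_custom_model(target, "custom_model",
                       "./CustomModel/custom.prototxt",
                       "./CustomModel/custom.caffemodel")
    print(open(target).read())
    print(expected_outputs("custom_model"))
```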

Example

Download the toolkit and enter the calibration tool folder:

wget https://sophon-file.bitmain.com.cn/sophon-prod/drive/19/02/09/16/BM1880-Calibration-Tools.zip

The Resnet50 folder is organized as follows:

Resnet50/
├── deploy.prototxt
├── ILSVRC2012_val  //ImageNet dataset, you can download it from www.image-net.org
└── ResNet-50-model.caffemodel

You can download the dataset ILSVRC2012_val here: http://www.image-net.org/challenges/LSVRC/2012/nnoupb/ILSVRC2012_img_val.tar

Describe the path to your JPEG dataset in deploy.prototxt:

name: "ResNet-50"
input_dim: 1
input_dim: 3
input_dim: 224
input_dim: 224

layer {
  name: 'input'
  type: 'Python'
  top: 'data'
  python_param {
    module: 'custom_data_layer.general_data_layer'
    layer: 'DataLayer'
    param_str: "{'data_list': ./Resnet50/ILSVRC2012_val/input.txt, 'color_format': rgb, 'h': 224, 'w': 224, 'r_mean': 123.68, 'g_mean': 116.779, 'b_mean': 103.939}"
  }
}
...
...

The calibration toolkit also supports LMDB datasets. Describe the path to your LMDB dataset in deploy.prototxt:

layer {
        name: "data"
        type: "Data"
        top: "data"
        top: "label"
        include {
                phase: TEST
        }
        transform_param {
                mirror: false
                crop_size: 224
        }
        data_param {
                source: "/path/to/imagenet/ilsvrc12_val_lmdb"
                batch_size: 1
                backend: LMDB
        }
}

For the JPEG case, input.txt lists the paths of the JPEG files, as below:

./Resnet50/ILSVRC2012_val/ILSVRC2012_val_00000001.JPEG
./Resnet50/ILSVRC2012_val/ILSVRC2012_val_00000002.JPEG
./Resnet50/ILSVRC2012_val/ILSVRC2012_val_00000003.JPEG
./Resnet50/ILSVRC2012_val/ILSVRC2012_val_00000004.JPEG
./Resnet50/ILSVRC2012_val/ILSVRC2012_val_00000005.JPEG
./Resnet50/ILSVRC2012_val/ILSVRC2012_val_00000006.JPEG
./Resnet50/ILSVRC2012_val/ILSVRC2012_val_00000007.JPEG
./Resnet50/ILSVRC2012_val/ILSVRC2012_val_00000008.JPEG
./Resnet50/ILSVRC2012_val/ILSVRC2012_val_00000009.JPEG
./Resnet50/ILSVRC2012_val/ILSVRC2012_val_00000010.JPEG
...
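
A list file like the one above can be generated rather than written by hand. The sketch below collects the JPEG files in a folder and writes one path per line; the function names and the accepted extensions are assumptions for illustration.

```python
import os
import tempfile


def list_images(image_dir, extensions=(".jpeg", ".jpg")):
    """Return sorted paths of the image files found directly in image_dir."""
    names = sorted(n for n in os.listdir(image_dir)
                   if n.lower().endswith(extensions))
    return [os.path.join(image_dir, n) for n in names]


def write_data_list(image_dir, list_path):
    """Write one image path per line and return the number of entries."""
    paths = list_images(image_dir)
    with open(list_path, "w") as f:
        for p in paths:
            f.write(p + "\n")
    return len(paths)


if __name__ == "__main__":
    # Demonstration on a throwaway folder with two empty placeholder files.
    demo_dir = tempfile.mkdtemp()
    for name in ("ILSVRC2012_val_00000001.JPEG", "ILSVRC2012_val_00000002.JPEG"):
        open(os.path.join(demo_dir, name), "w").close()
    count = write_data_list(demo_dir, os.path.join(demo_dir, "input.txt"))
    print(count, "paths written")
```

Paths are written exactly as built from `image_dir`, so pass a path that is valid from the directory in which the calibration tool is run.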

Now you can start the Resnet50 caffemodel calibration:

./calibration_caffe.bin resnet50 ./resnet50 --iteration 100
WARNING: Logging before InitGoogleLogging() is written to STDERR
W1229 09:45:44.144830 12707 _caffe.cpp:139] DEPRECATION WARNING - deprecated use of Python interface
W1229 09:45:44.144860 12707 _caffe.cpp:140] Use this instead (with the named "weights" parameter):
W1229 09:45:44.144862 12707 _caffe.cpp:142] Net('Resnet50/deploy.prototxt', 1, weights='Resnet50/ResNet-50-model.caffemodel')
I1229 09:45:44.147876 12707 layer_factory.hpp:77] Creating layer input
I1229 09:45:44.153451 12707 net.cpp:220] Creating Layer input
I1229 09:45:44.153475 12707 net.cpp:516] input -> data
I1229 09:45:44.154451 12707 net.cpp:258] Setting up input
I1229 09:45:44.154466 12707 net.cpp:265] Top shape: 1 3 224 224 (150528)
I1229 09:45:44.154469 12707 net.cpp:273] Memory required for data: 602112
I1229 09:45:44.154474 12707 layer_factory.hpp:77] Creating layer conv1
I1229 09:45:44.154484 12707 net.cpp:220] Creating Layer conv1
I1229 09:45:44.154486 12707 net.cpp:542] conv1 <- data
I1229 09:45:44.154491 12707 net.cpp:516] conv1 -> conv1
I1229 09:45:44.154544 12707 net.cpp:258] Setting up conv1
I1229 09:45:44.154551 12707 net.cpp:265] Top shape: 1 64 112 112 (802816)
I1229 09:45:44.154556 12707 net.cpp:273] Memory required for data: 3813376
I1229 09:45:44.154564 12707 layer_factory.hpp:77] Creating layer bn_conv1
I1229 09:45:44.154572 12707 net.cpp:220] Creating Layer bn_conv1
I1229 09:45:44.154575 12707 net.cpp:542] bn_conv1 <- conv1
I1229 09:45:44.154580 12707 net.cpp:503] bn_conv1 -> conv1 (in-place)
I1229 09:45:44.154624 12707 net.cpp:258] Setting up bn_conv1
I1229 09:45:44.154630 12707 net.cpp:265] Top shape: 1 64 112 112 (802816)
I1229 09:45:44.154633 12707 net.cpp:273] Memory required for data: 7024640
I1229 09:45:44.154642 12707 layer_factory.hpp:77] Creating layer scale_conv1
I1229 09:45:44.154649 12707 net.cpp:220] Creating Layer scale_conv1
I1229 09:45:44.154652 12707 net.cpp:542] scale_conv1 <- conv1
I1229 09:45:44.154656 12707 net.cpp:503] scale_conv1 -> conv1 (in-place)
I1229 09:45:44.154667 12707 scale_layer.cpp:52] num_axes=1
I1229 09:45:44.154672 12707 layer_factory.hpp:77] Creating layer scale_conv1
...
...
...
res5b 23.8178100586 0
res5c 32.5854530334 0
bn2a_branch1 9.26634883881 4
res2c_relu 10.6714229584 0
res4e 15.3665313721 0
res5a_relu 27.6742744446 0
bn5b_branch2c 8.03337287903 6
res4d_branch2a 7.00451898575 8
bn4f_branch2b 10.0731954575 5
bn2b_branch2a 7.84093761444 6
bn2b_branch2b 7.20697212219 6
bn2b_branch2c 10.2082700729 6
res5a_branch2c 1.2891266346 5
res5a_branch1 5.15547370911 6
res5a_branch2a 7.32843542099 7
res3d_branch2b 6.73829030991 7
res3d_branch2c 2.87677979469 6
res3d_branch2a 9.06925201416 7
res3d_branch2b_relu 10.6139621735 0
res2a_relu 15.9442462921 0
bn4e_branch2b 6.69408226013 6
scale5a_branch2a 9.30217647552 6
scale5a_branch2c 12.7850370407 6
scale5a_branch2b 9.97138404846 7
Time: 146.367562s
  • In the Resnet50/ folder, you can see bmnet_resnet50_calibration_table.1x10.pb2 and bmnet_resnet50_int8.1x10.caffemodel.

Resnet50/
├── bmnet_resnet50_calibration_table.1x10
├── bmnet_resnet50_calibration_table.1x10.pb2
├── bmnet_resnet50_calibration_table.1x10.prototxt
├── bmnet_resnet50_int8.1x10.caffemodel
├── deploy.prototxt
├── ILSVRC2012_val
└── ResNet-50-model.caffemodel

bm_builder.bin

Description

bm_builder.bin combines the frontend, optimizer and backend modules into one executable binary and links to libbmnet.so. It takes the network’s caffemodel and deploy.prototxt as inputs and generates a bmodel after compilation.

Get the bm_builder toolkit:

You need to install the BMNNSDK in USB mode on your Ubuntu x86-64 host; see the guide on how to install the BMNNSDK in USB mode.

bm_builder.bin is located in:

/opt/bmtap2/bm1880-usb_1.0.3.1/bin$ ll
-rwxr-xr-x  1 root root   60472 12 26 11:14 bm_builder.bin*

End-user Options

bm_builder.bin
    -t or --target
        Specify the name of BITMAIN target board (bm1880).
    -n or --name
        Specify the name of deep learning network.
    -s or --shape
        Specify the input shape of network. Dims should be separated by commas and no spaces are allowed.
    -u or --plugin
        Specify the directories of cpu op plugins.
    -c or --caffemodel
        Specify the caffemodel generated by calibration_caffe.bin.
    -m or --modified_proto
        If you want to modify the prototxt, specify the modified deploy.prototxt of network.
    -o or --out_model
        Specify the output bmodel file.
    -d or --in_ctable=input_ctable_file
        Specify the calibration table generated by calibration_caffe.bin.
    -e or --out_ctable=output_ctable_file
        Specify the output of optimizer calibration table.
    -p or --out_proto=output_prototxt_file
        Specify the optimizer prototxt file of network.
    --enable-weight-optimize=yes|no [no]
        Specify the option (yes or no) to enable or disable optimization.
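
The shape value is easy to get wrong because the dimensions must be comma-separated. A small helper can format it and assemble the documented options into an argument list; this is a sketch with hypothetical helper names, using only options listed above.

```python
def format_shape(dims):
    """Format dimensions for -s, e.g. [1, 3, 224, 224] -> "1,3,224,224"."""
    return ",".join(str(int(d)) for d in dims)


def builder_command(target, name, caffemodel, in_ctable, shape, out_model,
                    tool="bm_builder.bin"):
    """Return an argument list for bm_builder.bin built from documented options."""
    return [
        tool,
        "-t", target,
        "-n", name,
        "-c", caffemodel,
        "--in_ctable=" + in_ctable,
        "-s", format_shape(shape),
        "-o", out_model,
    ]


if __name__ == "__main__":
    cmd = builder_command("bm1880", "resnet50",
                          "bmnet_resnet50_int8.1x10.caffemodel",
                          "bmnet_resnet50_calibration_table.1x10.pb2",
                          [1, 3, 224, 224], "resnet50.bmodel")
    print(" ".join(cmd))
```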
       

Example

cd calibration_tool/Resnet50
/opt/bmtap2/bm1880-usb_1.0.3.1/bin/bm_builder.bin \
    -t bm1880 \
    -n resnet50 \
    -c bmnet_resnet50_int8.1x10.caffemodel \
    --in_ctable=bmnet_resnet50_calibration_table.1x10.pb2 \
    --out_ctable=bmnet_resnet50_calibration_opt_table.1x10.pb2 \
    --enable-weight-optimize=yes \
    --enable-layer-group=yes \
    --fc-left-shift=6 \
    -s 1,3,224,224 \
    -p resnet50_frontend_opt.proto \
    -o resnet50.bmodel

Note that all the parameters of bm_builder.bin must be passed as a single command line; the backslashes above only continue that one command across lines for readability.

This generates resnet50.bmodel:

Resnet50/
├── bmnet_resnet50_calibration_opt_table.1x10.pb2
├── bmnet_resnet50_calibration_table.1x10
├── bmnet_resnet50_calibration_table.1x10.pb2
├── bmnet_resnet50_calibration_table.1x10.prototxt
├── bmnet_resnet50_int8.1x10.caffemodel
├── bmnet.s
├── deploy.prototxt
├── ILSVRC2012_val
├── resnet50.bmodel
├── resnet50_frontend_opt.proto
├── resnet50_input_1_3_224_224.bin
└── ResNet-50-model.caffemodel

bmodel testing

To test the bmodel, follow the instructions below. If you use the Edge Development Board in SoC mode:

on ubuntu x86-64 host pc:
git clone https://github.com/BM1880-BIRD/bm1880-ai-demo-program.git
//copy the bmtap2-bm1880-soc_basic_test folder to the rootfs of the BM1880 EDB, such as /system/
//copy resnet50.bmodel and resnet50_input_1_3_224_224.bin to the rootfs of the BM1880 EDB, such as /system/

on BM1880 shell:
cd /system/data
./load_driver.sh
ldconfig
cd /system/bmtap2-bm1880-soc_basic_test/bin/
./test_bmnet_bmodel resnet50_input_1_3_224_224.bin resnet50.bmodel ouput.bin 1 3 224 224
<CMD> ./test_bmnet_bmodel resnet50_input_1_3_224_224.bin resnet50.bmodel ouput.bin 1 3 224 224                                       
Runtime Version : [1.0.3]                                                                                                            
input size: 150528                                                                                                                   
output size:1000                                                                                                                     
outputs[0]: [1,1,1,1000], "fc1000"                                                                                                   
test_bmnet_bmodel: load 112 us, run 15216 us, read 11 us                                                                             
WARNING: Logging before InitGoogleLogging() is written to STDERR                                                                     
I0101 08:02:39.781909   238 bm_device_soc.cpp:216] device[0] closed 

For NNS or the USB mode of the Edge Development Board, follow the instructions below:

/opt/bmtap2/bm1880-usb_1.0.3.1/bin/test_bmnet_bmodel resnet50_input_1_3_224_224.bin resnet50.bmodel ouput.bin 1 3 224 224
<CMD> /opt/bmtap2/bm1880-usb_1.0.3.1/bin/test_bmnet_bmodel res/resnet50_input_1_3_224_224.bin r50_batch_1.bmodel output.bin 1 3 224 224 
Runtime Version : [1.0.3]
input size: 150528
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0114 21:07:43.254546 15189 bm_firmware.cpp:76] check firmware status ...
I0114 21:07:43.255086 15189 bm_firmware.cpp:82] firmware loading ...
I0114 21:07:43.361970 15189 bm_firmware.cpp:47] firmware load success
I0114 21:07:49.373096 15189 bm_firmware.cpp:98] firmware is running
I0114 21:07:49.373186 15189 bm_device.cpp:104] device[0] opened,gmem_size : 0x40000000
output size:4000
outputs[0]: [1,1,1,1000], "prob_1"
 [Softmax run]: count: 1000 axis: 1 channels: 1000 outer_num: 1 inner_num: 1 threshold_x: 28.1843 threshold_y: 128
cpu_op: softmax_op done
test_bmnet_bmodel: load 116 us, run 15200 us, read 13 us  
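
Both logs report "input size: 150528" for the shape 1 3 224 224, which equals N x C x H x W. A quick sanity check of an input file against the shape arguments can be sketched as follows. One byte per element is an inference from that log line, not a documented file format, and the helper names are hypothetical.

```python
import os
import tempfile


def element_count(n, c, h, w):
    """Number of elements in an NCHW tensor."""
    return n * c * h * w


def check_input_file(path, n, c, h, w, bytes_per_element=1):
    """Return (matches, expected_bytes, actual_bytes) for an input .bin file."""
    expected = element_count(n, c, h, w) * bytes_per_element
    actual = os.path.getsize(path)
    return actual == expected, expected, actual


if __name__ == "__main__":
    print(element_count(1, 3, 224, 224))  # matches the "input size" in the log
    # Demonstration on a throwaway file of the right size.
    demo = os.path.join(tempfile.mkdtemp(), "input.bin")
    with open(demo, "wb") as f:
        f.write(bytes(element_count(1, 3, 224, 224)))
    print(check_input_file(demo, 1, 3, 224, 224))
```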

Calibration ONNX Tool Guide

Description

Calibration tool for onnx model.

Get calibration ONNX tool

You can download the toolkit from:

https://sophon-file.bitmain.com.cn/sophon-prod/drive/19/00/21/11/BM1880_Calibration_Tools.zip

The toolkit contains the files below; CustomModel and Resnet50 are provided only as examples.

calibration_onnx_tool/
├── bin
│   ├── bm_builder_onnx.bin
│   ├── onnx_calibration.bin
│   ├── run_resnet50_bmodel_armv8
│   └── run_resnet50_bmodel_x86
├── lib
│   ├── armv8
│   ├── libbmkernel.so
│   ├── libbmnet_caffe_pb.so
│   ├── libbmnet.so
│   ├── libbmodel.so
│   ├── libbmruntime.so
│   ├── libcaffe2.so
│   ├── libonnx_proto.so
│   ├── libonnx.so
│   └── x86_64
├── Readme.txt
├── res
│   ├── imagenet_classes.txt
│   ├── imagenet_partial
│   └── resnet50_input_1_3_224_224.bin
└── Resnet50
    ├── resnet50.onnx
    └── run_resnet50_bmodel.cpp

Usage

./bin/onnx_calibration.bin 
parse arg ret: 0
Invalide arguments
  onnx_calibration 
  -d dataset path
  -m onnxmodel file
  -o output model file
  -i iteration number
  -b batch size

Example

$ cd calibration_onnx_tool
$ export LD_LIBRARY_PATH=./lib/
$ bin/onnx_calibration.bin \
          -m ./Resnet50/resnet50.onnx \
          -d ./res/imagenet_partial/LSVRC-2012/ \
          -o r50.cal.onnx \
          -i 10 \
          -b 1

onnxmodel ./Resnet50/resnet50.onnx
dataset ./res/imagenet_partial/LSVRC-2012/
output model r50.cal.onnx
iteration 10
input batch size 1
parse arg ret: 0
<CMD> bin/onnx_calibration.bin -m ./Resnet50/resnet50.onnx -d ./res/imagenet_partial/LSVRC-2012/ -o r50.cal.onnx -i 10 -b 1 
This version of onnx-caffe2 targets ONNX operator set version 7, but the model we are trying to import uses version 8.  We will try to import it anyway, but if the model uses operators which had BC-breaking changes in the intervening versions, import will fail.
This version of onnx-caffe2 targets ONNX operator set version 7, but the model we are trying to import uses version 8.  We will try to import it anyway, but if the model uses operators which had BC-breaking changes in the intervening versions, import will fail.
This version of onnx-caffe2 targets ONNX operator set version 7, but the model we are trying to import uses version 8.  We will try to import it anyway, but if the model uses operators which had BC-breaking changes in the intervening versions, import will fail.
This version of onnx-caffe2 targets ONNX operator set version 7, but the model we are trying to import uses version 8.  We will try to import it anyway, but if the model uses operators which had BC-breaking changes in the intervening versions, import will fail.
This version of onnx-caffe2 targets ONNX operator set version 7, but the model we are trying to import uses version 8.  We will try to import it anyway, but if the model uses operators which had BC-breaking changes in the intervening versions, import will fail.
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0115 11:41:41.498390 21843 init.h:99] Caffe2 GlobalInit should be run before any other API calls.
W0115 11:41:41.500347 21843 init.h:99] Caffe2 GlobalInit should be run before any other API calls.
I0115 11:41:41.803261 21843 Calibration_legacy.cpp:111] pInputDims NCHW: 1 3 224 224
I0115 11:41:41.803356 21843 Calibration_legacy.cpp:119] datum CHW: 3 256 256
I0115 11:41:41.804275 21843 Calibration_legacy.cpp:111] pInputDims NCHW: 1 3 224 224
I0115 11:41:41.804330 21843 Calibration_legacy.cpp:119] datum CHW: 3 256 256
I0115 11:41:41.805279 21843 Calibration_legacy.cpp:111] pInputDims NCHW: 1 3 224 224
I0115 11:41:41.805331 21843 Calibration_legacy.cpp:119] datum CHW: 3 256 256
I0115 11:41:41.806298 21843 Calibration_legacy.cpp:111] pInputDims NCHW: 1 3 224 224
I0115 11:41:41.806352 21843 Calibration_legacy.cpp:119] datum CHW: 3 256 256
I0115 11:41:41.807260 21843 Calibration_legacy.cpp:111] pInputDims NCHW: 1 3 224 224
I0115 11:41:41.807310 21843 Calibration_legacy.cpp:119] datum CHW: 3 256 256
I0115 11:41:41.808189 21843 Calibration_legacy.cpp:111] pInputDims NCHW: 1 3 224 224
I0115 11:41:41.808254 21843 Calibration_legacy.cpp:119] datum CHW: 3 256 256
I0115 11:41:41.809103 21843 Calibration_legacy.cpp:111] pInputDims NCHW: 1 3 224 224
I0115 11:41:41.809151 21843 Calibration_legacy.cpp:119] datum CHW: 3 256 256
I0115 11:41:41.809998 21843 Calibration_legacy.cpp:111] pInputDims NCHW: 1 3 224 224
I0115 11:41:41.810055 21843 Calibration_legacy.cpp:119] datum CHW: 3 256 256
I0115 11:41:41.810925 21843 Calibration_legacy.cpp:111] pInputDims NCHW: 1 3 224 224
I0115 11:41:41.810973 21843 Calibration_legacy.cpp:119] datum CHW: 3 256 256
I0115 11:41:41.811820 21843 Calibration_legacy.cpp:111] pInputDims NCHW: 1 3 224 224
I0115 11:41:41.811872 21843 Calibration_legacy.cpp:119] datum CHW: 3 256 256

...

layer {
  name: "res5c_1"
  blob_param {
    name: "res5c_1"
    threshold_y: 33.7025
  }
}
layer {
  name: "res5c_relu_1"
  blob_param {
    name: "res5c_relu_1"
    threshold_y: 33.7025
  }
}
layer {
  name: "pool5_1"
  blob_param {
    name: "pool5_1"
    threshold_y: 14.885545
  }
}
layer {
  name: "OC2_DUMMY_106"
  blob_param {
    name: "OC2_DUMMY_106"
    threshold_y: 14.885545
  }
  blob_param {
    name: "OC2_DUMMY_1"
    threshold_y: 0
  }
}
layer {
  name: "fc1000_1"
  blob_param {
    name: "fc1000_1"
    threshold_y: 28.184326
  }
}
layer {
  name: "prob_1"
  blob_param {
    name: "prob_1"
    threshold_y: 0.062530234
  }

You can see that the file r50.cal.onnx has been generated:

./
├── bin
│   ├── bm_builder_onnx.bin
│   ├── onnx_calibration.bin
│   ├── run_resnet50_bmodel_armv8
│   └── run_resnet50_bmodel_x86
├── lib
│   ├── armv8
│   ├── libbmkernel.so
│   ├── libbmnet_caffe_pb.so
│   ├── libbmnet.so
│   ├── libbmodel.so
│   ├── libbmruntime.so
│   ├── libcaffe2.so
│   ├── libonnx_proto.so
│   ├── libonnx.so
│   └── x86_64
├── r50.cal.onnx
├── Readme.txt
├── res
│   ├── imagenet_classes.txt
│   ├── imagenet_partial
│   └── resnet50_input_1_3_224_224.bin
└── Resnet50
    ├── resnet50.onnx
    └── run_resnet50_bmodel.cpp
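
The calibration log in this example prints one threshold_y value per blob. If that output is captured to a text file, the thresholds can be pulled out for inspection with a small parser such as the sketch below. It assumes the text has exactly the layout shown in the log (a name followed by threshold_y inside each blob_param); the function name is hypothetical.

```python
import re

# Matches: blob_param { name: "<blob>" threshold_y: <number> }
_BLOB = re.compile(
    r'blob_param\s*\{\s*name:\s*"([^"]+)"\s*threshold_y:\s*([-+0-9.eE]+)\s*\}'
)

SAMPLE = '''
layer {
  name: "res5c_1"
  blob_param {
    name: "res5c_1"
    threshold_y: 33.7025
  }
}
layer {
  name: "OC2_DUMMY_106"
  blob_param {
    name: "OC2_DUMMY_106"
    threshold_y: 14.885545
  }
  blob_param {
    name: "OC2_DUMMY_1"
    threshold_y: 0
  }
}
'''


def parse_thresholds(text):
    """Return a {blob name: threshold_y} dictionary from the printed layers."""
    return {name: float(value) for name, value in _BLOB.findall(text)}


if __name__ == "__main__":
    for blob, threshold in sorted(parse_thresholds(SAMPLE).items()):
        print(blob, threshold)
```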

bm_builder_onnx.bin

Description

bm_builder_onnx.bin combines the frontend, optimizer and backend modules into one executable binary and links to libbmnet.so. It takes the network’s calibrated .onnx model as input and generates a bmodel after compilation.

End-User Options

bm_builder_onnx.bin [options]

    -enable-less-loss-fc            - enable less-loss FC
    -fc-left-shift=<int>            - FC left-shift for partial sum
    -force-less-loss-fc             - force less-loss FC

    -dump-all-neuron                -
    -dump-layer-group-info=<file>   - Specify log location
    -enable-layer-group             -
        =yes                        - enable
        =no                         - disable
    -enable-softmax                 -
        =yes                        - enable
        =no                         - disable
    -ignore-bank-conflict           -
    -layer-group-fix                - fix layer group for more pass
    -layer-group-sese               - use Single-Entry-Single-Exit tiling mode
    -name=<string>                  - name of network
    -onnx=<file>                    - input onnx model
    -out-cmdbuf=<file>              - out cmdbuf file
    -out-model=<file>               - output bmodel file
    -out-proto=<file>               - output prototxt file
    -plugin=<string>                - path of cpu op plugin
    -quantized-onnx-model=<string>  - the path for the quantized-onnx-model
    -shape=<string>                 - n,c,h,w; give 0 to use the shape info in the model,
                                      example : -s 2, 0, 0, 0
    -target                         - Specify your target
        =bm1880                     - BM1880

    -help                           - Display available options (-help-hidden for more)
    -help-list                      - Display list of available options (-help-list-hidden for more)
    -version                        - Display the version of this program
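
The shape option treats 0 as "use the shape info in the model". The sketch below illustrates that substitution rule in isolation; it does not reproduce how bm_builder_onnx.bin reads the model, and the function names and the model shape passed in are assumptions.

```python
def parse_shape_option(text):
    """Parse an "n,c,h,w" string into a tuple of integers."""
    return tuple(int(part) for part in text.split(","))


def resolve_shape(requested, model_shape):
    """Replace every 0 in the requested shape with the model's own dimension."""
    if len(requested) != len(model_shape):
        raise ValueError("requested shape and model shape differ in rank")
    return tuple(m if r == 0 else r for r, m in zip(requested, model_shape))


if __name__ == "__main__":
    model_shape = (1, 3, 224, 224)  # assumed input shape of the ResNet-50 model
    print(resolve_shape(parse_shape_option("1,0,0,0"), model_shape))
    print(resolve_shape(parse_shape_option("2,0,0,0"), model_shape))
```

With this rule, passing 2,0,0,0 changes only the batch size and keeps the channel, height and width stored in the model.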
    
    

Example

./bin/bm_builder_onnx.bin -t bm1880 \
                          -n r50 \
                          -c r50.cal.onnx \
                          -s 1,0,0,0 \
                          --enable-layer-group=yes \
                          -p r50_bmnet_opt_from_onnx.proto \
                          -o r50_batch_1.bmodel \
                          -u ./lib/
                          
<CMD> ./bin/bm_builder_onnx.bin -t bm1880 -n r50 -c r50.cal.onnx -s 1,0,0,0 --enable-layer-group=yes -p r50_bmnet_opt_from_onnx.proto -o r50_batch_1.bmodel -u ./lib/ 
target bm1880
W0115 13:52:28.163966 28228 OnnxFrontendContext.cpp:112] 
W0115 13:52:28.164011 28228 OnnxFrontendContext.cpp:113] OnnxFrontendContext: converting to bmnet model
W0115 13:52:30.412812 28228 BM188xBackendContext.cpp:114] 
W0115 13:52:30.412829 28228 BM188xBackendContext.cpp:115] BM188xBackendContext: building cmdbuf
W0115 13:52:30.420598 28228 conv_parallel_bmkernel.cpp:164] [ConvFixedParallel::split], coeff too large: 65536, not supported
W0115 13:52:30.420828 28228 conv_parallel_bmkernel.cpp:164] [ConvFixedParallel::split], coeff too large: 73728, not supported
W0115 13:52:30.421167 28228 conv_parallel_bmkernel.cpp:164] [ConvFixedParallel::split], coeff too large: 73728, not supported
W0115 13:52:30.421496 28228 conv_parallel_bmkernel.cpp:164] [ConvFixedParallel::split], coeff too large: 73728, not supported
frontend_out_ctable_file.pb2 fc1000_1
frontend_out_ctable_file.pb2 data_0

You can see the r50_batch_1.bmodel generated:

./
├── bin
│   ├── bm_builder_onnx.bin
│   ├── onnx_calibration.bin
│   ├── run_resnet50_bmodel_armv8
│   └── run_resnet50_bmodel_x86
├── bmnet.s
├── frontend_out_ctable_file.pb2
├── frontend_out_ctable_file.prototxt
├── lib
│   ├── armv8
│   ├── libbmkernel.so
│   ├── libbmnet_caffe_pb.so
│   ├── libbmnet.so
│   ├── libbmodel.so
│   ├── libbmruntime.so
│   ├── libcaffe2.so
│   ├── libonnx_proto.so
│   ├── libonnx.so
│   └── x86_64
├── optimizer_out_ctable_file.pb2
├── output.bin
├── r50_batch_1.bmodel
├── r50_bmnet_opt_from_onnx.proto
├── r50.cal.onnx
├── Readme.txt
├── res
│   ├── imagenet_classes.txt
│   ├── imagenet_partial
│   └── resnet50_input_1_3_224_224.bin
└── Resnet50
    ├── resnet50.onnx
    └── run_resnet50_bmodel.cpp

7 directories, 25 files

bmodel testing

The testing method is the same as for the caffe bmodel; see the bmodel testing section above.
