There are two ways to program with the runtime library:
BMNet
BMKernel
Programming by BMNet
We provide multiple utility tools to convert Caffe models into machine instructions. These instructions, together with the model’s weights, are packed into a file named bmodel (the model file for BITMAIN targets), which can be executed directly on a BITMAIN board. BMNet implements many common layers. The full list of built-in layers is given below, and more layers are under development:
Activation
BatchNorm
Concat
Convolution
Eltwise
Flatten
InnerProduct
Join
LRN
Normalize
Permute
Pooling
PReLU
PriorBox
Reorg
Reshape
Scale
Split
Upsample
The programming flow is as follows:
BMNet takes the caffemodel generated by the Caffe framework and the deploy file deploy.prototxt as input. After the front end, optimizer, and back end stages, a bmodel file is generated.
If all layers of your network model are supported by BMNet, you can conveniently compile the network from the command line; otherwise, refer to Programming by BMKernel.
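As a rough illustration of the decision above, the following sketch checks a model's layer types against the built-in list. The helper names builtin_layers() and unsupported_layers() are hypothetical and are not part of BMNet; only the layer names come from the list above.

```cpp
#include <set>
#include <string>
#include <vector>

// Built-in layer types, copied from the list in this section.
const std::set<std::string>& builtin_layers() {
  static const std::set<std::string> layers = {
      "Activation", "BatchNorm", "Concat",       "Convolution", "Eltwise",
      "Flatten",    "InnerProduct", "Join",      "LRN",         "Normalize",
      "Permute",    "Pooling",   "PReLU",        "PriorBox",    "Reorg",
      "Reshape",    "Scale",     "Split",        "Upsample"};
  return layers;
}

// Hypothetical helper (not a BMNet API): returns the layer types in
// `model_layers` that do not appear in the built-in list.
std::vector<std::string> unsupported_layers(
    const std::vector<std::string>& model_layers) {
  std::vector<std::string> missing;
  for (const auto& type : model_layers) {
    if (builtin_layers().count(type) == 0) missing.push_back(type);
  }
  return missing;
}
```

An empty result suggests the command-line flow is enough; any returned layer type would need a customized layer or BMKernel code.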
Programming by BMKernel
To program by kernel, call bmruntime_bmkernel_create() to create a BMKernel. After the BMKernel is created, applications can use the BMKernel interfaces to generate kernel commands, and then submit the commands with bmruntime_bmkernel_submit(). Finally, bmruntime_bmkernel_destroy() should be called to release the kernel resources. The programming flow is illustrated in the flow chart.
BMNet provides a series of APIs to add customized layers without modifying the BMNet core code. A customized layer can be an entirely new layer, or it can replace an original Caffe layer in BMNet. The tutorial below walks through the steps to create a simple custom layer that replaces an original Caffe layer in BMNet, using the LeakyRelu layer as an example (the source code can be found in bmnet/example/customized_layer_1880/).
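In code, the create, submit, destroy order described above might be wrapped as in the following sketch. The signatures of the bmruntime_bmkernel_* functions are not shown in this document, so StubRuntime is a hypothetical stand-in that only records the call order, and KernelSession is an illustrative RAII guard rather than an SDK class.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the runtime: each method marks where the real
// bmruntime_bmkernel_create/submit/destroy call would go.
struct StubRuntime {
  std::vector<std::string> log;  // records the order of calls
  void kernel_create() { log.push_back("create"); }
  void kernel_submit() { log.push_back("submit"); }
  void kernel_destroy() { log.push_back("destroy"); }
};

// RAII guard: creates the kernel on construction and always destroys it
// when the guard goes out of scope, even if command generation throws.
class KernelSession {
 public:
  explicit KernelSession(StubRuntime& rt) : rt_(rt) { rt_.kernel_create(); }
  ~KernelSession() { rt_.kernel_destroy(); }
  KernelSession(const KernelSession&) = delete;
  KernelSession& operator=(const KernelSession&) = delete;
  void submit() { rt_.kernel_submit(); }

 private:
  StubRuntime& rt_;
};

// Runs one create -> generate -> submit -> destroy cycle and returns the log.
std::vector<std::string> run_once() {
  StubRuntime rt;
  {
    KernelSession session(rt);
    rt.log.push_back("generate");  // placeholder for BMKernel command generation
    session.submit();
  }  // the destructor releases the kernel here
  return rt.log;
}
```

Tying the destroy call to a destructor is one way to make sure the kernel resources are released on every exit path.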
Add new caffe layer definition
Modify bmnet_caffe.proto in the path “bmnet/examples/customized_layer_1880/proto”. First, check whether the layer definition already exists. If it exists, skip this step. If it does not, append a new line at the end of LayerParameter with a new index, and add the definition of the new parameter.
Note: the new layer must be added as the last line. For example, to add a ReLUParameter:
message LayerParameter {
  optional string name = 1;    // the layer name
  optional string type = 2;    // the layer type
  repeated string bottom = 3;  // the name of each bottom blob
  repeated string top = 4;     // the name of each top blob
  ...
  optional AccuracyParameter accuracy_param = 102;
  optional EmbedParameter embed_param = 137;
  optional ExpParameter exp_param = 111;
  optional ReLUParameter relu_param = 123;
}
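Protocol Buffers requires field numbers to be positive and unique within a message, so the new index must not collide with an existing one. The sketch below shows that check. The helper is_free_field_number() is hypothetical, and the list of used numbers in the usage note covers only the fields visible in the excerpt above, not the elided ones.

```cpp
#include <set>
#include <vector>

// Hypothetical helper: returns true if `candidate` is positive and not
// already used as a field number in the message.
bool is_free_field_number(const std::vector<int>& used, int candidate) {
  const std::set<int> taken(used.begin(), used.end());
  return candidate > 0 && taken.count(candidate) == 0;
}
```

With the visible field numbers {1, 2, 3, 4, 102, 137, 111}, the check accepts 123 and rejects 102. The real file should be checked in full, since the excerpt omits fields.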
Create a child class that inherits from CustomizedCaffeLayer, and implement the layer_name(), dump(), and codegen() member methods:
layer_name(): returns the string name of the layer type.
setup(): optional. It only supports setting the sub type with set_sub_type(), if necessary. If it is not implemented, the sub type defaults to the layer type.
dump(): dumps the parameter details of the newly added Caffe layer.
codegen(): converts the parameters of the Caffe layer to tg_customized_param, which is the parameter of the customized IR.
#include <iostream>
#include <string>
#include <bmnet/frontend/caffe/CaffeFrontendContext.hpp>
#include <bmnet/frontend/caffe/CustomizedCaffeLayer.hpp>

class LeakyReluLayer : public CustomizedCaffeLayer {
 public:
  LeakyReluLayer() : CustomizedCaffeLayer() {}

  // return type name of new added CAFFE layer.
  std::string layer_name() { return std::string("ReLU"); }

  // dump parameters of CAFFE layer object layer_.
  void dump() {
    const caffe::ReLUParameter &in_param = layer_.relu_param();
    float negative_slope = in_param.negative_slope();
    std::cout << "negative_slope:" << negative_slope;
  }

  // optional: set the sub type of the customized IR.
  void setup(TensorOp *op) {
    CustomizedCaffeLayer::setup(op);
    TGCustomizedParameter *param = op->mutable_tg_customized_param();
    param->set_sub_type("leakyrelu");
  }

  // convert parameters of CAFFE layer to customized
  // IR(TensorOp *op)'s parameter(tg_customized_param)
  void codegen(TensorOp *op) {
    // get input shape
    const TensorShape &input_shape = op->input_shape(0);
    // get parameter from caffe proto
    const caffe::ReLUParameter &in_param = layer_.relu_param();
    float negative_slope = in_param.negative_slope();
    // set normal output shape
    TensorShape *output_shape = op->add_output_shape();
    output_shape->CopyFrom(input_shape);
    // set out_param
    TGCustomizedParameter *out_param = op->mutable_tg_customized_param();
    out_param->add_f32_param(negative_slope);
  }
};
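The codegen() method above hands the slope to the back end through an ordered list of float parameters, and the instruction class later reads it back with f32_param(0). The following sketch illustrates that hand-off with a plain struct. CustomParam, pack_leaky_relu(), and unpack_negative_slope() are hypothetical stand-ins for illustration, not the generated TGCustomizedParameter class.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for tg_customized_param: an ordered list of floats.
struct CustomParam {
  std::vector<float> f32;
  void add_f32_param(float v) { f32.push_back(v); }
  // at() throws std::out_of_range if the index was never written.
  float f32_param(std::size_t i) const { return f32.at(i); }
};

// Front-end side: store the slope at index 0, mirroring codegen().
CustomParam pack_leaky_relu(float negative_slope) {
  CustomParam p;
  p.add_f32_param(negative_slope);
  return p;
}

// Back-end side: read index 0 back, mirroring the f32_param(0) call.
float unpack_negative_slope(const CustomParam& p) { return p.f32_param(0); }
```

Because the parameters are positional, the order in which codegen() adds them has to match the indices used when they are read back.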
Add new Tensor Instruction class
Create a child class that inherits from CustomizedTensorFixedInst, a class that converts IR to instructions, and implement the inst_name(), dump(), and encode() member functions:
inst_name(): returns the IR name, which is lowercase and consists of the prefix “tg” plus the sub type (for example, tg_leakyrelu). The sub_type is set at 6.2.
dump(): dumps the tg_customized_param details of the IR op.
encode(): converts the IR to instructions. If the IR can be deployed to the NPU, use the BMKernel API to implement it; otherwise, you can implement a pure CPU version in C++.
#include <bmnet/targets/plat-bm188x/BM188xBackendContext.hpp>
#include <bmnet/targets/plat-bm188x/CustomizedTensorFixedInst.hpp>
#include <bmkernel/bm_kernel.h>

namespace bmnet {

class TGLeakyReluFixedInst : public CustomizedTensorFixedInst {
 public:
  TGLeakyReluFixedInst() : CustomizedTensorFixedInst() {}
  ~TGLeakyReluFixedInst() {}

  // return type name of IR
  std::string inst_name() { return std::string("tg_leakyrelu"); }

  // dump tg_customized_param of IR op_.
  void dump() {
    const TGCustomizedParameter &param = op_.tg_customized_param();
    float alpha = param.f32_param(0);
    std::cout << "alpha:" << alpha << std::endl;
  }

  // extract parameters of tg_customized_param,
  // and implement instructions.
  void encode();

 private:
  void forward(gaddr_t bottom_gaddr, gaddr_t top_gaddr, int input_n,
               int input_c, int input_h, int input_w);
};

}  // namespace bmnet
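The naming rule for inst_name() can be sketched as a small helper. The function make_inst_name() is hypothetical and not part of BMNet, and the underscore after “tg” is an assumption taken from the "tg_leakyrelu" string in the class above.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: builds an IR name as "tg_" + lower-cased sub type.
std::string make_inst_name(std::string sub_type) {
  std::transform(sub_type.begin(), sub_type.end(), sub_type.begin(),
                 [](unsigned char c) {
                   return static_cast<char>(std::tolower(c));
                 });
  return "tg_" + sub_type;
}
```

Under this assumption, the sub type "leakyrelu" set in setup() maps to the IR name returned by TGLeakyReluFixedInst::inst_name().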
NPU Version
If the IR can be deployed to the NPU, use the BMKernel APIs to implement the encode() function. For more details about the BMKernel APIs, refer to the related documentation.
For a pure CPU version, navigate to the cpu_op folder and create a new .cpp source file whose name is the same as the type name of the customized layer. In this file, create a child class that inherits from CpuOp and implement the run() member method in C++. Finally, register the new class with REGISTER_CPU_OP().
To compile the newly added source file, add it to the CMakeLists.txt in the same folder.
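For the LeakyRelu example, the arithmetic that a CPU implementation would perform is the standard element-wise LeakyReLU. The sketch below shows only that arithmetic on a plain float vector. The CpuOp base class and the run() signature are not shown in this document, so leaky_relu() is a standalone illustrative function, not the SDK interface.

```cpp
#include <vector>

// Element-wise LeakyReLU: y = x for x > 0, otherwise negative_slope * x.
std::vector<float> leaky_relu(const std::vector<float>& input,
                              float negative_slope) {
  std::vector<float> output;
  output.reserve(input.size());
  for (float x : input) {
    output.push_back(x > 0.0f ? x : negative_slope * x);
  }
  return output;
}
```

A real run() method would apply the same formula to the layer's input buffer, using the slope passed through tg_customized_param.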
Programming application
Introduction to development environment
We provide a Docker development image that includes the tools and dependent libraries required for BMNNSDK application development. Users can use it to develop BMNNSDK applications.
BMNNSDK Docker development image: bmtap2-dev_latest.docker (Note: the Docker development image in this section is different from the Docker deployment image in the previous chapter.)
The Docker development image does not contain the BMNNSDK. Import the BMNNSDK into the Docker development image before you use it for development.
Use the development environment
Make sure the BMNNSDK is installed before you use the Docker development environment, and then import it into the Docker development environment.
user@:/workspace$ exit
$ ls examples/bmnet_inference/bmnet_inference
$ ls examples/tensor_scalar/tensor_scalar
Deploy the code to the deployment environment and run it. For USB mode, you can deploy it to a PC with the BM1880 development board installed. For SoC mode, you can deploy it to the BM1880 SoC development board via SD card, Ethernet, or a packaged file system.
Running
Programming requires the APIs of the BMNet inference engine. The programming flow chart is as follows: