MobileNet
MobileNet is a class of efficient, lightweight deep networks for mobile and embedded vision applications, with fewer parameters and a lower computational cost than conventional convolutional networks. MobileNets use depthwise separable convolutions, each of which factors a standard convolution into a depthwise convolution followed by a pointwise (1x1) convolution.
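To make the savings from this factorization concrete, the following minimal sketch counts multiply-add operations for a single layer. The layer dimensions (3x3 kernel, 32 input channels, 64 output channels, 112x112 output map) and the function names are illustrative assumptions, not values taken from this project.

```python
def standard_conv_cost(k, m, n, f):
    """Multiply-adds of a k x k standard convolution with m input channels,
    n output channels, and an f x f output feature map."""
    return k * k * m * n * f * f


def depthwise_separable_cost(k, m, n, f):
    """Multiply-adds of the depthwise + pointwise factorization of the same layer."""
    depthwise = k * k * m * f * f   # one k x k filter per input channel
    pointwise = m * n * f * f       # 1x1 convolution that mixes the channels
    return depthwise + pointwise


# Illustrative layer shape (an assumption, not taken from the project).
k, m, n, f = 3, 32, 64, 112
std = standard_conv_cost(k, m, n, f)
sep = depthwise_separable_cost(k, m, n, f)
ratio = sep / std  # algebraically equal to 1/n + 1/k**2

print(f"standard: {std} multiply-adds")
print(f"depthwise separable: {sep} multiply-adds")
print(f"cost ratio: {ratio:.4f}")
```

For a 3x3 kernel the ratio is dominated by the 1/k**2 term, so the factorized layer needs roughly 8 to 9 times fewer multiply-adds than the standard one.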
Since most of the model’s operations are convolutions, our project focuses on optimizing the convolution operations. Our optimizations include: improving parallelism by assigning the computation of each pixel to its own thread, avoiding unnecessary memory copies, and moving operations such as memory allocation into the model initialization stage wherever possible.
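The idea behind two of these optimizations can be sketched in plain Python. This is not the project's actual kernel code, which is not shown here: the class `DepthwiseConv3x3`, its method names, and the choice of a 3x3 kernel with stride 1 and zero padding are all hypothetical. The sequential loops in `forward` stand in for a thread grid, since every call to `compute_pixel` is independent of the others, and the output buffer is allocated once in `__init__` rather than on every inference.

```python
class DepthwiseConv3x3:
    """Hypothetical depthwise 3x3 convolution (stride 1, zero padding 1)."""

    def __init__(self, weights, height, width):
        # weights[c] is the 3x3 kernel applied to channel c.
        self.weights = weights
        self.channels = len(weights)
        self.height = height
        self.width = width
        # Allocate the output buffer once, at initialization time,
        # so that forward() performs no allocation per inference.
        self.output = [[[0.0] * width for _ in range(height)]
                       for _ in range(self.channels)]

    def compute_pixel(self, x, c, row, col):
        """The work one thread would do: one output pixel of one channel."""
        kernel = self.weights[c]
        acc = 0.0
        for dr in range(3):
            for dc in range(3):
                r = row + dr - 1
                q = col + dc - 1
                # Positions outside the image contribute zero (zero padding).
                if 0 <= r < self.height and 0 <= q < self.width:
                    acc += kernel[dr][dc] * x[c][r][q]
        return acc

    def forward(self, x):
        # Each (c, row, col) iteration is independent of the others; in a
        # parallel implementation each one would be assigned to its own thread.
        for c in range(self.channels):
            for row in range(self.height):
                for col in range(self.width):
                    self.output[c][row][col] = self.compute_pixel(x, c, row, col)
        return self.output  # the same preallocated buffer on every call


# Usage: one channel, an all-ones 3x3 kernel, and an all-ones 3x3 input.
ones = [[1.0] * 3 for _ in range(3)]
layer = DepthwiseConv3x3(weights=[ones], height=3, width=3)
out = layer.forward([ones])
print(out[0])
```

Because no output pixel depends on another, the work can be split across threads without synchronization, and reusing the preallocated buffer keeps allocation out of the inference path.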
With the above optimizations, inference time dropped from 2 s initially to only 7.7 ms, a speedup of roughly 260x.