This OpenCV tutorial is a very simple code example of GPU Cuda optical flow in OpenCV written in c++. The configuration of the project, code, and explanation are included for farneback Optical Flow method. Farneback algorithm is a dense method that is used to process all the pixels in the given image. The dense methods are slower but more accurate as all the pixels of the image are processed. In the following example, I am displaying just a few pixes based on a grid. I am not displaying all the pixes. In the opposite to dense method the sparse method like Lucas Kanade using just a selected subset of pixels. They are faster. Both methods have specific applications. Lucas-Kanade is widely used in tracking. The farneback can be used for the analysis of more complex movement in image scene and furder segmentation based on these changes. As dense methods are slightly slower, the GPU and Cuda implementation can lead to great performance improvements to calculate optical flow for all pixels o
Binary Convolutional neural network by XNOR.AI
Great idea to save memory and computation by different type number representation. Convolutional neural network are expensive for its memory needs, specific HW and computational power. This simple trick is able to bring this networks to less power devices..
Unique solution ?
I am thinking about this. Let me know in comments how unique is this solution. In automotive industry on embedded HW solutions close to this one already exist.. In convolution layers is this unique, I guess.
In GPU there is different types of registers to be able calculate FLOAT16 faster than FLOAT32 bit representation. This basically brings something like represent Real 32 bits number as Binary number. This bring the 32x memory saving information. Say in other way, They are introduce approximation of Y=WX
This could be input vector x multiply by w weight metrics for each
layer. W and X are real number. Float32 or Float 16. The approximation looks likeY=αβWX
alpha is scalar numbers, beta is matrix and W and X are just binary.Lower precision but maybe enough power for some applications. Instead of approximate already learned network BWN. They bring this into the learning of the network. Compute convolution layers during the forward pass is also much more effective, that less power CPU devices could handle the problem in real time.
Binary representation Binary field in C
I am going more deeper into this technology.. Let me introduce some operation with binary arrays in C.. This is exciting area. Not so special. Maybe you are programming only in higher languages like Java, python and how to set only one particular bit in C should be interesting for you. It is for me.. There is no direct support for this..
int A[2]; when your int is size of 32 bit, You have two integers array.. You can store 64 bits. If you represented image in only black and white is quite limited to represent good features. If you represented this for each color and convolution filters. The representation is not so boring..
One convolution layer should be stored in memory size / 32.
int A[2];
Set first bit of A
setBit(A, 1);
Set fifth bit of A
setBit(A, 5);
void setBit(int *Field,int n) {
Field[n / 32] |= 1 << (n % 32);
}
void clearBit(int *Field, int n) {
Field[n / 32] &= ~(1 << (n % 32));
}
This 1 is like only move this 00001 to the right place by << bit shift. The magic is
based on the conditions in this expression shift 1 Hexa, 00001 B to the right position.
000000000 set 0001 to n = 3 ->000001000.
By this principle is possible to write print of the array and almost all you need.
Resources
Look at the founded company using this technologyhttps://xnor.ai
Check also github
There is some integration into the TORCH, I think also Yolo, darknet but this need some extension for already existing layers.https://github.com/allenai/XNOR-Net
Comments
Post a Comment