Java – OpenCV’s performance in template matching
I'm trying to do template matching in Java. I use a straightforward algorithm to find a match. This is the code:
    minSAD = VALUE_MAX;

    // loop through the search image
    for ( int x = 0; x <= S_rows - T_rows; x++ ) {
        for ( int y = 0; y <= S_cols - T_cols; y++ ) {
            SAD = 0.0;

            // loop through the template image
            for ( int i = 0; i < T_rows; i++ ) {
                for ( int j = 0; j < T_cols; j++ ) {
                    pixel p_SearchIMG = S[x+i][y+j];
                    pixel p_TemplateIMG = T[i][j];
                    SAD += Math.abs( p_SearchIMG.Grey - p_TemplateIMG.Grey );
                }
            }

            // save the best found position (kept inside the y loop, so every
            // (x, y) placement is compared while y is still in scope)
            if ( minSAD > SAD ) {
                minSAD = SAD;
                // give me VALUE_MAX
                position.bestRow = x;
                position.bestCol = y;
                position.bestSAD = SAD;
            }
        }
    }
But this is a very slow approach. I tested it on two images (768 × 1280) with sub-images (384 × 640), and it took a very long time. Does OpenCV perform template matching faster with its ready-made function cvMatchTemplate()?
Solution
You will find that OpenCV's cvMatchTemplate() is much faster than the method you have implemented. What you have created is a statistical template matching method. It is the most common and the simplest to implement, but it is extremely slow on large images. Let's look at the basic maths. You have a 768 × 1280 image, and you loop through each of its pixels minus the edge, since that is your template limit: (768 – 384) x (1280 – 640), that is 384 x 640 = 245'760 positions. At each position you loop through every pixel of the template (another 245'760 operations), so before you add any maths inside the loop you already have 245'760 x 245'760 = 60'397'977'600 operations. That is over 60 billion operations just to loop through your image. What is more surprising is how quickly machines can do this.
Remember, though, that it is really 245'760 x (245'760 x the maths operations in the loop body), so there are many more operations still.
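As a quick check of the arithmetic above, here is a minimal Java sketch that reproduces the counts. The class and method names are made up for illustration. It also counts the placements that a loop written with <= really visits, which is one more per axis than the estimate, so the exact total comes out slightly higher.

```java
final class SadOperationCount {
    // Placements counted the way the answer does: (S - T) per axis.
    static long approxPlacements(int sRows, int sCols, int tRows, int tCols) {
        return (long) (sRows - tRows) * (sCols - tCols);
    }

    // Placements actually visited by loops written with "<=": (S - T + 1) per axis.
    static long exactPlacements(int sRows, int sCols, int tRows, int tCols) {
        return (long) (sRows - tRows + 1) * (sCols - tCols + 1);
    }

    // One absolute-difference step per template pixel per placement.
    static long pixelComparisons(long placements, int tRows, int tCols) {
        return placements * tRows * tCols;
    }

    public static void main(String[] args) {
        long approx = approxPlacements(768, 1280, 384, 640);
        long exact = exactPlacements(768, 1280, 384, 640);
        System.out.println(approx);                              // 245760
        System.out.println(pixelComparisons(approx, 384, 640));  // 60397977600
        System.out.println(pixelComparisons(exact, 384, 640));   // 60649881600
    }
}
```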
Now, cvMatchTemplate() actually uses Fourier analysis for template matching. This works by applying a fast Fourier transform (FFT) to the image, treating the changes in pixel intensity as a signal and splitting that signal into its component waveforms. The method is difficult to explain well, but in essence the image is converted into a complex-valued signal representation. If you want to know more, search Google for "fast Fourier transform". The same operation is then performed on the template, and the signals that make up the template are used to filter out everything else in the image.
In short, it suppresses every feature in the image that does not share the template's features. The result is then converted back with the inverse fast Fourier transform, producing an image in which high values indicate a match and low values indicate the opposite. This image is often normalized, so that 1 represents a match and 0, or thereabouts, means the object is not there.
Be warned: if the object is not in the image and the result is normalized, false detections will occur, because the highest value calculated will be treated as a match. I could go on about how this method works and the benefits or problems that can arise, but I will leave it there.
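To make the idea concrete, here is a rough 1-D sketch in Java of correlation done through the FFT: transform the signal and the template, multiply one spectrum by the complex conjugate of the other, transform back, and look for the peak. This is only an illustration under my own assumptions, not OpenCV's actual code: it works on a 1-D array rather than an image, uses a plain radix-2 FFT written for the example, and subtracts the template mean so that bright regions do not win by default. All names are hypothetical.

```java
final class FftCorrelationSketch {
    // In-place iterative radix-2 FFT; the array length must be a power of two.
    // invert = true computes the inverse transform, including the 1/n scaling.
    static void fft(double[] re, double[] im, boolean invert) {
        int n = re.length;
        for (int i = 1, j = 0; i < n; i++) {           // bit-reversal permutation
            int bit = n >> 1;
            for (; (j & bit) != 0; bit >>= 1) j ^= bit;
            j ^= bit;
            if (i < j) {
                double t = re[i]; re[i] = re[j]; re[j] = t;
                t = im[i]; im[i] = im[j]; im[j] = t;
            }
        }
        for (int len = 2; len <= n; len <<= 1) {       // butterfly passes
            double ang = 2 * Math.PI / len * (invert ? 1 : -1);
            double stepRe = Math.cos(ang), stepIm = Math.sin(ang);
            for (int i = 0; i < n; i += len) {
                double wRe = 1, wIm = 0;
                for (int k = 0; k < len / 2; k++) {
                    int a = i + k, b = i + k + len / 2;
                    double vRe = re[b] * wRe - im[b] * wIm;
                    double vIm = re[b] * wIm + im[b] * wRe;
                    re[b] = re[a] - vRe; im[b] = im[a] - vIm;
                    re[a] += vRe;        im[a] += vIm;
                    double nextRe = wRe * stepRe - wIm * stepIm;
                    wIm = wRe * stepIm + wIm * stepRe;
                    wRe = nextRe;
                }
            }
        }
        if (invert) for (int i = 0; i < n; i++) { re[i] /= n; im[i] /= n; }
    }

    // Returns c[k] = sum over n of signal[n + k] * (template[n] - mean(template)).
    // Assumes template.length <= signal.length; only offsets up to
    // signal.length - template.length are free of wrap-around.
    static double[] crossCorrelate(double[] signal, double[] template) {
        int n = 1;
        while (n < signal.length) n <<= 1;             // zero-pad to a power of two
        double mean = 0;
        for (double v : template) mean += v / template.length;
        double[] sRe = new double[n], sIm = new double[n];
        double[] tRe = new double[n], tIm = new double[n];
        System.arraycopy(signal, 0, sRe, 0, signal.length);
        for (int i = 0; i < template.length; i++) tRe[i] = template[i] - mean;
        fft(sRe, sIm, false);
        fft(tRe, tIm, false);
        for (int f = 0; f < n; f++) {                  // S[f] * conj(T[f])
            double prodRe = sRe[f] * tRe[f] + sIm[f] * tIm[f];
            double prodIm = sIm[f] * tRe[f] - sRe[f] * tIm[f];
            sRe[f] = prodRe; sIm[f] = prodIm;
        }
        fft(sRe, sIm, true);
        return sRe;
    }

    // Offset with the highest correlation among the valid placements.
    static int bestOffset(double[] signal, double[] template) {
        double[] c = crossCorrelate(signal, template);
        int best = 0;
        for (int k = 1; k <= signal.length - template.length; k++)
            if (c[k] > c[best]) best = k;
        return best;
    }

    public static void main(String[] args) {
        double[] signal = {0, 1, 0, 2, 1, 1, 5, 2, 7, 1, 0, 2, 1, 0, 1, 0};
        double[] template = {1, 5, 2, 7};              // copied from signal[5..8]
        System.out.println(bestOffset(signal, template)); // 5
    }
}
```

With the sample data in main, the template was copied from positions 5 to 8 of the signal, and the peak of the correlation lands at offset 5.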
The reasons why this method is so fast are:
1) OpenCV is highly optimized native (C/C++) code.
2) FFT routines are easy for processors to handle, since most of them have hardware features (such as vector instructions) that suit this kind of arithmetic. GPUs are also built to run huge numbers of these calculations per second, because they matter just as much for high-performance game graphics and video encoding.
3) The number of operations required is far smaller.
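To put a rough number on point 3, the sketch below uses a textbook estimate of about 5 N log2 N floating-point operations for a radix-2 FFT over N points. The assumptions are mine: the image is padded to the next power of two in each dimension, three transforms are needed (image, template, inverse), and the constant 5 is only a rule of thumb, so real libraries will differ. The names are illustrative.

```java
final class FftCostEstimate {
    // Smallest power of two >= n (radix-2 FFTs need power-of-two sizes).
    static int nextPowerOfTwo(int n) {
        int p = 1;
        while (p < n) p <<= 1;
        return p;
    }

    // Rule-of-thumb cost of one complex radix-2 FFT over `points` samples,
    // where `points` is a power of two: about 5 * N * log2(N) operations.
    static long fftFlops(long points) {
        int log2 = 63 - Long.numberOfLeadingZeros(points);
        return 5L * points * log2;
    }

    // Two forward transforms (image, template), one inverse transform,
    // plus one complex multiply (6 operations) per point in between.
    static long correlationFlops(int rows, int cols) {
        long points = (long) nextPowerOfTwo(rows) * nextPowerOfTwo(cols);
        return 3 * fftFlops(points) + 6 * points;
    }

    public static void main(String[] args) {
        System.out.println(correlationFlops(768, 1280)); // 673185792
    }
}
```

For the 768 × 1280 image this comes to roughly 670 million operations, on the order of 90 times fewer than the 60 billion loop iterations of the direct method, under these assumptions.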
In summary, the statistical template matching method is slow and takes ages, while OpenCV's FFT approach, cvMatchTemplate(), is fast and highly optimized.
Statistical template matching will not produce errors if the object is not there, whereas the OpenCV FFT approach can, unless care is taken in how it is applied.
I hope this gives you a basic understanding and answers your question.
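One way to take that care is to refuse a match when even the best score is low. A minimal sketch, assuming you already have a map of normalized scores where 1 means a perfect match (the scores must not have been rescaled so that the maximum is always 1), and with a threshold value that is purely a guess to be tuned for your images:

```java
import java.util.Optional;

final class ThresholdedMatch {
    // Returns {row, col} of the highest score, or empty when even the best
    // score is below minScore (the object is probably not in the image).
    // Assumes a non-empty, rectangular score map.
    static Optional<int[]> bestMatch(double[][] scores, double minScore) {
        int bestRow = 0, bestCol = 0;
        for (int r = 0; r < scores.length; r++)
            for (int c = 0; c < scores[r].length; c++)
                if (scores[r][c] > scores[bestRow][bestCol]) {
                    bestRow = r;
                    bestCol = c;
                }
        if (scores[bestRow][bestCol] < minScore) return Optional.empty();
        return Optional.of(new int[] {bestRow, bestCol});
    }

    public static void main(String[] args) {
        double minScore = 0.8; // hypothetical threshold, tune per application
        double[][] objectPresent = {{0.1, 0.2}, {0.95, 0.3}};
        double[][] objectAbsent = {{0.1, 0.2}, {0.4, 0.3}};
        System.out.println(bestMatch(objectPresent, minScore).isPresent()); // true
        System.out.println(bestMatch(objectAbsent, minScore).isPresent());  // false
    }
}
```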
Cheers!
Chris
[Edit]
Further answers to your questions:
Hi,
cvMatchTemplate works with CCOEFF_NORMED, CCORR_NORMED and SQDIFF_NORMED, as well as the non-normalized versions of these. The page below shows the kind of results you can expect and gives you code to play with:
http://dasl.mem.drexel.edu/~noahKuntz/openCVTut6.html#Step%202
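For reference, here is how I understand the three normalized scores to be defined, written as direct sums in Java for a single window position. This is a sketch based on the formulas in the OpenCV documentation, not the library's implementation, and the class and method names are my own.

```java
final class NormalizedScores {
    // SQDIFF_NORMED: sum((T - I)^2) / sqrt(sum(T^2) * sum(I^2)); 0 is a perfect match.
    static double sqdiffNormed(int[][] img, int[][] t, int row, int col) {
        double diff = 0, tt = 0, ii = 0;
        for (int i = 0; i < t.length; i++)
            for (int j = 0; j < t[0].length; j++) {
                double a = t[i][j], b = img[row + i][col + j];
                diff += (a - b) * (a - b);
                tt += a * a;
                ii += b * b;
            }
        return diff / Math.sqrt(tt * ii);
    }

    // CCORR_NORMED: sum(T * I) / sqrt(sum(T^2) * sum(I^2)); 1 is a perfect match.
    static double ccorrNormed(int[][] img, int[][] t, int row, int col) {
        return correlation(img, t, row, col, false);
    }

    // CCOEFF_NORMED: same as CCORR_NORMED after subtracting the template mean
    // from T and the window mean from I; 1 is a perfect match.
    static double ccoeffNormed(int[][] img, int[][] t, int row, int col) {
        return correlation(img, t, row, col, true);
    }

    // Shared helper. A perfectly flat window or template gives a zero
    // denominator (NaN) when zeroMean is true; callers must handle that.
    private static double correlation(int[][] img, int[][] t, int row, int col,
                                      boolean zeroMean) {
        int n = t.length * t[0].length;
        double meanT = 0, meanI = 0;
        if (zeroMean) {
            for (int i = 0; i < t.length; i++)
                for (int j = 0; j < t[0].length; j++) {
                    meanT += t[i][j];
                    meanI += img[row + i][col + j];
                }
            meanT /= n;
            meanI /= n;
        }
        double cross = 0, tt = 0, ii = 0;
        for (int i = 0; i < t.length; i++)
            for (int j = 0; j < t[0].length; j++) {
                double a = t[i][j] - meanT, b = img[row + i][col + j] - meanI;
                cross += a * b;
                tt += a * a;
                ii += b * b;
            }
        return cross / Math.sqrt(tt * ii);
    }

    public static void main(String[] args) {
        int[][] img = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
        int[][] t = {{5, 6}, {8, 9}};                   // copied from img at (1, 1)
        System.out.println(sqdiffNormed(img, t, 1, 1)); // 0 at the true position
        System.out.println(ccorrNormed(img, t, 1, 1));  // 1 at the true position
        System.out.println(ccoeffNormed(img, t, 1, 1)); // 1 at the true position
    }
}
```

Note that the correlation coefficient ignores a constant brightness offset, so in this tiny example the window at (0, 0) also scores 1 with CCOEFF_NORMED even though its pixels are darker; the other two scores tell the windows apart.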
All three methods are well cited, and many papers are available through Google Scholar; I have listed a few below. Each simply uses a different equation to find the correlation between the FFT signals that form the template and the FFT signals present in the image. In my experience the correlation coefficient tends to give better results, and references for it are easier to find. The sum of squared differences is another method that gives comparable results. I hope some of these help:
Fast normalized cross correlation for defect detection. Du-Ming Tsai; Chien-Ta Lin. Pattern Recognition Letters, Vol. 24, No. 15, November 2003, pp. 2625-2631
Template matching using fast normalized cross correlation. Kai Briechle; Uwe D. Hanebeck
Relative performance of two-dimensional speckle-tracking techniques: normalized correlation, non-normalized correlation and sum-absolute-difference. Friemel, B.H.; Bohs, L.N.; Trahey, G.E. Ultrasonics Symposium, 1995. Proceedings., 1995 IEEE
A class of algorithms for fast digital image registration. Barnea, Daniel I.; Silverman, Harvey F. Computers, IEEE Transactions on, February 1972
Using the normalized versions of these methods is usually advantageous, because anything equal to 1 is a match; however, if no object is present you can get false positives. The method is fast because of how it is implemented at the machine level. The operations involved suit the processor architecture, so each one completes in a few clock cycles rather than shuffling memory and data around over many cycles. Processors have been solving FFT problems for many years and, as I said, there is built-in hardware for it. Dedicated hardware is generally faster than software, and the statistical method of template matching is essentially software-based. Good reading on the hardware can be found here:
A wiki page on the hardware that performs FFT calculations; although it is only a wiki page, its references are worth a look
A New Approach to Pipeline FFT Processor. This one is my favourite, because it shows what happens inside the processor
An Efficient Locally Pipelined FFT Processor. Liang Yang; Zhang Weiwei; Liu Hongxia; Jin Huang; Shitan Huang
These papers really show how complex an FFT implementation is, but pipelining the process is what allows an operation to be performed in a few clock cycles. This is why real-time vision systems use FPGAs (essentially processors you design yourself to carry out a set task): they can be designed to be extremely parallel in architecture, and pipelining is easier to implement.
I must mention, though, that for the FFT of an image you are actually using FFT2, which is the FFT along the horizontal direction followed by the FFT along the vertical direction, just so there is no confusion when you look up references. I can't say I have expert knowledge of how FFTs are implemented. I've tried to find a good guide, but it is very difficult, and I haven't found one yet (at least none that I can understand). One day I may understand them fully, but I do have a good understanding of how they work and of the results that can be expected.
Beyond this I can't really help much more. If you want to implement your own version or understand how it works, you will need to spend more time in the library. Be warned, though: the OpenCV code is so well optimized that you will struggle to improve on its performance. Still, who knows, you may find a way to get better results. Good luck
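To illustrate the row-then-column idea, here is a small Java sketch. A naive 1-D DFT stands in for a real FFT to keep it short (an assumption made for readability; it is far slower), and all names are invented for the example. It transforms every row, then every column of the result, and the output can be checked against the direct 2-D definition.

```java
final class Dft2dSketch {
    // Naive O(n^2) 1-D DFT standing in for a real FFT; returns {re[], im[]}.
    static double[][] dft1d(double[] re, double[] im) {
        int n = re.length;
        double[] outRe = new double[n], outIm = new double[n];
        for (int k = 0; k < n; k++)
            for (int t = 0; t < n; t++) {
                double ang = -2 * Math.PI * k * t / n;
                double c = Math.cos(ang), s = Math.sin(ang);
                outRe[k] += re[t] * c - im[t] * s;
                outIm[k] += re[t] * s + im[t] * c;
            }
        return new double[][] {outRe, outIm};
    }

    // 2-D transform of a real image: 1-D transform of every row, then of
    // every column of that intermediate result. Returns {re[][], im[][]}.
    static double[][][] dft2d(double[][] image) {
        int rows = image.length, cols = image[0].length;
        double[][] re = new double[rows][cols], im = new double[rows][cols];
        for (int r = 0; r < rows; r++) {                 // horizontal pass
            double[][] row = dft1d(image[r], new double[cols]);
            re[r] = row[0];
            im[r] = row[1];
        }
        for (int c = 0; c < cols; c++) {                 // vertical pass
            double[] colRe = new double[rows], colIm = new double[rows];
            for (int r = 0; r < rows; r++) { colRe[r] = re[r][c]; colIm[r] = im[r][c]; }
            double[][] col = dft1d(colRe, colIm);
            for (int r = 0; r < rows; r++) { re[r][c] = col[0][r]; im[r][c] = col[1][r]; }
        }
        return new double[][][] {re, im};
    }

    // Direct 2-D definition of one coefficient F[u][v], for comparison only.
    static double[] directCoefficient(double[][] image, int u, int v) {
        int rows = image.length, cols = image[0].length;
        double re = 0, im = 0;
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++) {
                double ang = -2 * Math.PI * ((double) u * r / rows + (double) v * c / cols);
                re += image[r][c] * Math.cos(ang);
                im += image[r][c] * Math.sin(ang);
            }
        return new double[] {re, im};
    }

    public static void main(String[] args) {
        double[][] image = {{1, 2, 3, 4}, {5, 6, 7, 8}};
        double[][][] f = dft2d(image);
        double[] direct = directCoefficient(image, 1, 2);
        // The two routes should agree up to floating-point rounding.
        System.out.println(f[0][1][2] + " vs " + direct[0]);
        System.out.println(f[1][1][2] + " vs " + direct[1]);
    }
}
```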
Chris